CS691: Big Data Systems

 

Fall 2012

     


Lectures:
M 3:30-5:30pm (**NOTE: Changed from W 12-2pm**) in CMPSCI 142
Instructor:
V. Arun (office hours by appointment)
Mailing List:  cs691bd at cs.umass.edu (If you are officially registered, you should automatically be on this list. If not, you can subscribe by sending an email to majordomo@cs.umass.edu with the following text: subscribe cs691bd <your-email-address>)


Goals

This seminar course covers recent advances in big data systems, an increasingly important and challenging area concerned with processing and analyzing extremely large amounts of data. We will present and discuss recent research papers in big data systems, coupled with a project involving one of several real-life, large-scale datasets crawled from a variety of online sources. The projects will help students become familiar with tools and platforms such as Hadoop and Amazon AWS, and with adapting data mining algorithms to meet the challenges of data analysis at scale. The class may optionally be taken for one credit without the project; this option involves only paper presentations.



Course Materials

Much of the course material will be based on research papers and class notes posted on the Schedule page. For projects and background reading, the online textbook Mining of Massive Datasets will be useful. The course emphasizes systems papers, but the projects will involve the design and implementation of large-scale data mining techniques.


Prerequisites

An undergraduate-level knowledge of operating systems, networking, databases, and algorithms.