COMPSCI 514: Algorithms for Data Science (Fall 2020)
Time: Tue/Thurs 1:00pm-2:15pm
Location:
Lecture Zoom. See announcements for password. All lectures will be recorded and posted under the
Schedule tab for those who cannot make the live lecture.
Professor:
Cameron Musco
- Email: cmusco at cs dot umass dot edu.
- Office: CS 234
- Office Hours: Tuesday 8am-9am and 2:15pm-3:15pm at Office Hours Zoom. See announcements for password.
Teaching Assistants:
- Pratheba Selvaraju
- Email: pselvaraju at cs dot umass dot edu
- Office Hours: Monday 2pm-3pm, Wednesday 1pm-2pm, Friday 11am-12pm at Pratheba Office Hours Zoom. Zoom password is the same as for lecture.
- Shiv Shankar
- Email: sshankar at cs dot umass dot edu
- Office Hours: Wednesday 9am-10am at Shiv Office Hours Zoom. Zoom password is the same as for lecture.
Course Description:
With the advent of social networks, ubiquitous sensors, and large-scale computational science, data scientists must deal with data that is massive in size,
arrives at blinding speeds, and often must be processed within interactive or quasi-interactive time frames. This course studies the mathematical foundations
of big data processing, developing algorithms and learning how to analyze them. We explore methods for sampling, sketching, and distributed processing of
large-scale databases, graphs, and data streams for purposes of scalable statistical description, querying, pattern mining, and learning. This course was
previously COMPSCI 590D. 3 credits.
Prerequisites:
The undergraduate prerequisites are COMPSCI 240 (Probability) and COMPSCI 311 (Algorithms). This is a theoretical course with an emphasis on algorithm design, correctness proofs, and analysis. Aside from a general background in algorithms, a strong mathematical background, particularly in linear algebra and probability, is required. If you are a master's student with a limited background in either of these subjects, please email me at the start of the semester.
Textbooks: There is no official textbook for this class. We will use some material from:
Related Classes: You may also find some helpful reference material in these similar classes taught at other universities:
Piazza: We will use Piazza for class discussion, questions, and announcements. Sign up here. We hope for Piazza to be one of the main interactive components of the class. Thus, we encourage posting questions and giving good answers to other students' questions; both count toward the up to 5% extra credit for class participation (see below).
Grading:
- Problem Sets (6 total): 40%, weighted equally.
- Weekly Quizzes: 10%, weighted equally.
- Midterm: 25%.
- Final: 25%.
Problem Sets: Problem sets can be completed in groups of up to three students. If you work in a group, you submit a single problem set together. You may talk to people not in your group about the problem sets at a high level, but may not work through the detailed solutions together, write them up together, etc. We very strongly encourage you to work in a three-person group, as it will give you an advantage on the problem sets. At the beginning of the semester we will make a Piazza post where you can look for teammates. We will also have random breakouts during lecture so that you can get to know some of your classmates.
- Problem set submissions will be via Gradescope. If working in a group, only one member of each group should submit the problem set, marking the other members in the group as part of the submission in Gradescope.
- The entry code for Gradescope is 9DV6G5. Please sign up and complete the Gradescope consent poll in Piazza by 9/3.
- No late homework submissions will be accepted unless there are extenuating circumstances, approved by the instructor before the deadline.
- I strongly encourage students to type up problem sets using either LaTeX or Markdown. A LaTeX template for problem sets can be downloaded here. For editing Markdown, I use Typora, which supports LaTeX-style math equations (see here). While they may seem cumbersome at first, these tools will save you a lot of time in the long run!
Weekly Quizzes: A quiz will be posted on Piazza each Thursday after class, due the following Monday at 8pm. These are very short quizzes (designed to take ~15 minutes) to check that you are following the material and help me make adjustments if needed. Quizzes will include check-in questions asking for feedback on class pacing and on topics that need clarification, or that you would like to see discussed more.
Exams: The midterm and final will be take-home, open-note exams. For each, you will have a 1.5-hour window to complete the exam within a 48-hour period. The midterm will be October 8th-9th. The final will be December 1st-2nd.
Class Participation: Up to 5% extra credit may be awarded for class participation. This may come in many forms, e.g.:
- Asking good clarifying questions and answering questions during the live lecture.
- Actively participating in office hours.
- Asking good clarifying questions and answering other students' or instructor questions on Piazza.
- Posting helpful links on Piazza, e.g., resources that cover class material, research articles related to the topics covered in class, etc.
Academic Honesty: If caught violating the problem set or quiz rules, students will receive a 0% on the assignment for the first violation, and fail the class for a second violation. Any cheating on the midterm or final will lead to failing the class. For fairness, we apply these rules universally, without exceptions.
Disability Services: UMass Amherst is committed to making reasonable, effective, and appropriate accommodations to meet the needs of students with disabilities and help create a barrier-free campus. If you have a documented disability on file with
Disability Services, you may be eligible for reasonable accommodations in this course. If your disability requires an accommodation, please notify me within the first two weeks of the course so that we may make arrangements in a timely manner.