- Email: cmusco at cs dot umass dot edu.
- Office: CS 234
- Office Hours: Tuesday 2:30pm-3:30pm (directly after class) in CS 234.
- How to Contact: If you need to chat or schedule an individual meeting, you can reach out over email, via a Piazza message, or in person, after class or during office hours.

- Foundations of Data Science, Avrim Blum, John Hopcroft and Ravi Kannan.
- Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman and Jeff Ullman.
- Probability and Computing, Michael Mitzenmacher and Eli Upfal.

- The Modern Algorithmic Toolbox, Gregory Valiant at Stanford.
- Sketching Algorithms for Big Data, Piotr Indyk and Jelani Nelson at MIT/Harvard.
- Algorithmic Techniques for Big Data, Moses Charikar at Stanford.
- COMPSCI 514 last year (Fall 2021).

- Problem Sets (5 total): 40%, split equally between core competency problems and challenge problems (see details below).
- Weekly Quizzes: 10%, weighted equally, lowest score dropped.
- Midterm: 25%.
- Final: 25%.
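
As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes every component is expressed as a fraction between 0 and 1, and that quizzes are averaged after dropping the single lowest score. The function and variable names are hypothetical and are not part of any official grading script.

```python
# Weights taken from the grading breakdown: problem sets are 40% total,
# split equally between core competency and challenge problems.
WEIGHTS = {
    "core": 0.20,
    "challenge": 0.20,
    "quizzes": 0.10,
    "midterm": 0.25,
    "final": 0.25,
}


def quiz_average(scores):
    """Average of equally weighted quiz scores with the lowest one dropped."""
    if len(scores) < 2:
        raise ValueError("need at least two quiz scores to drop the lowest")
    kept = sorted(scores)[1:]  # drop the single lowest score
    return sum(kept) / len(kept)


def course_grade(core, challenge, quiz_scores, midterm, final):
    """Weighted course grade; all inputs are fractions in [0, 1]."""
    components = {
        "core": core,
        "challenge": challenge,
        "quizzes": quiz_average(quiz_scores),
        "midterm": midterm,
        "final": final,
    }
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical student: quiz scores 1.0, 0.5, 0.9 average to 0.95 after the drop.
grade = course_grade(0.9, 0.8, [1.0, 0.5, 0.9], 0.8, 0.9)
print(round(grade, 2))  # 0.86
```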

- Core competency questions are designed to help you master the key algorithmic and mathematical tools introduced in the course. They will be similar in difficulty to exam questions, and if you are able to solve them, you should be well prepared for the in-class exams.
- You are expected to complete all core competency questions. They will be graded numerically, and count for 20% of the final grade (equally weighted across problem sets).
- Challenge questions are designed to strengthen your ability to think creatively about algorithmic problems and push beyond known approaches, to develop solutions of your own. They will require significantly more time to digest and solve than core competency questions.
- Each problem set will contain roughly three challenge questions. You can choose which ones you wish to complete, and may attempt as many as you like. In total, the challenge questions will count for 20% of your final grade.
- Each challenge question will be graded on a scale of X, ✓-, ✓, ✓+. These marks will count towards your grade as follows: an X is worth 0 points, each ✓- is worth 1 point, each ✓ is worth 2 points, and each ✓+ is worth 3 points. **Full credit is obtained by scoring 15 points total throughout the semester.** Partial credit is assigned proportionally (e.g., if you score 12 points total throughout the semester, you will receive 80% on this component of the course).
- The rubric for challenge question grading is as follows:
- ✓+: Submitted work is fully correct and clearly presented. It could be used as a reference solution for the problem. Any errors are minor and easily correctable.
- ✓: Submitted work demonstrates a full understanding of the problem. There may be some errors, omissions, or unclear steps, but overall, a reader would be able to understand how to solve the problem by looking at the submitted work.
- ✓-: Submitted work demonstrates partial understanding of the concepts, but contains significant omissions or errors.
- X: Submitted work does not provide enough information to determine whether there is understanding of the problem.
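
The challenge-question scoring described above can be sketched as a short calculation. This is only an illustration: the names are hypothetical, and capping the result at 100% once 15 points are reached is an assumption, since the policy only states that 15 points earns full credit.

```python
# Point value of each mark, as listed in the grading policy.
MARK_POINTS = {"X": 0, "✓-": 1, "✓": 2, "✓+": 3}

# Total points needed over the semester for full credit.
FULL_CREDIT_POINTS = 15


def challenge_fraction(marks):
    """Fraction of the challenge component earned from a list of marks."""
    total = sum(MARK_POINTS[mark] for mark in marks)
    # Assumption: points beyond 15 earn no extra credit, so cap at 1.0.
    return min(total / FULL_CREDIT_POINTS, 1.0)


# Hypothetical semester: 3 + 2 + 2 + 1 + 3 + 1 = 12 points.
marks = ["✓+", "✓", "✓", "✓-", "✓+", "✓-"]
print(challenge_fraction(marks))  # 0.8
```

Because only the point total matters, a student can reach 15 points with, for example, five ✓+ marks or with a larger number of ✓ and ✓- marks.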

- While we encourage working in groups, we expect all members of a group to collaborate on and understand their submitted solutions. Some exam problems may closely resemble previous homework problems, and so understanding their solutions will be critical to your success in the course.
- Problem set submissions will be via Gradescope. If working in a group, only one member of each group should submit the problem set, marking the other members in the group as part of the submission in Gradescope.
- The entry code for Gradescope is `WB38GP`. **Core competency problems and challenge problems will be submitted separately in Gradescope -- you will see a separate Gradescope assignment for each component.** You do not necessarily need to submit with the same group for the different question types or different problem sets.
- No late homework submissions will be accepted unless there are extenuating circumstances, approved by the instructor before the deadline.
- I strongly encourage students to type up problem sets using LaTeX. A LaTeX template for problem sets can be downloaded here. While it may seem cumbersome at first, getting proficient in LaTeX will save you a lot of time in the long run!

- Asking good clarifying questions and answering questions during lecture.
- Actively participating in office hours.
- Asking good clarifying questions and answering other students' or the instructor's questions on Piazza.
- Posting helpful links on Piazza, e.g., resources that cover class material, research articles related to the topics covered in class, etc.

I understand that people have different learning needs, home situations, etc. If something isn’t working for you in the class, please reach out and let’s try to work it out.

- Students will learn about modern tools for data processing, including random sampling and hashing, low-memory streaming algorithms, linear and non-linear dimensionality reduction, spectral graph theory, and continuous optimization. A major goal is to be familiar at a high level with a breadth of algorithmic tools beyond combinatorial algorithms, which are the main focus of most undergraduate algorithms courses.
- Through problem sets, students will develop the ability to apply and modify these algorithmic tools to tackle new problems, beyond those discussed in class. They will strengthen their ability to think creatively about algorithmic problems and push beyond known approaches, to develop solutions of their own.
- Through assessments that emphasize formal proofs, students will strengthen their ability to formulate problems mathematically and analyze them rigorously.
- Through algorithmic problems, students will practice applying fundamental tools in probability theory and linear algebra, which are broadly applicable in data science and machine learning. These include concentration bounds and methods for decomposing complex random variables, eigendecomposition, orthogonal projection, important matrix identities, and fundamentals of high-dimensional geometry and random matrix theory.