Lecture | Day | Topic | Materials/Reading |
---|---|---|---|
1. 9/2 | Thu | Course overview. Probability review. | Slides. Compressed slides. MIT short videos and exercises on probability. Khan Academy probability lessons (a bit more basic). |
Randomized Methods, Sketching & Streaming | |||
2. 9/7 | Tue | Estimating set size by counting duplicates. Concentration Bounds: Markov's inequality. Random hashing for efficient lookup. | Slides. Compressed slides. |
3. 9/9 | Thu | 2-level hashing. 2-universal and pairwise independent hashing. Hashing for load balancing. | Slides. Compressed slides. Some notes (Arora and Kothari at Princeton) proving that the ax+b mod p hash function described in class is 2-universal. |
4. 9/14 | Tue | Finish up hashing for load balancing. Chebyshev's inequality. The union bound. | Slides. Compressed slides. |
5. 9/16 | Thu | Exponential concentration bounds and the central limit theorem. | Slides. Compressed slides. |
6. 9/21 | Tue | Bloom Filters. | Slides. Compressed slides. Reading: Chapter 4 of Mining of Massive Datasets, with content on Bloom filters and distinct elements counting. See here for full Bloom filter analysis. See here for some explanation of why a version of a Bloom filter with no false negatives cannot be achieved without using a lot of space. See Wikipedia for a discussion of the many Bloom filter variants, including counting Bloom filters, and Bloom filters with deletions. See Wikipedia again and these notes for an explanation of Cuckoo Hashing, a randomized hash table scheme which, like 2-level hashing, has O(1) query time, but also has expected O(1) insertion time. |
7. 9/23 | Thu | Min-Hashing for Distinct Elements. The median trick. | Slides. Compressed slides. Reading: Chapter 4 of Mining of Massive Datasets, with content on distinct elements counting. |
8. 9/28 | Tue | Distinct elements in practice: Flajolet-Martin and HyperLogLog. Start on Jaccard similarity and near neighbor search. | Slides. Compressed slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing. The 2007 paper introducing the popular HyperLogLog distinct elements algorithm. |
9. 9/30 | Thu | Jaccard similarity estimation with MinHash. Locality sensitive hashing for fast similarity search. | Slides. Compressed slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing. |
10. 10/5 | Tue | The frequent elements problem and count-min sketch. | Slides. Compressed slides. Reading: Notes (Amit Chakrabarti at Dartmouth) on streaming algorithms. See Chapters 2 and 4 for frequent elements. Some more notes on the frequent elements problem. A website with lots of resources, implementations, and example applications of count-min sketch. |
11. 10/7 | Thu | Dimensionality reduction, low-distortion embeddings, and the Johnson Lindenstrauss Lemma. | Slides. Compressed slides. Reading: Chapter 2.7 of Foundations of Data Science on the Johnson-Lindenstrauss lemma. Notes on the JL-Lemma (Anupam Gupta CMU). Sparse random projections, which can be multiplied by more quickly than dense ones. Linear Algebra Review: Khan Academy. |
12. 10/12 | Tue | Finish up Johnson-Lindenstrauss lemma proof. High-dimensional geometry and its connection to the JL Lemma. | Slides. Compressed slides. Reading: Chapters 2.3-2.6 of Foundations of Data Science on high-dimensional geometry. |
13. 10/14 | Thu | Finish up connections between JL and high-dimensional geometry. Midterm Review. | Slides. |
10/19 | Tue | Midterm | Study guide and review questions. |
Spectral Methods | |||
14. 10/21 | Thu | Intro to principal component analysis, low-rank approximation, data-dependent dimensionality reduction. Orthogonal bases and projection matrices. | Slides. Compressed slides. Reading: Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. |
15. 10/26 | Tue | Best fit subspaces and optimal low-rank approximation via eigendecomposition. | Slides. Compressed slides. Reading: Some notes on PCA and its connection to eigendecomposition. Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation. Proof that optimal low-rank approximation can be found greedily (see Section 1.1). Some good videos for linear algebra review. Some other good videos. |
16. 10/28 | Thu | Finish up optimal low-rank approximation via eigendecomposition. | Slides. Compressed slides. Reading: Notes on SVD and its connection to eigendecomposition/PCA (Roughgarden and Valiant at Stanford). Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. |
17. 11/2 | Tue | Singular value decomposition and connections to low-rank approximation. Applications of low-rank approximation beyond compression. Matrix completion. | Slides. Compressed slides. Reading: Notes on SVD and its connection to eigendecomposition/PCA (Roughgarden and Valiant at Stanford). Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. Notes on matrix completion, with proof of recovery under incoherence assumptions (Jelani Nelson at Harvard). |
18. 11/4 | Thu | Finish application of low-rank approximation to entity embeddings and non-linear dimensionality reduction. Start on spectral graph theory. | Slides. Compressed slides. Reading: The Levy and Goldberg paper on word embeddings as implicit low-rank approximation. |
19. 11/9 | Tue | Spectral clustering. | Slides. Compressed slides. Reading: Chapter 10.4 of Mining of Massive Datasets on spectral graph partitioning. For a lot more interesting material on spectral graph methods see Dan Spielman's lecture notes. Great notes on spectral graph methods (Roughgarden and Valiant at Stanford). |
11/11 | Thu | Veterans Day. No Class. |
20. 11/16 | Tue | The stochastic block model. | Slides. Compressed slides. Reading: Stochastic block model notes (Alessandro Rinaldo at CMU). A survey of the vast literature on the stochastic block model, beyond the spectral methods discussed in class (Emmanuel Abbe at Princeton). |
21. 11/18 | Thu | Computing the SVD: power method. | Slides. Compressed slides. Reading: Chapter 3.7 of Foundations of Data Science on the power method for SVD. Some notes on the power method (Roughgarden and Valiant at Stanford). |
22. 11/23 | Tue | Computing the SVD continued: power method analysis. Krylov methods. Connection to random walks and Markov chains. | Slides. Compressed slides. |
11/25 | Thu | Thanksgiving Recess. No class. | |
Optimization | |||
23. 11/30 | Tue | Start on optimization and gradient descent analysis. | Slides. Compressed slides. Reading: Chapters I and III of these notes (Hardt at Berkeley). Multivariable calculus review, e.g., through Khan Academy. |
24. 12/2 | Thu | Gradient descent analysis for convex functions. | Slides. Compressed slides. Reading: Chapters I and III of these notes (Hardt at Berkeley). |
25. 12/7 | Tue | Constrained optimization and projected gradient descent. Course conclusion/review. | Slides. Compressed slides. Reading (optional, on topics we won't cover): Short notes proving a regret bound for online gradient descent. A good book (by Elad Hazan) on online optimization, including online gradient descent and its connection to stochastic gradient descent. |
12/16 | Thu | Final Exam, 10:30am - 12:30pm. | Study guide and review questions. |
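
A few illustrative code sketches for some of the techniques above (not part of the official course materials). First, the ax+b mod p hash family from Lecture 3: a minimal Python sketch, assuming non-negative integer keys smaller than the prime p.

```python
import random

def make_hash(p, m):
    """Draw a random function from the family h(x) = ((a*x + b) mod p) mod m.

    p should be a prime larger than any key; drawing a from 1..p-1 and
    b from 0..p-1 makes the family 2-universal, i.e. any two distinct
    keys collide with probability about 1/m over the random draw.
    """
    a = random.randrange(1, p)
    b = random.randrange(0, p)
    return lambda x: ((a * x + b) % p) % m

# Example: hash integer keys into a table with 10 buckets,
# using the Mersenne prime p = 2**31 - 1.
h = make_hash(2**31 - 1, 10)
assert 0 <= h(42) < 10
```

Each call to `make_hash` fixes one random function from the family; the same key always maps to the same bucket under that function.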
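
For the frequent elements problem (Lecture 10), a toy count-min sketch can be built directly on top of the same 2-universal hash family; this is a simplified sketch for integer items, not the course's reference implementation.

```python
import random

class CountMinSketch:
    """Toy count-min sketch: d independent hash rows, each of width w.

    Estimates never undercount; each row overcounts by at most roughly
    (total insertions)/w in expectation, and taking the minimum over the
    d rows makes a large overcount unlikely.
    """
    def __init__(self, w, d, p=2**31 - 1):
        self.w, self.d, self.p = w, d, p
        self.table = [[0] * w for _ in range(d)]
        # One (a, b) pair per row, giving the hash ((a*x + b) mod p) mod w.
        self.coeffs = [(random.randrange(1, p), random.randrange(0, p))
                       for _ in range(d)]

    def _hash(self, i, x):
        a, b = self.coeffs[i]
        return ((a * x + b) % self.p) % self.w

    def add(self, x):
        for i in range(self.d):
            self.table[i][self._hash(i, x)] += 1

    def estimate(self, x):
        return min(self.table[i][self._hash(i, x)] for i in range(self.d))
```

For example, after calling `add(5)` three times, `estimate(5)` is at least 3 (and usually exactly 3 when the table is wide relative to the stream length).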
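
The power method from Lectures 21-22 fits in a few lines as well; this sketch uses plain Python lists, a fixed starting vector, and a symmetric input matrix (in practice one starts from a random vector to avoid starting orthogonal to the top eigenvector).

```python
def power_method(A, iters=100):
    """Estimate the top eigenvalue/eigenvector of a symmetric matrix A
    (given as a list of lists) by repeated multiplication and normalization."""
    n = len(A)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v of the (unit-norm) final iterate.
    lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

# Example: [[2, 1], [1, 2]] has top eigenvalue 3 with eigenvector (1, 1)/sqrt(2).
top_eig, top_vec = power_method([[2.0, 1.0], [1.0, 2.0]])
```

The iterate converges at a rate governed by the eigenvalue gap, which is the quantity analyzed in the Lecture 22 slides.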
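
Finally, the basic gradient descent iteration analyzed in Lectures 23-24, sketched for a one-dimensional convex function; the step size eta here is an illustrative choice, not a tuned value.

```python
def gradient_descent(grad, x0, eta=0.1, iters=500):
    """Plain gradient descent: x_{t+1} = x_t - eta * grad(x_t)."""
    x = x0
    for _ in range(iters):
        x = x - eta * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2*(x - 3); the minimizer is x = 3.
x_star = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With this step size the iterates contract toward the minimizer geometrically, so `x_star` ends up very close to 3.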