Course Schedule (Evolving)

Lecture recordings from Echo360 can be accessed here.

Lecture  Date  Day  Topic  Materials/Reading
1. 9/2 Thu Course overview. Probability review. Slides. Compressed slides.
MIT short videos and exercises on probability. Khan academy probability lessons (a bit more basic).
Randomized Methods, Sketching & Streaming
2. 9/7 Tue Estimating set size by counting duplicates. Concentration Bounds: Markov's inequality. Random hashing for efficient lookup. Slides. Compressed slides.
3. 9/9 Thu 2-level hashing. 2-universal and pairwise independent hashing. Hashing for load balancing. Slides. Compressed slides. Some notes (Arora and Kothari at Princeton) proving that the ax+b mod p hash function described in class is 2-universal.
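For a concrete picture of the family above, here is a minimal sketch (not part of the course materials) of drawing a random hash function h(x) = ((a·x + b) mod p) mod m; the prime p and table size m below are illustrative choices.

```python
import random

def make_hash(p, m):
    """Draw one function h(x) = ((a*x + b) mod p) mod m from the 2-universal family.

    p should be a prime larger than any key x; m is the hash table size.
    """
    a = random.randint(1, p - 1)
    b = random.randint(0, p - 1)
    return lambda x: ((a * x + b) % p) % m

# Example: hash integer keys into a table of size 100 using the prime 2^31 - 1.
h = make_hash(p=2**31 - 1, m=100)
print(h(42), h(43))
```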
4. 9/14 Tue Finish up hashing for load balancing. Chebyshev's inequality. The union bound. Slides. Compressed slides.
5. 9/16 Thu Exponential concentration bounds and the central limit theorem. Slides. Compressed slides.
6. 9/21 Tue Bloom Filters. Slides. Compressed slides. Reading: Chapter 4 of Mining of Massive Datasets, with content on Bloom filters and distinct elements counting. See here for the full Bloom filter analysis. See here for some explanation of why a version of a Bloom filter with no false negatives cannot be achieved without using a lot of space. See Wikipedia for a discussion of the many Bloom filter variants, including counting Bloom filters and Bloom filters with deletions. See Wikipedia again and these notes for an explanation of Cuckoo Hashing, a randomized hash table scheme which, like 2-level hashing, has O(1) query time, but also has expected O(1) insertion time.
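A small illustrative sketch (not from the assigned reading) of the basic Bloom filter idea: k hash positions per item, no false negatives, a tunable false-positive rate. The choice of SHA-256 with salts as the hash family is an assumption for the example.

```python
import hashlib

class BloomFilter:
    """A basic Bloom filter: m bits, k hash functions per inserted item."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [False] * m

    def _positions(self, item):
        # Derive k positions from salted SHA-256 digests (one simple choice of hash family).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # True may be a false positive; False is always correct (no false negatives).
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(m=1000, k=5)
bf.add("massive datasets")
print(bf.might_contain("massive datasets"), bf.might_contain("not inserted"))
```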
7. 9/23 Thu Min-Hashing for Distinct Elements. The median trick. Slides. Compressed slides. Reading: Chapter 4 of Mining of Massive Datasets, with content on distinct elements counting.
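A rough sketch (illustrative, not the lecture's exact algorithm) of min-hashing for distinct elements combined with the median trick: each element is hashed to [0, 1], the minimum hash value m gives the estimate 1/m - 1, and the median of independent repetitions boosts the success probability.

```python
import random
import statistics

def minhash_estimate(stream, seed):
    """Estimate distinct elements as 1/min - 1, where min is the smallest hash
    value seen, with hash values treated as uniform on [0, 1]."""
    m = 1.0
    for x in stream:
        random.seed(f"{seed}:{x}")   # stand-in for a random hash function to [0, 1]
        m = min(m, random.random())
    return 1.0 / m - 1.0

def median_trick(stream, trials=9):
    """Median of independent estimates: amplifies the success probability of each run."""
    return statistics.median(minhash_estimate(stream, seed=t) for t in range(trials))

stream = [i % 500 for i in range(10_000)]   # 500 distinct elements, many duplicates
print(round(median_trick(stream)))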
8. 9/28 Tue Distinct elements in practice: Flajolet-Martin and HyperLogLog. Start on Jaccard similarity and near neighbor search. Slides. Compressed slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing. The 2007 paper introducing the popular HyperLogLog distinct elements algorithm.
9. 9/30 Thu Jaccard similarity estimation with MinHash. Locality sensitive hashing for fast similarity search. Slides. Compressed slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing.
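A minimal sketch (illustrative only) of MinHash signatures for Jaccard estimation: the probability that two signatures agree in a coordinate equals the Jaccard similarity, so the fraction of agreeing coordinates is the estimate. In LSH, the signature rows would then be grouped into bands to bucket likely-similar pairs; the hash family parameters below are assumptions for the example.

```python
import random

def minhash_signature(items, num_hashes=100, seed=0):
    """MinHash signature of a set: for each random hash, record the minimum value
    over the set's elements."""
    rng = random.Random(seed)
    p = 2**31 - 1
    params = [(rng.randint(1, p - 1), rng.randint(0, p - 1)) for _ in range(num_hashes)]
    # Python's built-in hash() is consistent within a single run, which is all we need here.
    return [min((a * hash(x) + b) % p for x in items) for (a, b) in params]

def estimated_jaccard(sig1, sig2):
    return sum(s1 == s2 for s1, s2 in zip(sig1, sig2)) / len(sig1)

A = set("the quick brown fox jumps over the lazy dog".split())
B = set("the quick brown fox leaps over a lazy cat".split())
true_jaccard = len(A & B) / len(A | B)
sigA, sigB = minhash_signature(A), minhash_signature(B)
print(true_jaccard, estimated_jaccard(sigA, sigB))
```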
10. 10/5 Tue The frequent elements problem and count-min sketch. Slides. Compressed slides. Reading: Notes (Amit Chakrabarti at Dartmouth) on streaming algorithms. See Chapters 2 and 4 for frequent elements. Some more notes on the frequent elements problem. A website with lots of resources, implementations, and example applications of count-min sketch.
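A compact sketch (not from the linked notes) of a count-min sketch: d rows of w counters, each row indexed by an independent hash; a point query returns the minimum counter, which overestimates the true count by roughly N/w with good probability.

```python
import random

class CountMinSketch:
    """Count-min sketch with d hash rows of w counters each."""

    def __init__(self, w, d, seed=0):
        rng = random.Random(seed)
        self.w, self.d = w, d
        self.p = 2**31 - 1
        self.params = [(rng.randint(1, self.p - 1), rng.randint(0, self.p - 1))
                       for _ in range(d)]
        self.table = [[0] * w for _ in range(d)]

    def _pos(self, row, item):
        a, b = self.params[row]
        return ((a * hash(item) + b) % self.p) % self.w

    def add(self, item):
        for r in range(self.d):
            self.table[r][self._pos(r, item)] += 1

    def query(self, item):
        # Never underestimates; collisions can only inflate the count.
        return min(self.table[r][self._pos(r, item)] for r in range(self.d))

cms = CountMinSketch(w=200, d=5)
for i in range(10_000):
    cms.add(i % 50)                     # each of 50 items appears 200 times
print(cms.query(7), cms.query("never added"))
```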
11. 10/7 Thu Dimensionality reduction, low-distortion embeddings, and the Johnson-Lindenstrauss Lemma. Slides. Compressed slides. Reading: Chapter 2.7 of Foundations of Data Science on the Johnson-Lindenstrauss lemma. Notes on the JL-Lemma (Anupam Gupta at CMU). Sparse random projections, which can be applied (multiplied by) more quickly. Linear Algebra Review: Khan academy.
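A small numerical sketch (illustrative, assuming a dense Gaussian projection rather than the sparse variants linked above) of the JL-style random projection: multiply by a random Gaussian matrix scaled by 1/sqrt(k), and pairwise distances are preserved up to (1 ± eps) when k = O(log n / eps^2).

```python
import numpy as np

def jl_project(X, k, seed=0):
    """Project the rows of X to k dimensions with a scaled random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    Pi = rng.normal(size=(d, k)) / np.sqrt(k)
    return X @ Pi

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10_000))         # 100 points in 10,000 dimensions
Y = jl_project(X, k=500)
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
print(orig, proj)                          # the two distances should be close
```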
12. 10/12 Tue Finish up Johnson-Lindenstrauss lemma proof. High-dimensional geometry and its connection to the JL Lemma. Slides. Compressed slides. Reading: Chapters 2.3-2.6 of Foundations of Data Science on high-dimensional geometry.
13. 10/14 Thu Finish up connections between JL and high-dimensional geometry. Midterm Review. Slides.
10/19 Tue Midterm. Study guide and review questions.
Spectral Methods
14. 10/21 Thu Intro to principal component analysis, low-rank approximation, data-dependent dimensionality reduction. Orthogonal bases and projection matrices. Slides. Compressed slides. Reading: Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD.
15. 10/26 Tue Best fit subspaces and optimal low-rank approximation via eigendecomposition. Slides. Compressed slides. Reading: Some notes on PCA and its connection to eigendecomposition. Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation. Proof that optimal low-rank approximation can be found greedily (see Section 1.1). Some good videos for linear algebra review. Some other good videos.
16. 10/28 Thu Finish up optimal low-rank approximation via eigendecomposition. Slides. Compressed slides. Reading: Notes on SVD and its connection to eigendecomposition/PCA (Roughgarden and Valiant at Stanford). Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD.
17. 11/2 Tue Singular value decomposition and connections to low-rank approximation. Applications of low-rank approximation beyond compression. Matrix completion. Slides. Compressed slides. Reading: Notes on SVD and its connection to eigendecomposition/PCA (Roughgarden and Valiant at Stanford). Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. Notes on matrix completion, with proof of recovery under incoherence assumptions (Jelani Nelson at Harvard).
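A short numerical sketch (not from the linked notes) of the connection between the SVD and optimal low-rank approximation: keeping the top k singular values/vectors gives the best rank-k approximation in Frobenius and spectral norm.

```python
import numpy as np

def best_rank_k(A, k):
    """Optimal rank-k approximation of A via the truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k] @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 40))    # exactly rank 5
A_noisy = A + 0.01 * rng.normal(size=A.shape)
A_k = best_rank_k(A_noisy, k=5)
print(np.linalg.norm(A_noisy - A_k))       # small: most of the energy is in the top 5 components
```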
18. 11/4 Thu Finish application of low-rank approximation to entity embeddings and non-linear dimensionality reduction. Start on spectral graph theory. Slides. Compressed slides. Reading: Levy Goldberg paper on word embeddings as implicit low-rank approximation.
19. 11/9 Tue Spectral clustering. Slides. Compressed slides. Reading: Chapter 10.4 of Mining of Massive Datasets on spectral graph partitioning. For a lot more interesting material on spectral graph methods see Dan Spielman's lecture notes. Great notes on spectral graph methods (Roughgarden and Valiant at Stanford).
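An illustrative sketch (a simplified version of the methods in the reading) of spectral partitioning: split a graph by the sign pattern of the second-smallest eigenvector of its Laplacian (the Fiedler vector). The toy two-clique graph below is an assumption for the example.

```python
import numpy as np

def spectral_partition(A):
    """Partition a graph by the sign of the Fiedler vector of its unnormalized Laplacian.

    A is a symmetric adjacency matrix."""
    D = np.diag(A.sum(axis=1))
    L = D - A                                 # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                   # eigenvector of the second-smallest eigenvalue
    return fiedler >= 0

# Two 5-node cliques joined by a single edge: the cut should separate the cliques.
A = np.zeros((10, 10))
A[:5, :5] = 1
A[5:, 5:] = 1
np.fill_diagonal(A, 0)
A[4, 5] = A[5, 4] = 1
print(spectral_partition(A))
```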
11/11 Thu Veterans Day. No Class.
20. 11/16 Tue The stochastic block model. Slides. Compressed slides. Reading: Stochastic block model notes (Alessandro Rinaldo at CMU). A survey of the vast literature on the stochastic block model, beyond the spectral methods discussed in class (Emmanuel Abbe at Princeton).
21. 11/18 Thu Computing the SVD: power method. Slides. Compressed slides. Reading: Chapter 3.7 of Foundations of Data Science on the power method for SVD. Some notes on the power method (Roughgarden and Valiant at Stanford).
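A minimal sketch (illustrative) of the power method: repeatedly apply the matrix and renormalize, converging to the top eigenvector at a rate governed by the eigenvalue gap. Applied to X^T X, it recovers the top right singular vector of X.

```python
import numpy as np

def power_method(A, iters=100, seed=0):
    """Power method for the top eigenvector of a symmetric PSD matrix A."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 30))
v = power_method(X.T @ X)
_, _, Vt = np.linalg.svd(X)
print(abs(v @ Vt[0]))                      # close to 1: matches the top right singular vector
```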
22. 11/23 Tue Computing the SVD continued: power method analysis. Krylov methods. Connection to random walks and Markov chains. Slides. Compressed slides.
11/25 Thu Thanksgiving Recess. No class.
Optimization
23. 11/30 Tue Start on optimization and gradient descent analysis. Slides. Compressed slides. Reading: Chapters I and III of these notes (Hardt at Berkeley). Multivariable calculus review, e.g., through Khan academy.
24. 12/2 Thu Gradient descent analysis for convex functions. Slides. Compressed slides. Reading: Chapters I and III of these notes (Hardt at Berkeley).
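A short sketch (illustrative, not from the linked notes) of plain gradient descent with a fixed step size on a convex problem; the least-squares objective and step-size choice below are assumptions for the example.

```python
import numpy as np

def gradient_descent(grad, x0, step, iters):
    """Fixed-step gradient descent; for a convex, beta-smooth function with
    step <= 1/beta, the iterates converge to the optimum at rate O(1/iters)."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - step * grad(x)
    return x

# Example: least squares f(x) = 0.5 * ||Ax - b||^2, with gradient A^T (Ax - b).
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 10))
b = rng.normal(size=100)
grad = lambda x: A.T @ (A @ x - b)
step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1/beta, beta = largest eigenvalue of A^T A
x_hat = gradient_descent(grad, np.zeros(10), step, iters=2000)
print(np.linalg.norm(A.T @ (A @ x_hat - b)))   # gradient norm should be near zero
```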
25. 12/7 Tue Constrained optimization and projected gradient descent. Course conclusion/review. Slides. Compressed slides. Reading (optional, on topics we won't cover): Short notes proving a regret bound for online gradient descent. A good book (by Elad Hazan) on online optimization, including online gradient descent and its connection to stochastic gradient descent.
12/16, 10:30am - 12:30pm Final Exam. Study guide and review questions.