Lecture (Date) | Day | Topic | Materials/Reading |
---|---|---|---|
1. 9/2 | Thu | Course overview. Probability review. | Slides. Compressed slides. MIT short videos and exercises on probability. Khan Academy probability lessons (a bit more basic). |
Randomized Methods, Sketching & Streaming | |||
2. 9/7 | Tue | Estimating set size by counting duplicates. Concentration Bounds: Markov's inequality. Random hashing for efficient lookup. | Slides. Compressed slides. |
3. 9/9 | Thu | 2-level hashing. 2-universal and pairwise independent hashing. Hashing for load balancing. | Slides. Compressed slides. Some notes (Arora and Kothari at Princeton) proving that the ax+b mod p hash function described in class is 2-universal. |
4. 9/14 | Tue | Finish up hashing for load balancing. Chebyshev's inequality. The union bound. | Slides. Compressed slides. |
5. 9/16 | Thu | Exponential concentration bounds and the central limit theorem. | Slides. Compressed slides. |
6. 9/21 | Tue | Bloom Filters. | Slides. Compressed slides. Reading: Chapter 4 of Mining of Massive Datasets, with content on Bloom filters and distinct elements counting. See here for a full Bloom filter analysis. See here for an explanation of why a version of a Bloom filter with no false negatives cannot be achieved without using a lot of space. See Wikipedia for a discussion of the many Bloom filter variants, including counting Bloom filters and Bloom filters with deletions. See Wikipedia again and these notes for an explanation of Cuckoo Hashing, a randomized hash table scheme which, like 2-level hashing, has O(1) query time, but also has expected O(1) insertion time. |
7. 9/23 | Thu | Min-Hashing for Distinct Elements. The median trick. | Slides. Compressed slides. Reading: Chapter 4 of Mining of Massive Datasets, with content on distinct elements counting. |
8. 9/28 | Tue | Distinct elements in practice: Flajolet-Martin and HyperLogLog. Start on Jaccard similarity and near neighbor search. | Slides. Compressed slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing. The 2007 paper introducing the popular HyperLogLog distinct elements algorithm. |
9. 9/30 | Thu | Jaccard similarity estimation with MinHash. Locality sensitive hashing for fast similarity search. | Slides. Compressed slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing. |
10. 10/5 | Tue | The frequent elements problem and count-min sketch. | Slides. Compressed slides. Reading: Notes (Amit Chakrabarti at Dartmouth) on streaming algorithms. See Chapters 2 and 4 for frequent elements. Some more notes on the frequent elements problem. A website with lots of resources, implementations, and example applications of count-min sketch. |
11. 10/7 | Thu | Dimensionality reduction, low-distortion embeddings, and the Johnson-Lindenstrauss Lemma. | Slides. Compressed slides. Reading: Chapter 2.7 of Foundations of Data Science on the Johnson-Lindenstrauss lemma. Notes on the JL-Lemma (Anupam Gupta at CMU). Sparse random projections, which can be applied more quickly. Linear Algebra Review: Khan Academy. |
12. 10/12 | Tue | Finish up Johnson-Lindenstrauss lemma proof. High-dimensional geometry and its connection to the JL Lemma. | Slides. Compressed slides. Reading: Chapters 2.3-2.6 of Foundations of Data Science on high-dimensional geometry. |
13. 10/14 | Thu | Finish up connections between JL and high-dimensional geometry. Midterm Review. | Slides. |
10/19 | Tue | Midterm | Study guide and review questions. |
Spectral Methods | |||
14. 10/21 | Thu | Intro to principal component analysis, low-rank approximation, data-dependent dimensionality reduction. Orthogonal bases and projection matrices. | Slides. Compressed slides. Reading: Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. |
15. 10/26 | Tue | Best fit subspaces and optimal low-rank approximation via eigendecomposition. | Slides. Compressed slides. Reading: Some notes on PCA and its connection to eigendecomposition. Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation. Proof that optimal low-rank approximation can be found greedily (see Section 1.1). Some good videos for linear algebra review. Some other good videos. |
16. 10/28 | Thu | Finish up optimal low-rank approximation via eigendecomposition. | Slides. Compressed slides. Reading: Notes on SVD and its connection to eigendecomposition/PCA (Roughgarden and Valiant at Stanford). Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. |
17. 11/2 | Tue | Singular value decomposition and connections to low-rank approximation. Applications of low-rank approximation beyond compression. Matrix completion. | Slides. Compressed slides. Reading: Notes on SVD and its connection to eigendecomposition/PCA (Roughgarden and Valiant at Stanford). Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. Notes on matrix completion, with proof of recovery under incoherence assumptions (Jelani Nelson at Harvard). |
18. 11/4 | Thu | Finish up the application of low-rank approximation to entity embeddings and non-linear dimensionality reduction. Start on spectral graph theory. | Slides. Compressed slides. Reading: The Levy and Goldberg paper on word embeddings as implicit low-rank approximation. |
19. 11/9 | Tue | Spectral clustering. | Slides. Compressed slides. Reading: Chapter 10.4 of Mining of Massive Datasets on spectral graph partitioning. For a lot more interesting material on spectral graph methods see Dan Spielman's lecture notes. Great notes on spectral graph methods (Roughgarden and Valiant at Stanford). |
11/11 | Thu | Veterans Day. No Class. | |
20. 11/16 | Tue | The stochastic block model. | Slides. Compressed slides. Reading: Stochastic block model notes (Alessandro Rinaldo at CMU). A survey of the vast literature on the stochastic block model, beyond the spectral methods discussed in class (Emmanuel Abbe at Princeton). |
21. 11/18 | Thu | Computing the SVD: power method. | Slides. Compressed slides. Reading: Chapter 3.7 of Foundations of Data Science on the power method for SVD. Some notes on the power method (Roughgarden and Valiant at Stanford). |
22. 11/23 | Tue | Computing the SVD continued: power method analysis. Krylov methods. Connection to random walks and Markov chains. | Slides. Compressed slides. |
11/25 | Thu | Thanksgiving Recess. No class. | |
Optimization | |||
23. 11/30 | Tue | Start on optimization and gradient descent analysis. | Slides. Compressed slides. Reading: Chapters I and III of these notes (Hardt at Berkeley). Multivariable calculus review, e.g., through Khan Academy. |
24. 12/2 | Thu | Gradient descent analysis for convex functions. | Slides. Compressed slides. Reading: Chapters I and III of these notes (Hardt at Berkeley). |
25. 12/7 | Tue | Constrained optimization and projected gradient descent. Course conclusion/review. | Slides. Compressed slides. Reading (optional, on topics we won't cover): Short notes proving a regret bound for online gradient descent. A good book (by Elad Hazan) on online optimization, including online gradient descent and its connection to stochastic gradient descent. |
12/16, 10:30am - 12:30pm | Thu | Final Exam. | Study guide and review questions. |