9/3 |
Tue |
Course overview. Probability review, Markov's inequality. Estimating set size by counting duplicates. |
Slides. |
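To make the duplicate-counting idea concrete, here is a minimal Python sketch (not part of the course materials; it assumes we can draw uniform samples from the set): each of the m-choose-2 sampled pairs collides with probability 1/n, so the expected number of duplicate pairs is m(m-1)/(2n), and inverting this gives an estimator for n.

```python
import random

def estimate_set_size(universe, m=1000, seed=0):
    """Estimate n = |universe| from m uniform samples drawn with
    replacement, by counting duplicate pairs. Each pair of samples
    collides with probability 1/n, so E[#duplicate pairs] = m(m-1)/(2n)."""
    rng = random.Random(seed)
    sample = [rng.choice(universe) for _ in range(m)]
    counts = {}
    for x in sample:
        counts[x] = counts.get(x, 0) + 1
    # An element seen c times contributes c-choose-2 duplicate pairs.
    dup_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    if dup_pairs == 0:
        return float("inf")  # m too small to see any collision
    return m * (m - 1) / (2 * dup_pairs)

print(estimate_set_size(range(50_000)))  # roughly 50000
```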
Randomized Methods, Sketching & Streaming |
9/5 |
Thu |
Chebyshev's inequality. Random hashing for efficient lookup and load balancing. 2-universal and pairwise independent hashing. |
Slides. Some notes (Arora and Kothari at Princeton) proving that the ax+b mod p hash function described in class is 2-universal. |
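For reference, a short Python sketch (mine, not from the notes) of drawing a function from this family; it assumes integer keys smaller than the prime p:

```python
import random

def make_hash(m, p=2_147_483_647, seed=None):
    """Draw h(x) = ((a*x + b) mod p) mod m from the 2-universal family:
    p is a prime larger than the key universe, a is uniform in
    {1, ..., p-1}, and b is uniform in {0, ..., p-1}."""
    rng = random.Random(seed)
    a = rng.randrange(1, p)
    b = rng.randrange(p)
    return lambda x: ((a * x + b) % p) % m

h = make_hash(m=10, seed=42)
print([h(x) for x in range(20)])  # bucket assignments in {0, ..., 9}
```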
9/10 |
Tue |
Union bound. Exponential tail bounds (Bernstein and Chernoff). Example applications. |
Slides. Some notes (Goemans at MIT) showing how to prove the Chernoff bound using the moment generating function + Markov's inequality approach discussed in class. |
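The core step of that argument, sketched here for a sum X = X_1 + ... + X_n of independent random variables: since e^(λX) is nonnegative and increasing in X, Markov's inequality gives, for any λ > 0,

```latex
\Pr[X \ge t]
  = \Pr\!\left[e^{\lambda X} \ge e^{\lambda t}\right]
  \le \frac{\mathbb{E}\!\left[e^{\lambda X}\right]}{e^{\lambda t}}
  = e^{-\lambda t} \prod_{i=1}^{n} \mathbb{E}\!\left[e^{\lambda X_i}\right],
```

where the last equality uses independence. Bounding each moment generating function factor and optimizing over λ yields the Chernoff bound.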
9/12 |
Thu |
Hashing continued. Bloom filters and their applications. Hashing for distinct elements. |
Slides. I've added a sketch of the correct Bloom filter analysis. Also see here. See here for some explanation of why a version of a Bloom filter with no false negatives cannot be achieved without using a lot of space. |
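For a concrete picture of the data structure, here is a minimal Bloom filter in Python (my sketch, using salted SHA-256 in place of the idealized random hash functions from class):

```python
import hashlib

class BloomFilter:
    """A minimal Bloom filter: k hash functions into a bit array of
    n_bits positions. insert() never errs; query() has no false
    negatives but may return false positives."""

    def __init__(self, n_bits=1 << 16, k=5):
        self.n, self.k = n_bits, k
        self.bits = bytearray(n_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.n

    def insert(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.insert("massachusetts")
print(bf.query("massachusetts"), bf.query("amherst"))  # True, (almost surely) False
```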
9/17 |
Tue |
Distinct elements continued. Flajolet-Martin and HyperLogLog. Jaccard similarity for audio fingerprinting, document comparison, etc. The median trick. |
Slides. The 2007 paper introducing the popular HyperLogLog distinct elements algorithm. Chapter 4 of Mining of Massive Datasets, with content on Bloom filters and distinct item counting. |
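A toy version of the Flajolet-Martin idea plus the median trick in Python (my sketch, using salted SHA-256 as a stand-in for an ideal random hash into [0, 1)):

```python
import hashlib
import statistics

def _uhash(salt, x):
    """Deterministic 'random' hash of x into [0, 1), salted per copy."""
    d = hashlib.sha256(f"{salt}:{x}".encode()).digest()
    return int.from_bytes(d[:8], "big") / 2**64

def fm_estimate(stream, k=25):
    """Idealized Flajolet-Martin: with d distinct items, the minimum hash
    value is about 1/(d+1), so 1/min - 1 estimates d. Taking the median
    of k independent copies (the median trick) boosts the confidence."""
    mins = [1.0] * k
    for x in stream:
        for salt in range(k):
            mins[salt] = min(mins[salt], _uhash(salt, x))
    return statistics.median(1.0 / m - 1.0 for m in mins)

stream = [i % 1000 for i in range(20_000)]  # 1,000 distinct elements
print(fm_estimate(stream))  # roughly 1000
```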
9/19 |
Thu |
Jaccard similarity search with MinHash. Locality sensitive hashing and nearest neighbor search. |
Slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing. |
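A small Python sketch (mine) of MinHash-based Jaccard estimation; it uses salted SHA-256 in place of the random permutations/hash functions from class:

```python
import hashlib

def _minhash_sig(items, k=100):
    """MinHash signature: for each of k salted hash functions, keep the
    minimum hash value over the set."""
    sig = []
    for salt in range(k):
        sig.append(min(
            int.from_bytes(hashlib.sha256(f"{salt}:{x}".encode()).digest()[:8], "big")
            for x in items))
    return sig

def jaccard_estimate(a, b, k=100):
    """Pr[min-hashes agree] equals the Jaccard similarity |A∩B|/|A∪B|,
    so the fraction of agreeing signature coordinates estimates it."""
    sa, sb = _minhash_sig(a, k), _minhash_sig(b, k)
    return sum(x == y for x, y in zip(sa, sb)) / k

A = set(range(0, 100))
B = set(range(50, 150))
print(jaccard_estimate(A, B))  # true Jaccard similarity is 50/150 ≈ 0.33
```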
9/24 |
Tue |
Finish up MinHash and LSH. SimHash for cosine similarity. |
Slides. Reading: Chapter 3 of Mining of Massive Datasets, with content on Jaccard similarity, MinHash, and locality sensitive hashing. |
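A short SimHash sketch in Python (mine): each bit records which side of a random Gaussian hyperplane the vector lies on, and Pr[bits agree] = 1 − angle(u, v)/π:

```python
import numpy as np

def simhash(v, planes):
    """One bit per random hyperplane: which side of the hyperplane v is on."""
    return planes @ v >= 0

rng = np.random.default_rng(0)
d, k = 100, 256
planes = rng.standard_normal((k, d))  # k random Gaussian hyperplanes

u = rng.standard_normal(d)
v = u + 0.5 * rng.standard_normal(d)  # a noisy copy of u

# The fraction of disagreeing bits estimates angle(u, v) / pi.
est_angle = np.mean(simhash(u, planes) != simhash(v, planes)) * np.pi
true_angle = np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
print(est_angle, true_angle)  # should be close
```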
9/26 |
Thu |
The frequent elements problem. Misra-Gries summaries. Count-min sketch. |
Slides. Reading: Notes (Amit Chakrabarti at Dartmouth) on streaming algorithms. See Chapters 2 and 4 for frequent elements. Some more notes on the frequent elements problem. A website with lots of resources, implementations, and example applications of count-min sketch. |
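A compact count-min sketch in Python (my sketch, again with salted SHA-256 standing in for the pairwise independent hash functions from class):

```python
import hashlib
import numpy as np

class CountMinSketch:
    """Count-min sketch: d rows of w counters, one independent hash per
    row. Estimates are biased upward (collisions only add counts), so
    the minimum over the rows is the estimate."""

    def __init__(self, w=1000, d=5):
        self.w, self.d = w, d
        self.table = np.zeros((d, w), dtype=np.int64)

    def _cols(self, item):
        for row in range(self.d):
            h = hashlib.sha256(f"{row}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.w

    def add(self, item, count=1):
        for row, col in enumerate(self._cols(item)):
            self.table[row, col] += count

    def estimate(self, item):
        return min(self.table[row, col] for row, col in enumerate(self._cols(item)))

cms = CountMinSketch()
for i in range(10_000):
    cms.add(i % 100)       # each of 100 items appears 100 times
print(cms.estimate(7))     # always >= 100, and close to 100 w.h.p.
```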
10/1 |
Tue |
Randomized dimensionality reduction and the Johnson-Lindenstrauss lemma. Applications to regression, clustering. |
Slides: Compressed/cleaned up, Raw from class. Reading: Chapter 2.7 of Foundations of Data Science on the Johnson-Lindenstrauss lemma. Notes on the JL-Lemma (Anupam Gupta CMU). Linear Algebra Review: Khan academy. |
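A minimal demonstration of the JL guarantee in Python (mine, using the scaled Gaussian construction from class on synthetic data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 50, 10_000, 400            # n points in d dims, target dimension k

X = rng.standard_normal((n, d))      # data matrix, one point per row
Pi = rng.standard_normal((d, k)) / np.sqrt(k)  # JL map: scaled Gaussian
Y = X @ Pi                           # projected points

# Compare a few pairwise distances before and after projection; the JL
# lemma says all are preserved to 1 +/- eps for k = O(log(n)/eps^2).
for i, j in [(0, 1), (2, 3), (4, 5)]:
    orig = np.linalg.norm(X[i] - X[j])
    proj = np.linalg.norm(Y[i] - Y[j])
    print(f"pair ({i},{j}): ratio = {proj / orig:.3f}")  # near 1.0
```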
10/3 |
Thu |
Finish up JL Lemma. |
Slides. The Fast JL transform: speeding up random projection with the Fast Fourier transform. Sparse random projections, which can be applied more quickly. JL-type random projections for the l1 norm using Cauchy instead of Gaussian random matrices. |
Spectral Methods |
10/8 |
Tue |
Principal component analysis, low-rank approximation, dimensionality reduction. |
Slides: Compressed/cleaned up, Raw from class. Reading: Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. |
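A small Python/NumPy sketch (mine, on synthetic data) of rank-k approximation via the SVD, which is optimal by the Eckart-Young theorem:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data close to rank 5: a random rank-5 matrix plus small noise.
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
A += 0.01 * rng.standard_normal(A.shape)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 5
A_k = (U[:, :k] * s[:k]) @ Vt[:k]    # best rank-k approximation (Eckart-Young)
print(np.linalg.norm(A - A_k) / np.linalg.norm(A))  # small relative error

# PCA view: rows of Vt[:k] are the top principal directions (after
# centering A), and A @ Vt[:k].T gives the k-dimensional embedding.
Z = A @ Vt[:k].T
print(Z.shape)  # (200, 5)
```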
10/10 |
Thu |
Eigendecomposition and its application to PCA and low-rank approximation. |
Slides: Cleaned up, Raw from class. Reading: Some notes on PCA and its connection to eigendecomposition (Roughgarden and Valiant at Stanford). |
10/15 |
Tue |
No Class, Monday Schedule. |
|
10/17 |
Thu |
Midterm (In Class) |
Study guide and review questions. |
10/22 |
Tue |
The singular value decomposition and its connection to eigendecomposition/PCA/low-rank approximation. Applications of low-rank approximation beyond compression. |
Slides (raw from class). Unannotated slides. Reading: Chapter 3 of Foundations of Data Science and Chapter 11 of Mining of Massive Datasets on low-rank approximation and the SVD. Some notes on the SVD and its connection to PCA (Roughgarden and Valiant at Stanford). |
10/24 |
Thu |
Linear algebraic view of graphs. Applications to spectral clustering, community detection, network visualization. |
Slides (raw from class). Unannotated slides. Reading: Chapter 10.4 of Mining of Massive Datasets on spectral graph partitioning. Great notes on spectral graph methods (Roughgarden and Valiant at Stanford). |
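A toy spectral partitioning example in Python (mine): on a random graph with two planted communities, the sign pattern of the Laplacian's second eigenvector (the Fiedler vector) should recover the communities:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40  # two planted communities: nodes 0-19 and 20-39
A = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        p = 0.5 if (i < 20) == (j < 20) else 0.05  # dense within, sparse across
        if rng.random() < p:
            A[i, j] = A[j, i] = 1

L = np.diag(A.sum(axis=1)) - A        # unnormalized graph Laplacian D - A
eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
fiedler = eigvecs[:, 1]               # eigenvector of 2nd-smallest eigenvalue

# Partitioning by sign should recover the communities (up to relabeling).
print((fiedler < 0).astype(int))
```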
10/29 |
Tue |
Spectral graph theory, spectral clustering, and community detection continued. The stochastic block model. |
Slides (raw from class). Unannotated slides. Reading: Chapter 10.4 of Mining of Massive Datasets on spectral graph partitioning. For a lot more interesting material on spectral graph methods see Dan Spielman's lecture notes. |
10/31 |
Thu |
Finish up stochastic block model. Computing the SVD: power method, Krylov methods. |
Slides (raw from class). Unannotated slides. Reading: Chapter 3.7 of Foundations of Data Science on the power method for SVD. Some notes on the power method (Roughgarden and Valiant at Stanford). |
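A bare-bones power method in Python (my sketch): iterating v ← AᵀAv with re-normalization converges to the top right singular vector of A, given a random start and a singular value gap:

```python
import numpy as np

def power_method(A, iters=200, seed=0):
    """Power method for the top right singular vector of A: repeatedly
    apply A^T A to a random start vector and re-normalize."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = A.T @ (A @ v)
        v /= np.linalg.norm(v)
    return v

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 50))
v = power_method(A)
_, _, Vt = np.linalg.svd(A)
print(abs(v @ Vt[0]))  # close to 1: v aligns with the top right singular vector
```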
11/5 |
Tue |
Class Cancelled. |
|
11/7 |
Thu |
Finish up power method. Connection to random walks and Markov chains. |
Slides (raw from class). Unannotated slides. |
Optimization |
11/12 |
Tue |
Gradient descent and analysis for convex functions, example applications. |
Slides (raw from class). Unannotated slides. Reading: Chapters I and III of these notes (Hardt at Berkeley). Multivariable calc review, e.g., through: Khan academy. |
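A minimal gradient descent example in Python (mine): minimizing the least squares objective f(x) = ||Ax − b||², whose gradient is 2Aᵀ(Ax − b), with a step size set from the smoothness of f:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Gradient descent on f(x) = ||Ax - b||^2 with gradient 2 A^T (Ax - b).
# Step size 1/(2 ||A||_2^2) is safely below 2/L for smoothness L = 2 ||A||_2^2.
eta = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)
x = np.zeros(d)
for _ in range(500):
    x -= eta * 2 * A.T @ (A @ x - b)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x - x_star))  # small: close to the least squares solution
```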
11/14 |
Thu |
Finish gradient descent. Projected gradient descent. |
Slides (raw from class). Unannotated slides. Chapters I and III of these notes (Hardt at Berkeley). |
11/19 |
Tue |
Stochastic gradient descent for large scale learning. Analysis via online gradient descent. |
Slides (raw from class). Unannotated slides. Reading: Short notes proving the regret bound for online gradient descent. A good book (by Elad Hazan) on online optimization, including online gradient descent and its connection to stochastic gradient descent. Note that the analysis is close to, but slightly different from, what was covered in class. |
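A matching SGD sketch in Python (mine): each step uses the gradient of a single randomly chosen term of the least squares sum, with a decaying step size in the spirit of the online analysis (the constants here are ad hoc):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# SGD for least squares: at each step, use the gradient of one random
# term (a_i^T x - b_i)^2 instead of the full sum over all n terms.
x = np.zeros(d)
for t in range(1, 50_000):
    i = rng.integers(n)
    grad_i = 2 * (A[i] @ x - b[i]) * A[i]  # unbiased estimate of grad f(x)/n
    x -= (0.01 / np.sqrt(t)) * grad_i      # decaying step size

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x - x_star))  # small: SGD approaches the least squares solution
```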
11/21 |
Thu |
Finish up SGD. Gradient descent for least squares regression. Connections to advanced techniques: variance reduction, accelerated methods, adaptive gradient methods. |
Slides (raw from class). Unannotated slides. |
11/26 |
Tue |
No Class, Thanksgiving Recess. |
|
11/28 |
Thu |
No Class, Thanksgiving Recess. |
|
Assorted Topics |
12/3 |
Tue |
High-dimensional geometry, curse of dimensionality. |
Slides (raw from class). Unannotated slides. Reading: Chapters 2.3-2.6 of Foundations of Data Science on high-dimensional geometry. |
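A quick simulation (mine) of one curse-of-dimensionality phenomenon from this lecture: pairwise distances between random points concentrate around their mean as the dimension grows:

```python
import numpy as np

rng = np.random.default_rng(0)
for d in [2, 10, 100, 1_000, 10_000]:
    X = rng.standard_normal((100, d)) / np.sqrt(d)  # 100 random points
    dists = [np.linalg.norm(X[i] - X[j])
             for i in range(100) for j in range(i + 1, 100)]
    # The relative spread of distances shrinks as d grows: all points
    # become nearly equidistant.
    print(d, np.std(dists) / np.mean(dists))
```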
12/5 |
Thu |
Compressed sensing, sparse recovery. |
Slides (raw from class). Unannotated slides. |
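To make sparse recovery concrete, here is a small Python sketch (mine) using iterative hard thresholding, a simple recovery algorithm distinct from the basis pursuit approach; it assumes the Gaussian measurement matrix satisfies a restricted isometry property, which holds with high probability at these sizes:

```python
import numpy as np

def iht(A, b, s, iters=100):
    """Iterative hard thresholding: take a gradient step on ||Ax - b||^2,
    then keep only the s largest-magnitude entries. Converges to the
    sparse signal when A satisfies a restricted isometry property."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + A.T @ (b - A @ x)           # gradient step (step size 1)
        small = np.argsort(np.abs(x))[:-s]  # all but the s largest entries
        x[small] = 0                        # hard threshold to s-sparse
    return x

rng = np.random.default_rng(0)
n, d, s = 100, 400, 5                         # n measurements, s-sparse signal
A = rng.standard_normal((n, d)) / np.sqrt(n)  # columns have (near) unit norm
x_true = np.zeros(d)
x_true[rng.choice(d, s, replace=False)] = rng.standard_normal(s)
b = A @ x_true                                # noiseless linear measurements

print(np.linalg.norm(iht(A, b, s) - x_true))  # near 0 when recovery succeeds
```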
12/10 |
Tue |
Finish up sparse recovery and basis pursuit. Class wrap-up. |
Slides (raw from class). Unannotated slides. |
12/19 |
Thu |
Final (10:30am-12:30pm in Thompson 104) |
Study guide and review questions. |