The following are draft chapters from a forthcoming book on linear sketching and data stream algorithms. Please email any corrections and comments. The files are password-protected so also email me for a password.
Chapter | Last Updated | Comments |
Introduction | N/A | N/A |
Algorithms for Signals | 10/27/13 | N/A |
Algorithms for Matrices | N/A | N/A |
Algorithms for Graphs (see here for a related survey on graph streams) | 1/16/14 | N/A |
Algorithms for Geometric Problems | N/A | N/A |
Algorithms for Sequences | N/A | N/A |
Lower Bounds and Communication Complexity | N/A | N/A |
Sliding Windows | N/A | N/A |
Stochastic Streams | N/A | N/A |
Distributed Streams | N/A | N/A |
- Numbers:
- Section 1-1: Sampling
- Section 1-2: Intro to Sketches: F0 and F2
- Section 1-3: Multi-Purpose Sketches: Count-Min, Count-Sketch for Heavy Hitters, Range Queries, Quantiles
- Section 1-4: Sampling via Sketches: Lp sampling and Applications
- Section 1-5: Approximate Representations and Signal Reconstruction
- Associated Papers:
- An Improved Data Stream Summary: The Count-Min Sketch and its Applications (Cormode, Muthukrishnan)
- The space complexity of approximating the frequency moments (Alon, Matias, Szegedy)
- Streaming Algorithms from Precision Sampling (Andoni, Krauthgamer, Onak)
- Tight Bounds for Lp Samplers, Finding Duplicates in Streams, and Related Problems (Jowhari, Saglam, Tardos)
- Graphs:
- Section 2-1: Connectivity, k-connectivity, Spanners, Sparsification
- Section 2-2: Graph Sketches
- Section 2-3: Matchings and Bipartiteness
- Associated Papers:
- Analyzing Graph Structure via Linear Measurements (Ahn, Guha, McGregor)
- On Graph Problems in a Semi-streaming Model (Feigenbaum et al.)
- Finding Matchings in the Streaming Model (McGregor)
- Clustering and Geometry:
- Section 3-1: Coresets and Clustering
- Section 3-2: Gridbased Sketching
- Associated Papers:
- Sequences:
- Section 4-1: Longest Increasing Sequences and DYCK
- Section 4-2: Memory Checking
- Associated Papers:
- Recognizing well-parenthesized expressions in the streaming model (Magniez, Mathieu, Nayak)
- Information Cost Tradeoffs for Augmented Index and Streaming Language Recognition (Chakrabarti, Cormode, Kondapally, McGregor)
- Estimating the Sortedness of a Data Stream (Gopalan, Jayram, Krauthgamer, Kumar)
- Lower Bounds:
- Section 5-1: Basic Connections, Examples, Hamming Distance
- Section 5-2: Information Statistics and Disjointness
- Associated Papers:
- An information statistics approach to data stream and communication complexity (Bar-Yossef, Jayram, Kumar, Sivakumar)
- The One-Way Communication Complexity of Hamming Distance (Jayram, Kumar, Sivakumar)
- Special Topics:
- Section 6-1: Sliding Windows
- Section 6-2: Distributed Streams and Functional Monitoring
- Section 6-3: Stochastic Streams
- Associated Papers:
- Selection and Sorting with Limited Storage (Munro and Paterson)
- Smooth Histograms for Sliding Windows (Braverman and Ostrovsky)
- Optimal Random Sampling from Distributed Streams Revisited (Tirthapura and Woodruff)
- Algorithms for Distributed Functional Monitoring (Cormode, Muthukrishnan, Yi)