Below is a detailed list of topics covered in CS 688 in 2022. The
actual topics covered this year will be similar but may change. References to
outside materials (MLPP, PGM, Domke) refer to the resources listed on
the main page.
Introduction
- Course Intro ([MLPP] Ch 1 or [PGM] p. 1-12)
- Random variables and joint distributions ([MLPP] p. 27-30 or [PGM]
19-22)
- Slide sources: Domke Lecture 1
- Probability Calculus: Marginalization, Conditioning, Chain Rule,
Bayes’ Rule. ([MLPP] 29-30 or [PGM] p. 21-22)
- Marginal and Conditional Independence ([MLPP] p. 31-32 or [PGM]
p. 23-25)
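The probability-calculus rules listed above (marginalization, conditioning, Bayes' rule) can be illustrated with a tiny worked example; the numbers below are illustrative, not from the course materials.

```python
# Hedged sketch: Bayes' rule for a binary "disease/test" example.
# All probabilities here are made-up illustrative values.
p_d = 0.01           # prior P(D = 1)
p_t_given_d = 0.95   # P(T = 1 | D = 1)
p_t_given_nd = 0.05  # P(T = 1 | D = 0)

# Marginalization: P(T = 1) = sum over both values of D.
p_t = p_t_given_d * p_d + p_t_given_nd * (1 - p_d)

# Conditioning via Bayes' rule: P(D = 1 | T = 1).
p_d_given_t = p_t_given_d * p_d / p_t
print(round(p_d_given_t, 3))  # 0.161
```

Note how the posterior stays small despite the accurate test, because the prior dominates.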
Bayesian networks
- Bayesian Network Representation ([MLPP] p. 309-330 or [PGM]
p. 51-63)
- See also Domke Lecture 1 and Lecture 2
- Bayes net Markov Properties ([PGM] 69-76)
- The Bayes Ball algorithm
- Using independence to simplify queries
- KL Divergence ([PGM] 699, [MLPP] 58)
- The Likelihood Function ([PGM] 719, [MLPP] 69)
- Maximum Likelihood Learning ([PGM] 719-722, [MLPP] 71)
- Bernoulli Example ([PGM] 718-719)
- The standard parameterization for discrete Bayesian networks. ([PGM]
p. 725-726)
- The decomposition of the likelihood function over the network.
([PGM] p. 723-725)
- Maximum likelihood estimation for individual CPTs. ([PGM]
p. 726)
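For the maximum-likelihood items above: because the likelihood decomposes over the network, the MLE for a Bernoulli parameter and for each discrete CPT entry reduces to normalized counts. A minimal sketch, with illustrative data:

```python
# Hedged sketch: closed-form MLE by counting (data values are illustrative).
from collections import Counter

# Bernoulli example: theta_MLE = (number of 1s) / (number of samples).
data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
theta_mle = sum(data) / len(data)
print(theta_mle)  # 0.7

# One CPT entry: P(X = x | Pa = u) = count(u, x) / count(u).
pairs = [(0, 1), (0, 1), (1, 1), (0, 0), (1, 0), (1, 0)]  # (parent u, child x)
counts = Counter(pairs)
parent_counts = Counter(u for u, _ in pairs)
p_x1_given_u1 = counts[(1, 1)] / parent_counts[1]  # 1 of 3 cases with u = 1
print(p_x1_given_u1)
```

This is exactly the decomposition cited above: each CPT is estimated independently from the counts relevant to it.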
Markov networks
- Undirected graphs and factors ([PGM] p. 103-104, 106-108, [MLPP]
663-669)
- Joint distribution and partition function ([PGM] p. 105, 108, [MLPP]
663-669)
- Gibbs distribution ([PGM] p. 108, [MLPP] 667-668)
- Markov properties ([PGM] p. 114-120, [MLPP] 664)
- Ising Model ([PGM] p. 124-127, [MLPP] 670)
- Conditional random fields
- Bayesian networks as Markov networks. ([PGM] 134-137, [MLPP]
664)
- Conditioning and the factor reduction algorithm ([PGM] 110-112)
- Efficient factor product sums ([PGM] 296)
- Variable elimination examples for chains and trees ([PGM] 292-296,
[MLPP] 709-712)
- Domke notes on Message Passing: Lecture 9, Lecture 10
- Exponential Families: Domke Lecture 11 and Lecture 12 notes
- Conditional and Hidden Exponential Families: Domke notes 12 and 13
- Asymptotic distribution of the maximum-likelihood estimator and the
Cramér-Rao bound: Domke notes 14 and 15
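The variable elimination topic above can be made concrete on a tiny chain: eliminating variables one at a time gives the same partition function as brute-force summation, at lower cost. A minimal sketch with illustrative factor values:

```python
# Hedged sketch: variable elimination on a 3-node chain A - B - C with
# two pairwise factors; all factor values are illustrative.
import itertools
import numpy as np

phi_ab = np.array([[2.0, 1.0], [1.0, 3.0]])  # factor over (A, B)
phi_bc = np.array([[1.0, 2.0], [4.0, 1.0]])  # factor over (B, C)

# Brute force: sum the unnormalized product over all 2^3 assignments.
Z_brute = sum(phi_ab[a, b] * phi_bc[b, c]
              for a, b, c in itertools.product([0, 1], repeat=3))

# Variable elimination: sum out A to get a message tau(B), then finish.
tau_b = phi_ab.sum(axis=0)              # eliminate A
Z_ve = (tau_b[:, None] * phi_bc).sum()  # eliminate B, then C

print(Z_brute, Z_ve)  # identical values
```

On a chain of length n this replaces an exponential sum with n local factor-product sums, which is the point of the [PGM] 292-296 examples.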
Markov chain Monte Carlo and Bayesian Inference
- Markov chain definition ([PGM] 507, [MLPP] 591)
- State sequences and sampling ([PGM] 508)
- The t-step distribution ([PGM] 508)
- Convergence and the stationary distribution ([PGM] 508-511, [MLPP]
598-606)
- See also Domke Lecture 15 and Lecture 16
- Limiting distributions
- Stationary distributions
- Detailed Balance ([PGM] 515-516)
- Designing Markov chains for a distribution P(X) ([PGM] 511-512)
- Proof of convergence of the Gibbs sampler to P(X) ([PGM]
512-514)
- The Metropolis-Hastings Algorithm ([PGM] 514-518, [MLPP]
850-851)
- Demo of the random-walk Metropolis-Hastings sampler
- Heuristics for assessing convergence
- Autocorrelation analysis
- Hamiltonian Monte Carlo
- Reading: Sections 5.1–5.3 of “MCMC Using Hamiltonian Dynamics” by
Radford Neal
- Bayes' rule as updating beliefs in light of evidence ([PGM] 738)
- Application of Bayesian inference to learning ([PGM] 737-741)
- Bayesian inference and the Beta/Bernoulli model ([PGM] 733-737)
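The random-walk Metropolis-Hastings demo mentioned in the list above can be sketched in a few lines; the target, step size, burn-in length, and seed below are illustrative choices, not the course's demo.

```python
# Hedged sketch: random-walk Metropolis-Hastings targeting N(0, 1).
import math
import random

random.seed(0)

def log_p(x):
    # Unnormalized log density of the standard normal target.
    return -0.5 * x * x

x, samples = 0.0, []
for _ in range(20000):
    prop = x + random.gauss(0.0, 1.0)  # symmetric random-walk proposal
    # Accept with probability min(1, p(prop)/p(x)); the symmetric
    # proposal density cancels from the Metropolis-Hastings ratio.
    if math.log(random.random()) < log_p(prop) - log_p(x):
        x = prop
    samples.append(x)

burned = samples[5000:]  # crude burn-in heuristic
mean = sum(burned) / len(burned)
print(round(mean, 2))    # close to 0 for the N(0, 1) target
```

Discarding an initial burn-in segment and checking the sample mean against the known target mean is one of the simple convergence heuristics listed above; autocorrelation analysis of `samples` would be the next step.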
Variational inference
Probabilistic programming