Graphical Models
CMPSCI 691GM - Spring 2011

Instructor: Andrew McCallum <mccallum@cs.umass.edu>
TAs: Michael Wick, Sameer Singh
TA office hours: Tuesday 1-2:30pm, Wednesday 2:30-4pm (Rm 264)

News

The course enrollment limit has now been lifted to 35.  For some reason the first attempt to do this didn't work, but as of Wednesday SPIRE was allowing additional students to register.

It is likely that this course will count toward the "core AI" requirements as a second or third core after 683; we expect to know for sure after the first faculty meeting of the semester.

Syllabus

Subject to adjustment.
Thu Jan 20 Motivations and background for graphical models. Joint distributions and the curse of dimensionality. Foundations, random variables, Bayesian vs Frequentist, chain rule, Bayes rule, independence and conditional independence, properties of independence. Compressing joint probability distributions. Student survey. Course overview and administrative details. Read chapters 1-2.  Slides.  (Tuesday Jan 18 snow day.)
Tue Jan 25 Directed graphical model representation.  Recipe for a Bayesian network.  Graphs, representing independence.  Example models.  Causal structure.  Queries and reasoning patterns, explaining away.  Naive Bayes.  Representation theorem.  Reading independence from the graph, V-structures, D-separation, Bayes Ball method.  I-map, I-equivalence, minimal I-map, P-map.  Read chapter 3.  Slides.
HW#1 Implement a simple directed graphical model of your own design.  We will give you some data with a handful of variables.  But you are also welcome to find and use data you choose.  Set parameters by hand or by simple estimation.  Inference exhaustively or by forward sampling.  Make up your own experiments: answer some queries; try different structures; try varying amounts of training data.  If you have worked with graphical models before, try something new, e.g. if you worked with discrete variables before, try continuous variables.  For the very ambitious: try some simple Bayesian network structure learning; see if you can recover the Bayesian network structure that we used to generate the data!  Due Thursday February 3.
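For those who want a starting point, here is a minimal sketch of forward sampling in a made-up three-variable network (the structure and probabilities below are illustrative only, not the data we hand out):

    import random

    # Toy Bayesian network: Rain -> WetGrass <- Sprinkler, all binary.
    # The CPTs are invented for illustration.
    P_rain = 0.2
    P_sprinkler = 0.4
    P_wet = {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.9, (1, 1): 0.98}  # P(Wet=1 | Rain, Sprinkler)

    def forward_sample():
        # Sample each variable in topological order, parents before children.
        rain = int(random.random() < P_rain)
        sprinkler = int(random.random() < P_sprinkler)
        wet = int(random.random() < P_wet[(rain, sprinkler)])
        return rain, sprinkler, wet

    # Answer a query, P(Rain=1 | Wet=1), by keeping only samples consistent with the evidence.
    samples = [forward_sample() for _ in range(100000)]
    consistent = [s for s in samples if s[2] == 1]
    print(sum(s[0] for s in consistent) / len(consistent))

Exhaustive inference on the same query is a good sanity check: enumerate all eight joint assignments and sum.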
Thu Jan 27 Local conditional probability distributions.  Tables, deterministic, tree-shaped.  Context-specific independence.  Noisy-OR, Generalized Linear Models.  Continuous, and continuous/discrete combinations.  Conditional Bayesian networks and encapsulated CPDs.  Examples.  Read chapter 5.  Slides.  (Tuesday Feb 1 snow day.)
Thu Feb 3 Undirected graphical model representation.  Motivation, representation.  (Preceded by continuation of local conditional probability distributions.)  (See slides for Feb 8.)
HW#2
Build a simple undirected graphical model by hand, given some data with a handful of variables.  Inference exhaustively or by sampling.  Make up your own experiments: answer some queries; try different structures; try varying factor structure.  Due Feb 24 and March 1.
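As with HW#1, a minimal sketch may help you get started; the chain structure and factor values below are invented for illustration:

    from itertools import product

    # Toy pairwise Markov network over binary variables A - B - C (a chain).
    phi_AB = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
    phi_BC = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

    def unnormalized(a, b, c):
        # A Markov network defines the joint only up to the partition function Z.
        return phi_AB[(a, b)] * phi_BC[(b, c)]

    # Exhaustive inference: compute Z, then the marginal P(C=1).
    Z = sum(unnormalized(a, b, c) for a, b, c in product((0, 1), repeat=3))
    p_c1 = sum(unnormalized(a, b, 1) for a, b in product((0, 1), repeat=2)) / Z
    print(Z, p_c1)

This brute-force enumeration quickly becomes infeasible as the number of variables grows, which is the point of the later inference lectures.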
Tue Feb 8 Undirected graphical models, continued.  Representation, pairwise Markov networks, factor products and factor reduction, independence, representation theorem revisited, Hammersley-Clifford.  Markov blanket, I-maps and P-maps.  Factor graphs.  Conditional random fields.  Read chapter 4.1-4.4.  Slides.
Thu Feb 10 Directed & undirected graphical models.  Comparison of independence representation.  Moralization, Markov blanket.  Conversion between directed and undirected models.  Clique trees.  Read chapter 4.4-4.7.  Slides.
Tue Feb 15
Template-based graphical models.  Temporal models.  Plate notation. Object-relational domains.  Structural uncertainty.  (Finish directed-undirected conversion).  Read chapter 6.  Slides.
Thu Feb 17 Parameter estimation and Lagrange multipliers.  Introduction.  Simple example.  Deriving maximum likelihood for a Bernoulli.  Slides.
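In outline, the derivation covered here: maximize the log-likelihood of n_1 heads and n_0 tails (N = n_0 + n_1 flips) subject to the parameters summing to one (the notation below is ours, not the book's):

    \max_{\theta_0,\theta_1}\; n_1 \log\theta_1 + n_0 \log\theta_0
        \quad \text{subject to} \quad \theta_0 + \theta_1 = 1

    J(\theta, \lambda) = n_1 \log\theta_1 + n_0 \log\theta_0 + \lambda\,(1 - \theta_0 - \theta_1)

    \frac{\partial J}{\partial \theta_k} = \frac{n_k}{\theta_k} - \lambda = 0
        \;\Rightarrow\; \theta_k = \frac{n_k}{\lambda}
        \;\Rightarrow\; \lambda = n_0 + n_1 = N
        \;\Rightarrow\; \hat\theta_1 = \frac{n_1}{N}

The multinomial case follows the same pattern, with one count per outcome.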
Tue Feb 22 NO CLASS.  Because UMass is following a Monday schedule this day.
Thu Feb 24
Exponential family.  Exponential form, Bernoulli, Gaussian.  Entropy of a Markov network.  Entropy of a Bayesian network.  KL Divergence.  Read chapter 8.  (Guest lecture.)  Slides.
Tue Mar 1
Exact inference by variable elimination.  Basic elimination.  Variable elimination in factor graphs.  Elimination orderings.  Read chapter 9.  Slides.
Thu Mar 3
Exact inference by variable elimination, continued.  Dealing with evidence.  Complexity.  Tree width.  Introducing clique trees.  Slides.
Tue Mar 8
Clique trees.  Building a clique tree.  Message passing in clique trees.  Calibration. Sum-product message passing.  Slides.
Thu Mar 10
Belief Update Message Passing.  Calibrated clique tree as a graphical model.  Clique beliefs.  Sepset beliefs.  Slides.
HW#3
Implement exact inference by message passing.  The simplest option is forward-backward in an HMM, but better (especially if you have done this before) is to do it in more general tree structures.  For the ambitious: junction tree!
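A minimal sketch of the HMM option, with made-up parameters (two hidden states, three observation symbols); for long sequences you would rescale the forward/backward messages to avoid underflow:

    import numpy as np

    pi = np.array([0.6, 0.4])                         # initial state distribution
    A = np.array([[0.7, 0.3], [0.2, 0.8]])            # A[i, j] = P(next=j | current=i)
    B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])  # B[i, k] = P(obs=k | state=i)

    def forward_backward(obs):
        # Returns P(state_t | all observations) for every position t.
        T, S = len(obs), len(pi)
        alpha = np.zeros((T, S))
        beta = np.ones((T, S))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):                         # forward pass
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        for t in range(T - 2, -1, -1):                # backward pass
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        gamma = alpha * beta
        return gamma / gamma.sum(axis=1, keepdims=True)

    print(forward_backward([0, 2, 1, 2]))

Checking that each row of the output sums to one is a quick sanity test.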
Tue Mar 15
SPRING BREAK
Thu Mar 17
SPRING BREAK
Tue Mar 22 & Thu Mar 24
Approximate inference as optimization.  The energy functional.  Mean field approximation.  Read chapter 11.1-11.2, 11.5.1.  Slides.
Tue Mar 29
Approximate inference as optimization.  Variational inference.  Loopy belief propagation.  Generalized BP.  Tree-reweighted BP.  Read chapter 11.4.  Slides.
Thu Mar 31
Approximate inference by sampling.  Forward sampling in graphical models.  Likelihood weighting and importance sampling.  Gibbs sampling.  Blocked Gibbs sampling.  Metropolis-Hastings.  Read chapter 12.1-12.4.  Slides.
HW#4 Implement and compare two of the following approximate inference methods in your previous models: mean field, loopy BP, Gibbs sampling.
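A sketch of the Gibbs-sampling option, reusing the toy chain A - B - C from the HW#2 sketch (factor values are again invented):

    import random

    phi_AB = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
    phi_BC = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

    def unnormalized(x):
        a, b, c = x
        return phi_AB[(a, b)] * phi_BC[(b, c)]

    def gibbs(n_samples, burn_in=1000):
        # Resample one variable at a time from its conditional given all the others.
        x, counts = [0, 0, 0], [0, 0, 0]
        for it in range(burn_in + n_samples):
            for i in range(3):
                weights = []
                for v in (0, 1):
                    x[i] = v
                    weights.append(unnormalized(x))
                x[i] = int(random.random() < weights[1] / (weights[0] + weights[1]))
            if it >= burn_in:
                for i in range(3):
                    counts[i] += x[i]
        return [c / n_samples for c in counts]        # estimated marginals P(X_i = 1)

    print(gibbs(20000))

Comparing these estimates against the exhaustive marginals from the HW#2 sketch is a simple convergence check.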
Tue Apr 5
MAP Inference.  Variable elimination.  Max-product in trees.  Max-product in loopy graphs.  Randomized MAP inference.  Read chapter 13.1-13.4.  Slides.
Thu Apr 7
Parameter estimation in undirected graphical models.  Gradient algorithms.  Learning for conditional random fields.  Slides.
Tue April 12
Parameter estimation in directed graphical models with missing data.  Expectation maximization, gradient ascent.  Slides.
Thu April 14
Graphical model structure learning.  Chow-Liu.  Hill climbing.  L1 priors.  Slides.
Tue April 19
Bayesian latent variable models.  Topic models and their applications.  (Guest lecture by Hanna Wallach)  Slides.  Notes.
Thu April 21
Bayesian latent variable models.  Continued.  (Guest lecture by Hanna Wallach)
HW#5
Choice #1: Implement a topic model of your choice with inference by collapsed Gibbs sampling (see the sketch below, after Choice #2).  For the ambitious: make it a non-parametric model.
Choice #2: Implement parameter estimation for some undirected model.  Consider perhaps conditional random fields or restricted Boltzmann machines.
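For Choice #1, a compact sketch of collapsed Gibbs sampling for LDA on a toy corpus (documents, vocabulary size, topic count, and hyperparameters are all invented for illustration):

    import random
    import numpy as np

    docs = [[0, 1, 1, 2], [2, 3, 3, 4], [0, 1, 4, 4]]   # word ids per document
    V, K = 5, 2                 # vocabulary size, number of topics
    alpha, beta = 0.5, 0.1      # symmetric Dirichlet hyperparameters

    # Count tables maintained by the collapsed sampler.
    ndk = np.zeros((len(docs), K))   # topic counts per document
    nkw = np.zeros((K, V))           # word counts per topic
    nk = np.zeros(K)                 # total words per topic
    z = []                           # current topic assignment of every token

    for d, doc in enumerate(docs):   # random initialization
        z.append([])
        for w in doc:
            k = random.randrange(K)
            z[d].append(k)
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

    for it in range(200):            # collapsed Gibbs sweeps
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                k = z[d][n]          # remove this token's current assignment
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # P(z = k | everything else) is proportional to
                # (ndk + alpha) * (nkw + beta) / (nk + V * beta)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = np.random.choice(K, p=p / p.sum())
                z[d][n] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

    print(nkw)                       # topic-word counts after sampling

Dividing each row of nkw + beta by its sum gives the estimated topic-word distributions.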
Tue April 26
Bayesian non-parametrics.  Dirichlet process mixture models.  Slides.
Thu April 28
Causality.  (Guest lecture by Marc Maier)  Slides.
Tue May 3 Last Class
Review session.  Wrap up.  Discuss practice questions for final exam.

MAP Inference.  MAP inference as a linear-optimization problem.  Linear programming.  A taste of convex optimization and dual formulation.  Read chapter 13.5.

Research topics.  Semi-supervised learning.  EM-based.  Graph-based.  Margin-based.  Generalized expectation criteria.

Research topics.  Probabilistic programming.  Probabilistic databases.

Research topics.  Inference with continuous variables.  Gaussian belief propagation.  Expectation propagation.

Parameter estimation in undirected graphical models with missing data.  Gradient descent.  Deep belief networks.

Course Description

Graphical models---the marriage of probability theory and graph theory---have become the lingua franca for describing solutions to a wide array of problems, ranging from computer vision to sensor networks, natural language processing to computational biology.  The essence of their power is that they enable the compact representation and manipulation of the exponentially large probability distributions that are required to represent the uncertainty and partial observability that occur in so many real-world problems.  This course will cover (a) representation, including Bayesian and Markov networks, dynamic Bayesian networks and relational models, (b) inference, both exact and approximate, and (c) estimation of both parameters and structure of graphical models.  Rather than theory and proofs, we will focus on gathering the practical understanding necessary to create and use these models.  Thus, there will be a strong emphasis on implementing the methods we study.  Although the course is listed as a seminar, it will be taught as a regular lecture course with regular programming assignments and exams.  Students entering the class should have good programming skills and pre-existing working knowledge of probability, statistics, and algorithms.  3 credits.

Book:  Probabilistic Graphical Models: Principles and Techniques.  Daphne Koller and Nir Friedman.  2009.  E.g. Publisher, Amazon.

Philosophy:  We learn best by doing.  This course will focus on hands-on experience, and thus homeworks will consist largely of small programming assignments.  To keep them manageable and fun, they will be flexible and open-ended.  Let yourself be guided by your own interests and level of ambition.


Grading & Policies

Since this is a new course, these grading policies are subject to adjustment during the course of the semester.
50%  Programming assignments (drop lowest grade, extra credit for doing all)
10%  Quizzes
25%  Final exam
15%  Classroom participation

Homework submission: Homework is due by email attachment to gm-staff@cs.umass.edu on the date indicated on the homework assignment. Late homework submissions may be accepted in extraordinary circumstances at the discretion of the instructor, but in no case after a solution set has been handed out; accepted late submissions will incur a grading penalty.

Rescheduling exams: Exams may be taken other than at the scheduled time, but only under exceptional circumstances and then only if approved by the instructor well before the exam. Makeup exams will rarely be the same as the original exam, and will usually be all or partly oral.

Policy on Regrading: We do make every effort to ensure that your exam or assignment is graded right the first time! However, sometimes people miss things, or there can be disagreements in interpretation. If you're unhappy with the grade for a question, you need to make a written request for a regrade and to resubmit your entire exam or homework, either to one of the TAs or to the instructor. The request doesn't have to be formal and long. Simply writing on a sheet of paper "8 points were taken off question 3, but I think it's a perfectly valid answer to the question" is sufficient. Normally, the TA will regrade it. If you're still not happy, you should repeat this process, but indicate that you want the instructor to re-regrade it. The flip side of this policy: you should not e-mail grading complaints, and you can't expect assignments to be regraded "while you wait".

Academic Honesty: Your work must be your own, or that of your own team. You are encouraged to discuss problems, ideas and inspirations with other students, but the answers, the programming, the writing, and the final result that you hand in must be your own or your own team's effort. If you have questions about what is honest, please ask! You are strongly encouraged to cite your sources if you received extraordinary help from any person or text (including the Web). Department policy specifies that the penalty for cheating or plagiarism is (1) a final course grade of "F" and (2) possible referral to the Academic Dishonesty Committee. The UMass policy can be found here.

Auditing: If you are interested in auditing the course, please contact the instructor. Official auditors will be expected to complete all of the assignments, and to achieve at least a C-level performance. Anyone enrolled for audit should contact the instructor early in the semester to discuss the requirements for receiving audit credit for this course.  If the course is heavily over-enrolled, auditing may not be possible.

Attendance: Students are expected to attend each class. Attendance will not be taken directly, but absences may be noticed through occasional in-class assignments. The official means of communication for this course will be in-class announcements, though every effort will be made to ensure that important announcements go out on the course mailing list or appear on the course web pages.