CMPSCI 691GM : Graphical Models
Spring 2011
Homework #4: Approximate Inference for Undirected Graphical Models
Due Dates:
Thursday April 21, 2011: Email working source code
Tuesday April 26, 2011: Email report and revised source code
Note: In this homework we will build upon the "further fun"
task from the
previous homework.
In this homework assignment you will implement and experiment with new
inference algorithms for loopy undirected graphical models and write a
short report describing your experiences and findings. As in last
homework, we will provide you with simulated "optical word
recognition" data; however, you are welcome to find and use your own
data instead.
OCR Model
Recall from the previous assignment that the complete model score
over pairs of words w1 and w2 consists of the
following four classes of factors
- OCR Factors,
:
These factors capture the predictions of a character-based OCR system,
and hence exist between every image variable and its corresponding
character variable. The number of these factors for the pair is len(w1)+len(w2).
The value of factor between an image variable and the character
variable at position i is dependent on img(i) and char(i),
and is stored in ocr.dat file described in the last homework.
- Transition Factors,
:
Since we also want to represent the co-occurence frequencies of the
characters in our model, we add these factors between all consecutive
character variables for each word. The number of these factors of word w
is (len(w1)-1) + (len(w2)-1). The value of factor between two
character variables at positions i and i+1 is dependent
on char(i) and char(i+1). These values are given to you
in trans.dat file described in the last homework.
- Skip Factors,
:
These factors exist between pairs of image variables that belong to
the same word and have the same id. The value of this factor
depends on the characters, i.e. its value is 5.0 if the
characters are the same, and 1.0 otherwise.
- Pair-Skip Factors,
:
These factors exist between pairs of image variables that belong to
different words and have the same id. The value of this factor
depends on the characters, i.e. its value is 5.0 if the
characters are the same, and 1.0 otherwise.
An unfortunate consequence of these particular factors is that they can produce loopy-structures that make inference difficult (see the "further fun" section of Homework #3 for the types of loopy structures present in the data).
Fortunately, you now have new arrows in your quiver to help tackle
these more complex models (e.g., mean field, loopy belief propagation, Gibbs sampling, and junction tree).
Data
We will use the dataset provided in the previous homework which can be downloaded here.
Recall that the archive contains the following files:
- ocr.dat, trans.dat: Same as Homework 2 (and 3).
- data-tree.dat (and truth-tree.dat): These files are not necessarily required for this homework assignment, but are provided to test your approaches. They contain all the tree-structured instances of the OCR task. The observed dataset (data-tree.dat) consists observed images of one word on each row (same format as
before), with a empty line between pairs. The true words for this pair
of observed images are stored respectively as rows in truth-tree.dat
(with the empty lines). As before, you should iterate through both the
files together to ensure you have the true words for each pair, along
with their observed images.
- "loopy" data (...): These files contain all the loopy structures in the dataset that you will need for this homework. data-*.dat and truth-*.dat contain pairs of words for loopy graphical models (described in optional tasks), and are in the same format as data-tree.dat and truth-tree.dat.
Core Tasks (for everyone)
In the core tasks you will have the oppurtunity to implement and
evaluate two inference algorithms of your choice from the following:
loopy belief propagation, mean field, Gibbs sampling and exact
inference (message passing in a junction tree) on the loopy structures
from the dataset. You will compare these algorithms using the same
evaluation metrics you used in the previous homework (the
per-character and per-word accuracies from the maximum-marginal
predictions). Here is a check list of the core tasks:
- Implement two of the following
- Loopy belief propagation: Implement loopy belief
propagation (generalize your previous implementation of message passing to handle the loopy structures present in the dataset). Because loopy graphs do not have "leaf nodes", your implementation will not be able to start at the leaves. Instead, you will need to devise a "schedule" that determines the order that the messages are passed (for example, this schedule can be from a random ordering you impose on the variables).
- Mean field: Implement the simple mean-field algorithm from class where the approximating distribution Q is the simple completely disconnected graph (all variables are independent).
- Gibbs sampling: Implement Gibbs sampler. Recall that a Gibbs sampler works in rounds. Each round, iterate through all the character variables, and for each one, sample a value for that character conditioned on its Markov blanket. Use the counts obtained to approximate the marginals for each variable.
- Message Passing in a junction tree: as described in the "further fun" of Homework #3.
- Experiments: Compare the two algorithms you implement in terms of accuracy and run time (over the three datasets) as described below. You only need to include two of the following four algorithms in your table.
| Algorithm |
Per-char accuracy (max marginal) |
Per-word accuracy (max marginal) |
Running time |
| Exact (Junction tree + MP) |
| Loopy BP |
| Mean field |
| Gibbs sampling |
Note that the approximate inference algorithms work in "rounds". For example, one round of Gibbs sampling is a complete sweep through each variable, one round of loopy BP is one complete execution of your message passing schedule, and mean-field updates a single variable at a time. Please discuss in your report how "long" you ran each algorithm for (plotting accuracy versus rounds will be helpful) and what stopping criteria you used. The stopping criteria could be a heuristic based on some measurement of convergence that you devise, or you might use trial-and-error to determine an appropriate number of rounds to run the algorithm. Please describe either the stopping criteria you used or the number of rounds you ran each algorithm for.
- Further fun: use your own creativity to further explore the space of approximate inference algorithms. Suggestions provided below.
Further Fun
- Implement and compare against a third inference algorithm, some suggestions are given below.
- Variants on belief propagation: For example, try implementing
residual
belief propagation, which computes "residuals" and uses them as a
heuristic to schedule messages in an adaptive manner. Alternatively,
we encourage you to try your own scheduling schemes and compare them.
- Generalized mean field Recall that the mean field derived in
class was for an approximating distribution where all the variables
are independent of eachother. Generalized mean field gives us the
power to pick any distribution Q that allows us to do tractable
inference. In particular, we can choose Q to be a linear
chain. Implement a generalized mean field algorithm where Q is either
a chain or some other distribution of your choice.
- Blocked-Gibbs sampling Instead of sampling a single variable at
a time, sample two or more variables at a time. Compare different
sampling strategies.
- Metropolis-Hastings Metropolis-Hastings is a generalization of
Gibbs sampling that enables you to customize your own proposal
distribution q from which efficient samples may be drawn. Try
implementing several different proposal distributions and compare them
with the algorithms in the core tasks. You may want to investigate
what the acceptance rate of your proposal distribution is (where the
acceptance rate it the number of proposals accepted over the total
number of proposals).
- Maximum a posteriori (MAP) Implement the MAP variants of the inference algorithms you chose in the core tasks. Compare the accuracy you obtain using the MAP configuration to the maximum marginals.
- Another task/domain: Approximate inference has allowed complex graphical models to be applied to a wide variety of problems. Apply the algorithms you implemented here to a new model on a new problem (perhaps related to your own research interests). Here is one suggestion: mean field and Gibbs sampling have been successful for inference in ising (grid) models (e.g., used for
computer vision). You might want to try the following task of
denoisifying a black-and-white image.
- Take a black and white image which we will use as the ground truth for evaluating the accuracy of our model. Next, create a noisy copy of this image by adding a small amount of random noise to each pixel (that is, randomly flip the bits of some of the pixels from a zero to one or vice-versa). The model (described below) will be designed to de-noise this image.
-
Create an undirected model as follows. There will be two sets of binary variable types (observed x and hidden y) as well as two types of factors: one between a single observed and a single hidden variable f(x,y), and the other between two hidden variables f(y,y).
Think of the image as a n by m grid. For each pixel p(i,j) create an observed variable x(i,j) indicating the observed value (zero/one) of that pixel. Also for each pixel create a hidden variable y(i,j) indicating the true value for that pixel. Each y(i,j) variable should be connected to its corresponding observation x(i,j) with a factor f(x,y). Intuitively, this factor should be set so that the model wants the true value of the pixel to agree with the observed value of the pixel. Finally, place a factor f(y,y) between pairs of y(i,j) variables so that Markov blanket of each y(i,j) variable has up to four variables: {y(i+1,j+1), y(i-1,j+1),y(i+1,j-1),y(i-1,j-1)}. These factors will take two hidden y's and output a compatibility score for them. Intuitively, this factor should be set to encourage neighboring pixels to have the same value.
- Experiment with different amounts and types of noise and try tuning the parameters of your model to denoisify the image. Evaluate accuracy (e.g., per-pixel accuracy) and report your findings.
What to hand in
The homework should be emailed to 691gm-staff@cs.umass.edu.
before 5pm Eastern time on the due date.
- You should provide a short (1-3 page) report describing your
explorations and results. Include a description of the implementation
of your model. Compare the approximate (mean-field and loopy belief propagation) to the exact inferenace algorithms on both the tree-structured and loopy-structured datasets. Comparison should include accuracy metrics and running time. Also include details of
the optional tasks you completed.
- Include the complete source code of your implementation.
Grading
The assignment will be graded on (a) core task completion and
correctness,
(b) effort and creativity in the optional extras (c) quality and
clarity of your written report.
Questions?
Please ask! Send email to 691gm-staff@cs.umass.edu or come to the
office hours. If you'd like your classmates to be able to help answer
your question, feel free to use 691gm-all@cs.umass.edu.