CMPSCI 691GM : Graphical Models
Spring 2011
Homework #2: Undirected Graphical Models

Due Dates:
Thursday February 24, 2011: Email working source code
Tuesday March 1, 2011: Email report and revised source code

In this homework assignment you will implement and experiment with undirected graphical models and write a short report describing your experiences and findings. We will provide you with simulated "optical word recognition" data; however, you are welcome to find and use your own data instead.

Optical Word Recognition

We will be studying the computer vision task of recognizing words from images. The task is usually decomposed into recognizing individual characters from their respective images (optical character recognition, OCR) and then inferring the word from those characters. However, character recognition is often a very difficult task, and since each character is predicted independently of its neighbors, the results can contain combinations of characters that are not possible in English. In this homework we will augment a simple OCR model with additional factors that capture some intuitions based on character co-occurrences and image similarities.

Undirected model for a word

The undirected graphical model for recognition of a given word is given in the figure above. It consists of two types of variables:

The model for a word w will consist of len(w) observed image ids, and the same number of unobserved character variables. For a given assignment to these character variables, the model score will be specified using three types of factors:

Given these factors, the probability of an assignment to the character variables of a word w according to our model will be given by:


where Z is the normalization constant, defined as the sum of the model score over all possible assignments to the character variables of the word.
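The normalization described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation: `log_score` is a hypothetical callable that returns the log of the model score for a tuple of characters, and the two-letter alphabet and toy potentials are invented for the example (they are not the factors from the assignment data). The sum over assignments is done with log-sum-exp so that nothing overflows or underflows.

```python
import itertools
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over an iterable of log values."""
    values = list(values)
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(log_score, alphabet, length):
    """log Z: enumerate every assignment of the given length (brute force)."""
    assignments = itertools.product(alphabet, repeat=length)
    return log_sum_exp(log_score(chars) for chars in assignments)


def log_prob(log_score, alphabet, chars):
    """log P(chars) = log score(chars) - log Z for a word of this length."""
    chars = tuple(chars)
    return log_score(chars) - log_partition(log_score, alphabet, len(chars))


def toy_log_score(chars):
    """Hypothetical log score: each 'a' contributes log 2, anything else log 1."""
    return sum(math.log(2.0) if c == "a" else 0.0 for c in chars)


if __name__ == "__main__":
    # Toy alphabet of two characters (an assumption made for this example).
    print(math.exp(log_prob(toy_log_score, "ab", "aa")))
```

With the toy score, the four assignments of length 2 have scores 4, 2, 2 and 1, so Z = 9 and the assignment "aa" has probability 4/9. The enumeration grows exponentially with word length, which is why this approach is only practical for short words.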

You can download all the data here. The archive contains the following files:

Core Tasks (for everyone)

  1. Graphical Model: Implement the graphical model containing the factors above. For any given assignment to the character variables, your model should be able to calculate the model score. Your implementation should allow switching between three models:
    1. OCR model: only contains the OCR factors
    2. Transition model: contains OCR and Transition factors
    3. Combined model: contains all three types of factors
    Note: To avoid numerical errors, we suggest that you represent the factors in log space and use sums rather than products wherever possible, so that you compute the log of the model score.
  2. Exhaustive Inference: Using the graphical model, write code to perform exhaustive inference, i.e. your code should be able to calculate the probability of any assignment of the character and image variables. To calculate the normalization constant Z for the word w, you will need to go through all possible assignments to the character variables (there will be 10^len(w) of these).
  3. Model Accuracy: Run your model on the data given in the file data.dat. For every word in the dataset, pick the assignment to character variables that has the highest probability according to the model, and treat this as the model prediction for the word. Using the truth given in truth.dat, compare the accuracy of the model predictions using the following three metrics:
    1. Character-wise accuracy: Ratio of correctly predicted characters to total number of characters
    2. Word-wise accuracy: Ratio of correctly predicted words to total number of words
    3. Average Dataset log-likelihood: For each word given in data.dat, calculate the log of the probability of the true word according to the model. Compute the average of this value for the whole dataset.
    Compare all three models described in (1) using these three metrics. Also give some examples of words that the OCR model got wrong but the Transition model fixed, and examples of words that the OCR model got wrong, the Transition model partially corrected, and the Combined model fixed completely.
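The three metrics in task 3 can be sketched as below. This is only an illustration: the function names are hypothetical, predictions and true words are assumed to be available as equal-length strings (one character per image), and the per-word log probabilities of the true words are assumed to have been computed already by your exhaustive inference code. The example words and numbers are made up and do not come from data.dat or truth.dat.

```python
def character_accuracy(predictions, truths):
    """Ratio of correctly predicted characters to total number of characters."""
    correct = sum(
        p == t
        for pred, truth in zip(predictions, truths)
        for p, t in zip(pred, truth)
    )
    total = sum(len(truth) for truth in truths)
    return correct / total


def word_accuracy(predictions, truths):
    """Ratio of words predicted entirely correctly to total number of words."""
    correct = sum(pred == truth for pred, truth in zip(predictions, truths))
    return correct / len(truths)


def average_log_likelihood(true_word_log_probs):
    """Mean over the dataset of log P(true word) under the model."""
    return sum(true_word_log_probs) / len(true_word_log_probs)


if __name__ == "__main__":
    # Made-up predictions and truths, for illustration only.
    predictions = ["tree", "seat"]
    truths = ["tree", "seas"]
    print(character_accuracy(predictions, truths))
    print(word_accuracy(predictions, truths))
    print(average_log_likelihood([-1.0, -3.0]))
```

In the made-up example, 7 of 8 characters and 1 of 2 words are correct. Running the same three functions once per model (OCR, Transition, Combined) gives the comparison the task asks for.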

Further Fun

Although not required, we hope you will be eager to experiment further with your model. Here are some ideas for additional things to try. You may come up with even more exciting ideas of your own, and we encourage that. Whatever you try, be sure to tell us what you did in your write-up.

What to hand in

The homework should be emailed to 691gm-staff@cs.umass.edu before 5pm Eastern time on the due date.

Grading

The assignment will be graded on (a) core task completion and correctness, (b) effort and creativity in the optional extras, and (c) quality and clarity of your written report.

Questions?

Please ask! Send email to 691gm-staff@cs.umass.edu or come to office hours. If you'd like your classmates to be able to help answer your question, feel free to use 691gm-all@cs.umass.edu.