CMPSCI 591N : Computational Linguistics
Spring 2006
Homework #6: Probabilistic Context Free Grammars

Out: Tuesday April 11, 2006
Due: Tuesday April 18, 2006, by 11:59pm, by email to compling@cs.umass.edu

In this homework assignment you will implement and experiment with a probabilistic (weighted) context free grammar, and write a short report about your experiences and findings.

You may begin with your context free CYK parser from homework #2. Change this implementation so that it accepts a grammar with weights (log-probabilities, or negative log-probabilities) on production rules, and implement dynamic programming so that it properly accumulates the total weights of the various parses.
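As a rough illustration of what "accumulating weights" can look like, here is a minimal sketch of a CYK parser that keeps, for each span and nonterminal, the best (maximum) sum of rule log-probabilities. It is not the course's cfg.py and not a reference solution. The toy grammar, its probabilities, and the names BINARY, LEXICAL, viterbi_cyk and build_tree are all made up for this example, and the sketch assumes the grammar is already in Chomsky normal form.

```python
import math
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (rules and numbers are invented).
# BINARY maps a right-hand side (B, C) to a list of (A, log P(A -> B C)).
BINARY = {
    ("NP", "VP"): [("S", math.log(1.0))],
    ("V", "NP"): [("VP", math.log(1.0))],
    ("Det", "N"): [("NP", math.log(0.6))],
}
# LEXICAL maps a word to a list of (A, log P(A -> word)).
LEXICAL = {
    "dogs": [("NP", math.log(0.4)), ("N", math.log(0.5))],
    "cats": [("N", math.log(0.5))],
    "the": [("Det", math.log(1.0))],
    "chase": [("V", math.log(1.0))],
}


def build_tree(back, i, j, sym):
    """Follow backpointers to rebuild the best tree for sym over words[i:j]."""
    entry = back[(i, j, sym)]
    if isinstance(entry, str):  # lexical rule: the entry is the word itself
        return (sym, entry)
    k, b, c = entry  # binary rule: split point and the two child symbols
    return (sym, build_tree(back, i, k, b), build_tree(back, k, j, c))


def viterbi_cyk(words, binary, lexical, start="S"):
    """Return (best_tree, best_log_prob), or (None, -inf) if there is no parse."""
    n = len(words)
    # chart[(i, j)][A] = best log-probability of A deriving words[i:j]
    chart = defaultdict(dict)
    back = {}
    # Width-1 spans come from lexical rules.
    for i, w in enumerate(words):
        for lhs, logp in lexical.get(w, []):
            if logp > chart[(i, i + 1)].get(lhs, -math.inf):
                chart[(i, i + 1)][lhs] = logp
                back[(i, i + 1, lhs)] = w
    # Wider spans combine two adjacent sub-spans with a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for b, lp_b in chart[(i, k)].items():
                    for c, lp_c in chart[(k, j)].items():
                        for lhs, lp_rule in binary.get((b, c), []):
                            # Log-probabilities add where probabilities multiply.
                            score = lp_rule + lp_b + lp_c
                            if score > chart[(i, j)].get(lhs, -math.inf):
                                chart[(i, j)][lhs] = score
                                back[(i, j, lhs)] = (k, b, c)
    if start not in chart[(0, n)]:
        return None, -math.inf
    return build_tree(back, 0, n, start), chart[(0, n)][start]


tree, logp = viterbi_cyk("dogs chase the cats".split(), BINARY, LEXICAL)
print(tree)
print(math.exp(logp))
```

Replacing the max with a log-sum over all ways of building a cell would give the total probability of the sentence under all parses rather than the score of the single best parse. With negative log-probabilities, the comparison flips to a minimum.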

You should then explore some parsing issue that leverages such a weighted grammar. There are several suggestions below. You are also free to show your creativity by creating your own substantive exploratory question.

Everyone should estimate the parameters of an HMM from counts (as we did in class on the board), and implement Viterbi, as described in the first bullet below. There are additional bullets below describing further optional exercises. As usual, you need not be limited by the suggestions of these extra bullets. You are free to come up with your own tasks.

Please re-check this page, as well as the homework column of the syllabus on the course Web site, for any updates and clarifications to this assignment.

Python and Data Infrastructure available

You may begin with cfg.py (available at http://www.cs.umass.edu/~mccallum/courses/cl2006/code), with your own assignment for HW#2, or with both.

As with HW#2, we are not providing training or testing data, but you may make up your own data.

Tasks

What to hand in, and how

The homework should be emailed to compling@cs.umass.edu before 11:59pm on Tuesday April 18, 2006.

In addition to writing your Python program, write a short report about your experiences. Feel free to suggest additional things you might like to do next that build on what you've done so far. This report should be clear and well-written, but needn't be long--one page is fine. Also, no need for fancy formatting. In fact, we prefer to receive this report as the body of your email. Your program can also be included in the body, or included as an email attachment.

Grading

The assignment will be graded for (a) correctness of your implementation, (b) quality/clarity of your written report, and (c) creativity, effort and success in the task(s) you choose.

Questions?

Feel free to ask! Send email to compling@cs.umass.edu, or if you'd like your classmates to be able to help answer your question, use compling-class@cs.umass.edu.