CMPSCI 585 : Introduction to Natural Language Processing
Fall 2007
Homework #6: Maximum Entropy Classifier


In this homework assignment you will implement and experiment with a maximum entropy classifier, and write a short report about your experiences and findings.

You may begin with source code provided by Prof. McCallum, or start from scratch if you prefer. If you begin with the provided code, the main task is to implement the gradient function, then train and test your classifier in whatever ways you find interesting. See the tasks below.
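As a rough illustration of what a gradient function for a maximum entropy classifier computes, here is a minimal sketch in plain Python. It does not follow the interface of the provided code: the function names, the dictionary-based weight layout, and the toy features are all assumptions made for this example. The gradient of the conditional log-likelihood for each (class, feature) weight is the observed feature count minus the count expected under the current model.

```python
import math


def class_probs(weights, features):
    """Return p(label | document) for each label under a maxent model.

    weights:  dict mapping label -> dict of feature name -> weight (hypothetical layout)
    features: dict mapping feature name -> value for one document
    """
    scores = {
        label: sum(w.get(name, 0.0) * value for name, value in features.items())
        for label, w in weights.items()
    }
    top = max(scores.values())  # subtract the max score for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    z = sum(exps.values())
    return {label: e / z for label, e in exps.items()}


def log_likelihood_gradient(weights, data):
    """Gradient of the conditional log-likelihood: observed minus expected counts.

    data: list of (features, gold_label) pairs
    """
    grad = {label: {} for label in weights}
    for features, gold in data:
        probs = class_probs(weights, features)
        for label in weights:
            indicator = 1.0 if label == gold else 0.0
            for name, value in features.items():
                delta = value * (indicator - probs[label])
                grad[label][name] = grad[label].get(name, 0.0) + delta
    return grad


# Toy data with made-up features; all-zero weights give uniform class probabilities.
weights = {"spam": {}, "ham": {}}
data = [({"viagra": 1.0}, "spam"), ({"meeting": 1.0}, "ham")]
grad = log_likelihood_gradient(weights, data)
```

With all weights at zero, each class has probability 0.5, so every gradient entry in this toy example is +0.5 where the feature occurs with its gold class and -0.5 otherwise. A quick sanity check for your own implementation is to compare it against finite differences of the log-likelihood.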

See the class slides. For significantly more detail, you might also want to see More pointers are available at

Please re-check this page, as well as the homework column of the syllabus on the course Web site, for any updates and clarifications to this assignment.

Python and Data Infrastructure available

You may begin with and which is available at

The package depends on the Python Numeric package, which you will also have to install if you don't have it already. (Numeric is deprecated in favor of NumPy, but the only version of that we could find depends on the old Numeric instead.) The package also imports MLab, but note that this is provided by the Numeric installation.

As with HW#4, we are providing training and testing data in the form of spam and ham email, but you are welcome to find your own data.


What to hand in, and how

The homework should be emailed to

In addition to writing your Python program, write a short report about your experiences. Feel free to suggest additional things you might like to do next that build on what you've done so far. This report should be clear and well-written, but needn't be long--one page is fine. There is no need for fancy formatting; in fact, we prefer to receive this report as the body of your email. Your program can also be included in the body, or sent as an email attachment.


The assignment will be graded for (a) correctness of your implementation, (b) quality/clarity of your written report, and (c) creativity, effort and success in the task(s) you choose.


Feel free to ask! Send email to, or if you'd like your classmates to be able to help answer your question, use