Date 
Topics 
Readings 
Assignments 
Tue
Sep 4 
Introduction and Overview
Welcome, motivations, what is Natural Language Processing, hands-on demonstrations. Ambiguity and uncertainty in language. The Turing test. Course outline and logistics. Questionnaire. Handout. Slides. 
JM Ch 1
Optional:
MS Ch 1, historical overview. 

Thu
Sep 6 
Regular Expressions
Chomsky hierarchy, regular languages, and their limitations. Finite-state automata. Practical regular expressions for finding and counting language phenomena. A little morphology. In-class demonstrations of exploring a large corpus with regex tools. Slides. 
JM Ch 2 
Install Python. HW#1 out: RegEx on corpora. Tools. 
Tue
Sep 11 
Programming in Python
An introduction to programming in Python. Why Python? Variables, numbers, strings, arrays, dictionaries, conditionals, iteration. The NLTK (Natural Language Toolkit), with demonstrations. Slides. 
Refer to online programming resources, and Learning Python, at your own pace. 

Thu
Sep 13 
String Edit Distance and Alignment
Key algorithmic tool: dynamic programming, first a simple example, then its use in optimal alignment of sequences. String edit operations, edit distance, and examples of use in spelling correction and machine translation. Slides. 
JM Ch 3.11
Optional extras: web 
HW#1 due.
HW#2 out: String edit distances 
Tue
Sep 18 
Context Free Grammars
Constituency, CFG definition, use and limitations. Chomsky Normal Form. Top-down parsing, bottom-up parsing, and the problems with each. The desirability of combining evidence from both directions. Slides. 
JM Ch 13.1-13.3 

Thu
Sep 20 
Non-probabilistic Parsing
Efficient CFG parsing with CYK, another dynamic programming algorithm. Also, perhaps, the Earley parser. Designing a little grammar, and parsing with it on some test data. Slides. 
JM Ch 13.4 
HW#2 due.
HW#3 out: Designing a little grammar, and parsing with CYK. 
Tue
Sep 25 
Probability
Introduction to probability theory, the backbone of modern natural language processing. Events, and counting. Joint and conditional probability, marginals, independence, Bayes' rule, combining evidence. Examples of applications in natural language. (Plus: use a little calculus!?) Slides. 


Thu
Sep 27 
Information Theory
What is information? Measuring it in bits. The "noisy channel model." The "Shannon game," motivated by language! Entropy, cross-entropy, information gain. Its application to some language phenomena. Slides. 
JM Ch 4.10-4.11 
HW#3 due. 
Tue
Oct 2 
Information Theory, continued
Including a helpful quiz. 


Thu
Oct 4 
Language modeling and Naive Bayes
Probabilistic language modeling and its applications. Markov models. N-grams. Estimating the probability of a word, and smoothing. Generative models of language. Their application to building an automatically trained email spam filter, and automatically determining the language (English, French, German, Dutch, Finnish, Klingon?). Slides. 
JM Ch 4.1-4.9 
HW#4 out: Choice: Building a spam filter, or language id 
Tue
Oct 9 
NO CLASS (This Tuesday follows UMass Monday schedule.) Optional midterm review session 5:00pm. 


Thu
Oct 11 
Midterm 


Tue
Oct 16 
Part of Speech Tagging and Hidden Markov Models
The concept of parts of speech, examples, usage. The Penn Treebank and Brown Corpus. Probabilistic (weighted) finite-state automata. Hidden Markov models (HMMs), definition and use. Slides. 
JM Ch 5 
HW#4 due. HW#5 out: Build a part-of-speech tagger. 
Thu
Oct 18 
Viterbi Algorithm for Finding Most Likely HMM Path
Dynamic programming with Hidden Markov Models, and its use for part-of-speech tagging, Chinese word segmentation, prosody, information extraction, etc. (No slides, board work only.) 
JM Ch 6.1-6.4 

Tue
Oct 23 
Probabilistic Context Free Grammars
Weighted context free grammars. Weighted CYK. Pruning and beam search. Slides. 
JM Ch 12 

Thu
Oct 25 
Parsing with PCFGs
A treebank and what it takes to create one. The probabilistic version of CYK. Also: How do humans parse? Experiments with eye tracking. Modern parsers. Slides. 
JM Ch 13 
HW#5 due Friday Oct 26.
HW#6 out: Build a Weighted PCFG for a little language. 
Tue
Oct 30 
Maximum Entropy Classifiers
The maximum entropy principle, and its relation to maximum likelihood. The need in NLP to integrate many pieces of weak evidence. Maximum entropy classifiers and their application to document classification, sentence segmentation, and other language tasks. Slides. 
JM Ch 6.6-6.7 

Thu
Nov 1 
Maximum Entropy Markov Models & Conditional Random Fields
Part-of-speech tagging, noun-phrase segmentation, and information extraction models that combine maximum entropy and finite-state machines. State-of-the-art models for NLP. Guest lecture by Greg Druck. Slides. 
JM Ch 6.8 
HW#6 due.
HW#7 out: Build and apply a maximum entropy classifier. 
Tue
Nov 6 
Lexical Semantics
Guest lecture by Chris Potts, Professor, UMass Linguistics. Slides. 
JM Ch 24, Section 1 

Thu
Nov 8 
Dirichlet Multinomial Distributions
Mathematics of Multinomial and Dirichlet distributions, Dirichlet as smoothing for multinomials. Guest lecture by David Mimno. (No slides.)

JM Ch 20
(labeled as 19) 

Tue
Nov 13 
Project Proposals
Student groups give short presentations on their project ideas. Feedback from the rest of the class. 

HW#7 due. Last HW! 
Thu
Nov 15 
Machine Translation
Probabilistic models for translating French into English. Alignment, translation, language generation. IBM Model #1. Slides. 
JM Ch 24 

Tue
Nov 20 
Machine Translation 2
IBM Model #2, and Expectation Maximization. MT evaluation. (Continuation of previous slides.) 
JM Ch 24 

Thu
Nov 22 
NO CLASS (Thanksgiving Holiday) 


Tue
Nov 27 
Unsupervised Language Discovery
Automatically discovering verb subcategorization. Slides. 

Project progress report due. 
Thu
Nov 29 
Topic Models and Language in Social Networks
Topic models. Language modeling integrated into social network analysis. (Continuation of previous slides.) 


Tue
Dec 4 
Pragmatics
Guest lecture by Chris Potts, Professor, UMass Linguistics. Slides. 
JM Ch 21.3
Selected readings. 

Thu
Dec 6 
Information Extraction & Reference Resolution
Building a database of person & company relations from 10 years of New York Times. Building a database of job openings from 70k company Web pages. Various methods, including HMMs. Models of anaphora resolution. Machine learning methods for coreference. Slides1 Slides2 
JM Ch 22
Selected readings. 
Project presentation initial writeup due. 
Tue
Dec 11 
Project Presentations
Student groups present the results of their project. 

Project presentations 
Thu
Dec 13 
Project Presentations and Wrap-up (canceled due to snow)
Student groups present the results of their project. Broad overview, ties between computer science, statistics and linguistics. Upcoming research trends and capabilities. 

Project presentations 
Mon Dec 17 
Project Presentations and Wrap-up (makeup session)
Student groups present the results of their project. 


Wed Dec 19 8am 
FINAL EXAM in Rm 140, CS Building (our regular classroom).


Project presentation final writeup due midnight Wednesday Dec 19. 