Natural Language Processing
CMPSCI 585 Home

Course description

Textbook & Resources

Syllabus & Slides

Policies & Grading

Introduction to Natural Language Processing

CMPSCI 585 — Fall 2007


JM = Jurafsky & Martin "Speech and Language Processing"
MS = Manning and Schutze "Foundations of Statistical Natural Language Processing"

Date Topics Readings


Sep 4
Introduction and Overview
Welcome, motivations, what is Natural Language Processing, hands-on demonstrations. Ambiguity and uncertainty in language. The Turing test. Course outline and logistics. Questionaire. Handout. Slides.

JM Ch 1
MS Ch 1, historical overview.

Sep 6
Regular Expressions
Chomsky hierarchy, regular languages, and their limitations. Finite-state automata. Practical regular expressions for finding and counting language phenomena. A little morphology. In class demonstrations of exploring a large corpus with regex tools. Slides.
JM Ch 2 Install Python. HW#1 out: RegEx on corpora. Tools.
Sep 11
Programming in Python
An introduction to programming in Python. Why Python? Variables, numbers, strings, arrays, dictionaries, conditionals, iteration. The NLTK (Natural Language Toolkit), with demonstrations. Slides.
Refer to online programming resources, and Learning Python, at your own pace.  

Sep 13

String Edit Distance and Alignment
Key algorithmic tool: dynamic programming, first a simple example, then its use in optimal alignment of sequences. String edit operations, edit distance, and examples of use in spelling correction, and machine translation. Slides.
JM Ch 3.11
Optional extras: web

HW#1 due.

HW#2 out: String edit distances

Sep 18
Context Free Grammars
Constituency, CFG definition, use and limitations. Chomsky Normal Form. Top-down parsing, bottom-up parsing, and the problems with each. The desirability of combining evidence from both directions. Slides.
JM Ch 13.1-13-.3  
Sep 20
Non-probabilistic Parsing
Efficient CFG parsing with CYK, another dynamic programming algorithm. Also, perhaps, the Earley parser. Designing a little grammar, and parsing with it on some test data. Slides.
JM Ch 13.4

HW#2 due.

HW#3 out: Designing a little grammar, and parsing with CYK.

Sep 25
Introduction to probability theory--the backbone of modern natural language processing. Events, and counting. Joint and conditonal probability, marginals, independence, Bayes rule, combining evidence. Examples of applications in natural language. (Plus: use a little calculus!?) Slides.
Sep 27
Information Theory
What is information? Measuring it in bits. The "noisy channel model." The "Shannon game"--motivated by language! Entropy, cross-entropy, information gain. Its application to some language phenomena. Slides.
JM Ch 4.10-4.11

HW#3 due.

Oct 2

Information Theory, continued
Including helpful a quiz.
Oct 4
Language modeling and Naive Bayes
Probabilistic language modeling and its applications. Markov models. N-grams. Estimating the probability of a word, and smoothing. Generative models of language. Their application to building an automatically-trained email spam filter, and automatically determining the language (English, French, German, Dutch, Finnish, Klingon?). Slides.
JM Ch 4.1-4.9

HW#4 out: Choice: Building a spam filter, or language id

Oct 9
NO CLASS (This Tuesday follows UMass Monday schedule.) Optional midterm review session 5:00pm.    
Oct 11




Oct 16
Part of Speech Tagging and Hidden Markov Models
The concept of parts-of-speech, examples, usage. The Penn Treebank and Brown Corpus. Probabilistic (weighted) finite state automata. Hidden Markov models (HMMs), definition and use. Slides.
JM Ch 5 HW#4 due. HW#5 out: Build a part-of-speech tagger.
Oct 18
Viterbi Algorithm for Finding Most Likely HMM Path
Dynamic programming with Hidden Markov Models, and its use for part-of-speech tagging, Chinese word segmentation, prosody, information extraction, etc. (No slides, board work only.)
JM Ch 6.1-6.4  
Oct 23
Probabilistic Context Free Grammars
Weighted context free grammars. Weighted CYK. Pruning and beam search. Slides.
JM Ch 12  
Oct 25
Parsing with PCFGs
A treebank and what it takes to create one. The probabilistic version of CYK. Also: How do humans parse? Experiments with eye-tracking. Modern parsers. Slides.
JM Ch 13

HW#5 due Friday Oct 26.

HW#6 out: Build a Weighted PCFG for a little language.

Oct 30
Maximum Entropy Classifiers
The maximum entropy principle, and its relation to maximum likelihood. The need in NLP to integrate many pieces of weak evidence. Maximum entropy classifiers and their application to document classification, sentence segmentation, and other language tasks. Slides.
JM Ch 6.6-6.7  
Nov 1
Maximum Entropy Markov Models & Conditional Random Fields
Part-of-speech tagging, noun-phrase segmentation and information extraction models that combine maximum entropy and finite-state machines. State-of-the-art models for NLP. Guest lecture by Greg Druck. Slides.
JM Ch 6.8

HW#6 due.

HW#7 out: Build and apply a maximum entropy classifier.

Nov 6
Lexical Semantics
Guest lecture by Chris Potts, Professor, UMass Linguistics. Slides.
JM Ch 24, Section 1  
Nov 8

Dirichlet Multinomial Distributions
Mathematics of Multinomial and Dirichlet distributions, Dirichlet as a smoothing for multinomials. Guest lecture by David Mimno. (No slides.)

JM Ch 20
(labeled as 19)
Nov 13
Project Proposals
Student groups give short presentations on their project idea. Feedback from the rest of class.
  HW#7 due. Last HW!
Nov 15
Machine Translation
Probabilistic models for translating French into English. Alignment, translation, language generation. IBM Model #1. Slides.
JM Ch 24  
Nov 20
Machine Translation 2
IBM Model #2, and Expectation Maximization. MT evaluation. (Continuation of previous slides.)

JM Ch 24

Nov 22
NO CLASS (Thanksgiving Holiday)    
Nov 27
Unsupervised Language Discovery
Automatically discovering verb subcategorization. Slides.
  Project progress report due.
Nov 29
Topic Models and Language in Social Networks
Topic models. Language modeling integrated into social network analysis. (Continuation of previous slides.)
Dec 4
Guest lecture by Chris Potts, Professor, UMass Linguistics. Slides.
JM Ch 21.3
Selected readings.
Dec 6
Information Extraction & Reference Resolution
Building a database of person & company relations from 10 years of New York Times. Building a database of job openings from 70k company Web pages. Various methods, including HMMs. Models of anaphora resolution. Machine learning methods for coreference. Slides1 Slides2
JM Ch 22
Selected readings.
Project presentation initial write-up due.
Dec 11
Project Presentations
Student groups present the results of their project.
  Project presentations
Dec 13
Project Presentations, and Wrap-up Canceled due to snow.
Student groups present the results of their project. Broad overview, ties between computer science, statistics and linguistics. Upcoming research trends and capabilities.
  Project presentations
Mon Dec 17 Project Presentations, and Wrap-up Make-up
Student groups present the results of their project.
Wed Dec 19 8am FINAL EXAM in Rm 140, CS Building (our regular classroom).
  Project presentation final write-up due midnight Wednesday Dec 19.