CS 585, Fall 2017, UMass Amherst
Homeworks
- HW0: due Friday, Sept 8, on Gradescope.
(If you like: hw0.tex source)
- HW1: due at 11pm on Friday, Sept 22, on Gradescope.
You will need to download two files, hw_1.py and hw_1.ipynb, plus the data file,
which is around 46 MB to download and around 210 MB when unzipped.
(For reference: html format)
- HW2: due at 11pm on Tuesday, Oct 10. (For reference: html format)
- HW3: due at 11pm on Friday, Nov 11. Download hw3.zip (26 MB).
(For reference: see html format)
- HW4: due at 11pm on Friday, Dec 15. Download hw4.zip.
(For reference: see html format)
Schedule
Reload this page to make sure you’re seeing the latest version.
Readings should be done before the indicated class.
Texts include:
- JM = Jurafsky and Martin, Speech and Language Processing, 3rd edition draft chapters
- Eis. = Eisenstein draft text
Tue 9/5 - Introduction [slides]
Tue 9/12 - N-Gram Language Models [slides]
Thu 9/14 - Classification: Naive Bayes [slides]
Tue 9/19 - Classification: Evaluation and Annotation [slides], [scan]
Thu 9/21 - Classification: Logistic regression [slides]
- In our class we'll use this perspective: logistic regression is a probabilistic model for classification.
The perceptron algorithm is one way to learn its parameters from labeled training data.
(Maximum likelihood, which we saw for LMs and NB and will see for HMMs, is a different learning method.)
- JM ch. 7, Logistic Regression
- Daume ch. 3, The Perceptron
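To make the perspective above concrete, here is a minimal sketch (not course-provided code) of a binary logistic regression model whose weights are learned with perceptron updates. The function names, the dictionary feature representation, and the toy data are all illustrative assumptions.

```python
import math

def predict_prob(weights, features):
    """P(y=1 | x) under logistic regression: sigmoid of the dot product."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

def train_perceptron(examples, epochs=10):
    """Learn weights with perceptron updates: weights change only on mistakes."""
    weights = {}
    for _ in range(epochs):
        for features, label in examples:  # label is 0 or 1
            predicted = 1 if predict_prob(weights, features) > 0.5 else 0
            if predicted != label:
                # Move weights toward the gold label's features.
                direction = 1.0 if label == 1 else -1.0
                for name, value in features.items():
                    weights[name] = weights.get(name, 0.0) + direction * value
    return weights

# Hypothetical toy sentiment data: (feature dict, label) pairs.
examples = [
    ({"good": 1.0, "bias": 1.0}, 1),
    ({"great": 1.0, "bias": 1.0}, 1),
    ({"bad": 1.0, "bias": 1.0}, 0),
    ({"awful": 1.0, "bias": 1.0}, 0),
]
weights = train_perceptron(examples)
```

The model (predict_prob) and the learning rule (train_perceptron) are separate pieces here: swapping the perceptron update for a gradient step on the log-likelihood would give maximum likelihood training of the same model.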
Thu 9/28 - Sequence Tagging: Viterbi [scan]
Tue 10/3 - Project Discussion [slides]
Tue 10/3 - Log-linear Perceptron [slides]
- Daume chapter on the perceptron (above) - esp. averaged perceptron.
- Eisenstein text, 6.5, "Discriminative sequence labeling" up to 6.5.1, "Structured Perceptron." Read earlier sections in ch. 6 as necessary.
- From 10/4 office hours: NB as log-linear
Thu 10/5 - Conditional Random Fields [slides]
- CRF example
- Eisenstein text, 6.5.3, "Conditional Random Fields"
- Optional: Eisenstein text, ch. 7, "Applications of sequence labeling"
- Optional: Chen (2012)'s blogpost on CRFs (unfortunately its equations look broken).
Tue 10/10 - No Class (Monday schedule day)
Thu 10/12 - Project proposal info [slides]
Tue 10/17 - WordNet [slides]
Thu 10/19 - Sentiment lexicons [slides]
Tue 10/24 - Midterm review [scan]
Thu 10/26 - Midterm
Tue 10/31, Thu 11/2 - Distributional Semantics [slides]
Tue 11/7 (cancelled)
Thu 11/9 - Unlabeled data in NLP [slides]
Tue 11/14 - Syntax, phrase structures I [slides]
Thu 11/16 - Syntax, phrase structures II [slides]
- Eisenstein text, 9.1-9.3, Context-free parsing
- Optional: JM Ch. 12, Syntactic parsing
Thanksgiving break - no class
Tue 11/28, Thu 11/30 - Dependency Syntax
- JM 14.1-14.4
- Optional: Eisenstein text, Ch. 10, Dependency parsing
Tue 12/5 - Coreference
Thu 12/7 - Generation: Translation and Summarization
Tue 12/12 - Poster session
3:30-6:30pm in CS room 150/151! Divided into two sessions: 3:30-5 and 5-6:30. See information on Piazza.