CS 585, Fall 2016, UMass Amherst
HW1.ipynb (word statistics): due Friday 9/16 at midnight. (also in HTML format)
HW2.zip (n-gram LMs): due Friday 9/23 at midnight. (if desired, see HTML format)
HW3.pdf (Naive Bayes): due Friday 10/7 at midnight.
Also get nb.py (starter code) and the data file, which is around 46 MB to download and around 210 MB when unzipped.
HW4: this is divided into two parts. It is much more substantial than the previous homeworks, so please get started early.
HW5.zip (distributional similarity): due Sunday 11/27. (also in HTML format)
HW6.pdf (research paper reading): due Friday 12/9.
Reload this page to make sure you’re seeing the latest version.
Readings should be done before the indicated class.
- HW0 is out. Due Friday 9/9 at 5pm on Moodle (hw0.tex if you want it). Solutions are in the Piazza "Resources" section.
Th 9/8 - Words
Tu 9/13 - N-gram Language Models
Th 9/15 - N-gram Language Models, cont'd
Tu 9/27 - Sentiment Lexicons (+ very brief neural networks intro) [slides]
Lecture is cancelled. Instead, go to the Yoshua Bengio talk in the Statistical and Computational Data Science Distinguished Lecture Series, 2-4pm, CS Building room 150/151.
Talk video here.
The makeup reading is Manning 2016, "Computational Linguistics and Deep Learning".
Th 10/6: Hidden Markov Models, Forward Algorithm [slides]
No class on Tuesday 10/11
Friday 10/14: Project proposals due
Tu 10/18: Structured Perceptron and Conditional Random Fields
Th 10/20: CRFs part 2 and project work
10/25: Syntax, Part 1 [slides]
- Exercise 5: Noun phrase CFG
- Reading: J&M 2nd ed., Ch. 12, "Formal Grammars of English," sections 12.1-12.7. This is a different edition from the one we've used for the other readings; see the Piazza resources page.
- Optional: Phillips (2003) is a great overview of syntax.
10/27: Syntax, Part 2 [slides]
- Exercise 6: CKY
- Reading: J&M 2nd ed., Ch. 13, "Parsing with Context-Free Grammars." Again, see the Piazza resources page.
Tu 11/1: Review session
Th 11/3: Midterm
11/8: Lexical Semantics [slides]
11/10: Distributional Semantics [slides]
- J&M 3rd ed.: the chapters on Vector Semantics and Semantics with Dense Vectors
Sunday 11/13: Progress report due
11/15: Word Embeddings and Neural Networks [slides]
11/17: Relationship Extraction, Neural Networks, and Matrix Factorization
Guest lecture, Haw-Shiuan Chang
No class 11/22 or 11/24: Thanksgiving
11/29: Machine Translation [slides]
(Lecture was replaced): EM and latent variable models
- EM for IBM Model 1: J&M 2nd edition, 25.6
- EM for a latent-variable Markov model: Saul and Pereira (1997), "Aggregate and mixed-order Markov models for statistical language processing"
- Optional (EM to refute Chomsky): Pereira (2000), "Formal grammar and information theory: together again?"
- Optional: Blei (2012), "Probabilistic Topic Models"
12/1: Information Extraction and Coreference [slides]
12/6: Research and summarization (Abe)
12/8: Social factors and ethics in NLP
Tu 12/13: Poster session, 2:30-4:30, room CS 150/151