CS 685, Fall 2020, UMass Amherst
Schedule
Reload this page to make sure you're seeing the latest version.
Readings should be done before watching the corresponding lecture videos.
Week 1 (8/24-28): introduction, language models, representation learning
- Course introduction // [video] // [slides]
- No associated readings or weekly quiz!
- Language modeling // [video] // [slides]
- [reading] Jurafsky & Martin, 3.1-3.5 (language modeling)
- [reading] Jurafsky & Martin, 7 (neural language models)
- HW 0 released here, due 9/4 on Gradescope
Week 2 (8/31-9/4): neural LMs, RNNs, backpropagation
Week 3 (9/7-11): attention mechanisms
- Quiz 2 released, due 9/18 on Gradescope
Week 4 (9/14-18): Transformers, transfer learning
- Transformers and sequence-to-sequence models // [video] // [slides] // [notes]
- [reading] An easy-to-read blog post on Transformers
- [reading] "Attention is All You Need": Transformers research paper (Vaswani et al., 2017)
- Transfer learning via neural language models // [stream] // [slides] // [notes]
- [reading] Deep contextualized word representations (Peters et al., 2018, "ELMo")
- [reading] Easy-to-read blog post on transfer learning in NLP
- Quiz 3 released, due 9/25 on Gradescope
Week 5 (9/21-25): BERT and how to use it for downstream tasks
- Question answering // [stream] // [slides]
- [reading] SQuAD: 100,000+ Questions for Machine Comprehension of Text (Rajpurkar et al., 2016)
- [reading] ELI5: Long Form Question Answering (Fan et al., 2019)
Week 6 (9/28-10/2): further improving transfer learning in NLP
- Quiz 4 released, due 10/9 on Gradescope
Week 7 (10/5-9): improving text generation
- Brute-force scaling of language models // [video] // [slides]
- [reading] Language models are few-shot learners: GPT-3 (Brown et al., 2020)
- [reading] Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data (Bender & Koller, 2020)
- [optional reading] Julian Michael's blog post on the Octopus Test
- [optional reading] Chris Potts' article on GPT-3 & the Bender & Koller paper
- Evaluating text generation models // [video] // [slides]
- [reading] BLEURT: robust metrics for text generation (Sellam et al., 2020)
- [reading] Do massively pretrained LMs make better storytellers? (See et al., 2019)
- Quiz 5 released, due 10/16 on Gradescope
Week 8 (10/12-16): data augmentation and collection
- Paraphrase generation // [video] // [slides]
- [reading] Neural syntactic preordering for paraphrase generation (Goyal & Durrett, 2020)
- [reading] Adversarial examples via paraphrasing (Iyyer et al., 2018)
- Crowdsourced data collection // [video] // [slides]
- [reading] Annotation artifacts in NLI (Gururangan et al., 2018)
- [reading] Adversarial Examples for SQuAD (Jia & Liang, 2017)
Week 9 (10/19-23): model distillation and retrieval-augmented LMs
- Model distillation // [video] // [slides]
- [reading] Imitation attacks on MT systems (Wallace et al., 2020)
- [reading] Thieves on Sesame Street! (Krishna et al., 2020)
- [reading] Lottery ticket hypothesis for BERT (Chen et al., 2020)
- [reading] Layer dropout for Transformers (Fan et al., 2019)
- Retrieval-augmented LMs // [video] // [slides]
- [reading] REALM: retrieval-augmented LMs (Guu et al., 2020)
- [reading] Nearest neighbor machine translation (Khandelwal et al., 2020)
Week 10 (10/26-30): Transformer implementation, vision + language
- Vision + language // [video] // [slides]
- [reading] CEREALBAR: executing instructions in situated collaborative interactions (Suhr et al., 2019)
- [reading] Visual-semantic alignments for image captioning (Karpathy & Fei-Fei, 2014)
- [reading] NLVR: A corpus of natural language for visual reasoning (Suhr et al., 2017)
Week 11 (11/2-6): exam week!
- No class Wed 11/4, prepare for your exam!
Week 12 (11/9-13): ethics and probe tasks
- Linguistic probe tasks // [video] // [slides]
- [reading] What you can cram into a single $&!#* vector (Conneau et al., 2018)
- [reading] Control tasks for probes (Hewitt & Liang, 2019)
- Quiz 6 released, due 11/20 on Gradescope
Week 13 (11/16-20): semantic parsing and commonsense reasoning