CS 685, Spring 2022, UMass Amherst
Assignments
- Homework 0 released, due 2/4 on Gradescope
- Quiz 1 released, due 2/22
- Final project proposals due 2/25 on Gradescope, use this Overleaf template
- Quiz 2 released, due 3/7
- Homework 1 released, due 3/25 on Gradescope
- Quiz 3 released, due 3/11
- Midterm released on 3/30, due 4/1 at 8AM on Gradescope
- Quiz 4 released, due 4/19
- Quiz 5 released, due 5/6
- Homework 2 released, due 5/4
- Final project reports & code due 5/12, use this Overleaf template
Schedule
Make sure to reload this page to ensure you're seeing the latest version.
Readings should be done before watching the corresponding lecture videos. See this page for materials (videos / slides / reading) from the Fall 2021 offering.
Week 1 (1/26): Introduction
- Course introduction // [video] // [slides]
- No associated readings or weekly quiz!
- HW 0 released here, due 2/4
Week 2 (1/31, 2/2): Language models
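As a sketch of the week's topic (not course material), here is a minimal count-based bigram language model; the toy corpus and add-alpha smoothing constant are illustrative choices:

```python
from collections import Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count bigrams and the contexts they condition on.
bigrams = Counter(zip(corpus, corpus[1:]))
contexts = Counter(corpus[:-1])
vocab = set(corpus)

def bigram_prob(w_prev, w, alpha=1.0):
    """P(w | w_prev) with add-alpha smoothing so unseen pairs get mass."""
    return (bigrams[(w_prev, w)] + alpha) / (contexts[w_prev] + alpha * len(vocab))

print(bigram_prob("the", "cat"))  # seen bigram: relatively high
print(bigram_prob("the", "sat"))  # unseen bigram: small but nonzero
```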
Week 3 (2/7-9): Backpropagation, attention mechanisms
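A minimal NumPy sketch of scaled dot-product attention as introduced by Vaswani et al. (2017); the shapes and random inputs are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_queries, n_keys) similarities
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V                   # weighted average of value vectors

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(attention(Q, K, V).shape)  # (4, 8): one output vector per query
```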
Week 4 (2/14-16): Transformers
- Transformer language models // [video] // [slides] // [notes]
- [reading] Vaswani et al., NeurIPS 2017 (paper that introduced Transformers)
- [reading] An easy-to-read blog post on Transformer language models
- Quiz 1 released, due 2/22
- Transformers (cont'd) and transfer learning // [video] // [notes]
- [reading] Deep contextualized word representations (Peters et al., 2018, "ELMo")
- [reading] BERT: Pre-training of Deep Bidirectional Transformers... (Devlin et al., 2019)
- [reading] Easy-to-read blog post on transfer learning in NLP
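The course does not prescribe a toolkit, but as a hedged sketch of the pretrain-then-transfer setup in the readings above, this is how a pretrained BERT encoder can be loaded for feature extraction with the Hugging Face transformers library (the model name and library choice are assumptions, not part of the syllabus):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load a pretrained bidirectional Transformer encoder (Devlin et al., 2019).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Transfer learning reuses pretrained representations.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextualized vector per wordpiece; fine-tuning would add a task
# head on top of these and update all of the weights.
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```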
Week 5 (2/22-23): Transfer learning
Week 6 (2/28-3/2): Text-to-text transfer and decoding
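No materials are linked for this week; as an illustrative (not course-provided) sketch of two decoding strategies for this topic, here is greedy argmax vs. temperature sampling over toy next-token logits:

```python
import numpy as np

rng = np.random.default_rng(0)

def next_token(logits, temperature=None):
    """Pick a next-token id: greedy if temperature is None, else sample."""
    if temperature is None:
        return int(np.argmax(logits))       # greedy decoding
    scaled = logits / temperature           # T < 1 sharpens, T > 1 flattens
    probs = np.exp(scaled - scaled.max())   # stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

logits = np.array([2.0, 1.5, 0.2, -1.0])    # toy vocabulary of 4 tokens
print(next_token(logits))                   # always token 0
print(next_token(logits, temperature=1.0))  # stochastic
```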
Week 7 (3/7-9): Prompt-based learning, evaluating text generation
- Evaluating text generation models // [video] // [slides]
- [reading] Evaluation of text generation survey (Celikyilmaz et al., 2020)
- [reading] BLEURT: robust metrics for text generation (Sellam et al., 2020)
- [optional reading] Do massively pretrained LMs make better storytellers? (See et al., 2019)
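As a hedged companion to the evaluation readings, here is the modified n-gram precision at the core of BLEU (candidate n-gram counts clipped by reference counts); the example sentences are toy data:

```python
from collections import Counter

def ngram_precision(candidate, reference, n=2):
    """Modified n-gram precision: clip candidate n-gram counts by reference counts."""
    cand = Counter(zip(*[candidate[i:] for i in range(n)]))
    ref = Counter(zip(*[reference[i:] for i in range(n)]))
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / max(sum(cand.values()), 1)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(ngram_precision(cand, ref, n=1))  # 5/6 unigram precision
print(ngram_precision(cand, ref, n=2))  # 3/5 bigram precision
```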
Week 8 (3/21-23): Multilingual LMs, retrieval-augmented LMs
- Multilingual transfer learning // [video] // [slides]
- [reading] Beyond English-centric multilingual machine translation (Fan et al., 2020)
- [reading] MAD-X: Multi-task cross-lingual transfer (Pfeiffer et al., EMNLP 2020)
- Retrieval-augmented LMs // [video] // [slides]
- [reading] REALM: retrieval-augmented LMs (Guu et al., 2020)
- [reading] Nearest neighbor machine translation (Khandelwal et al., ICLR 2021)
- [optional reading] Hurdles to progress in long-form QA (Krishna et al., NAACL 2021)
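A hedged sketch of the interpolation idea behind nearest-neighbor LMs (Khandelwal et al.): blend the base model's next-token distribution with one induced from retrieved datastore neighbors. The lambda value and uniform neighbor weighting are simplifications (the paper weights neighbors by distance):

```python
import numpy as np

def knn_lm_probs(p_lm, neighbor_next_tokens, vocab_size, lam=0.25):
    """Interpolate the LM distribution with a distribution over the
    next tokens that followed the retrieved nearest-neighbor contexts."""
    p_knn = np.bincount(neighbor_next_tokens, minlength=vocab_size).astype(float)
    p_knn /= p_knn.sum()
    return lam * p_knn + (1 - lam) * p_lm  # final next-token distribution

p_lm = np.array([0.5, 0.3, 0.1, 0.1])      # toy 4-token vocabulary
neighbors = np.array([2, 2, 1, 2])         # next tokens of retrieved contexts
print(knn_lm_probs(p_lm, neighbors, vocab_size=4))
```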
Week 9 (3/28-30): Midterm exam
- No class on 3/30, midterm to be released at 8AM on 3/30, due 8AM on 4/1 via Gradescope
Week 10 (4/4-6): Efficient Transformers, commonsense reasoning
- Efficient / long-range Transformers // [video] // [slides]
- [reading] Survey of Efficient Transformers (Tay et al., 2020)
- [reading] Routing Transformers (Roy et al., TACL 2020)
- [optional reading] Do long-range LMs use long-range context? (Sun et al., EMNLP 2021)
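Many of the models in the efficiency survey restrict attention to a local window; a hedged sketch of building such a banded mask (the window size is arbitrary):

```python
import numpy as np

def local_attention_mask(seq_len, window):
    """Boolean mask: position i may attend to j only when |i - j| <= window.
    Banded attention like this cuts cost from O(n^2) toward O(n * window)."""
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window

mask = local_attention_mask(seq_len=8, window=2)
# Applied by setting disallowed scores to -inf before the softmax:
scores = np.random.default_rng(0).normal(size=(8, 8))
masked_scores = np.where(mask, scores, -np.inf)
print(mask.astype(int))
```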
Week 11 (4/11-13): Linguistic probe tasks, vision + language
- Linguistic probe tasks (no in-class lecture!) // [video] // [slides]
- [reading] What you can cram into a single $&!#* vector (Conneau et al., 2018)
- [reading] Control probes (Hewitt & Liang, 2019)
- Vision + language (no in-class lecture!) // [video] // [slides]
- [reading] Visual-semantic alignments for image captioning (Karpathy & Fei-Fei, 2014)
- [reading] Learning... visual models from NL supervision (Radford et al., 2021, "CLIP")
- [reading] NLVR: A corpus of natural language for visual reasoning (Suhr et al., 2017)
- Quiz 4 released, due 4/19
- HW 2 released here, due 5/4
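A hedged sketch of the probing methodology from the readings above: freeze representations and fit a small linear classifier on top. Random features stand in for real LM activations here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-ins for frozen LM representations and a linguistic label (e.g. a POS tag).
X = rng.normal(size=(500, 64))
y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)

# The probe: a linear classifier trained on the frozen features only.
probe = LogisticRegression(max_iter=1000).fit(X[:400], y[:400])
# High held-out accuracy suggests the property is linearly decodable --
# or that the probe is too powerful, which is what control tasks test.
print(probe.score(X[400:], y[400:]))
```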
Week 12 (4/20): Model distillation
- Model distillation // [video] // [slides]
- [reading] Imitation attacks on MT systems (Wallace et al., 2020)
- [reading] Thieves on Sesame Street! (Krishna et al., 2020)
- [reading] Lottery ticket hypothesis for BERT (Chen et al., 2020)
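A hedged sketch of a standard distillation objective (KL divergence between temperature-softened teacher and student distributions); the temperature and toy shapes are illustrative, not taken from the readings:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) over temperature-softened distributions.
    The T*T factor keeps gradient magnitudes comparable across temperatures."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

student_logits = torch.randn(4, 10)  # toy batch of 4, vocabulary of 10
teacher_logits = torch.randn(4, 10)
print(distillation_loss(student_logits, teacher_logits))
```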
Week 13 (4/25-27): Syntactic and semantic parsing
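As an illustrative (not course-provided) sketch of the week's topic, here is CKY recognition with a toy grammar in Chomsky normal form:

```python
from collections import defaultdict

# Toy CNF grammar and lexicon (my own example).
binary = {("NP", "VP"): "S", ("Det", "N"): "NP", ("V", "NP"): "VP"}
lexicon = {"the": "Det", "dog": "N", "cat": "N", "chased": "V"}

def cky_recognize(words):
    """CKY: chart[(i, j)] holds the nonterminals spanning words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, w in enumerate(words):
        chart[(i, i + 1)].add(lexicon[w])
    for span in range(2, n + 1):                 # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):            # try every split point
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        if (left, right) in binary:
                            chart[(i, j)].add(binary[(left, right)])
    return "S" in chart[(0, n)]

print(cky_recognize("the dog chased the cat".split()))  # True
```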
Week 14 (5/2-4): Ethics in NLP & scaling laws of large LMs
- Scaling laws // [video] // [slides]
- [reading] Scaling Laws for Neural Language Models (Kaplan et al., 2020)
- [reading] Training Compute-Optimal Large Language Models (Hoffmann et al., 2022)
- [reading] PaLM: Scaling Language Modeling with Pathways (Chowdhery et al., 2022)
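A hedged numeric sketch of the parameter-count power law from Kaplan et al. (2020), L(N) = (N_c / N)^alpha_N; the constants below are the fit reported in that paper and apply only to their setup:

```python
def loss_from_params(n_params, n_c=8.8e13, alpha_n=0.076):
    """Predicted LM loss as a function of non-embedding parameter count."""
    return (n_c / n_params) ** alpha_n

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss {loss_from_params(n):.3f}")
```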