CS 685, Spring 2024, UMass Amherst
Assignments
Schedule
Reload this page to ensure you're seeing the latest version.
Readings should be done before watching the corresponding lecture videos. See this page for materials (videos / slides / reading) from the Spring 2023 offering.
Week 1 (2/5-7): introduction, language modeling
- Course introduction // [video] // [slides]
- No associated readings or weekly quiz!
- HW 0 released here, due 2/16
- Final projects:
Week 2 (2/12-14): neural language models, backpropagation
Week 3 (2/21-22): attention mechanisms and Transformers
Week 4 (2/26-28): Transformers (cont'd): architecture, pretrain/finetune
- Transformers (cont'd) // [video] // [notes]
- [reading] Deep contextualized word representations (Peters et al., 2018, "ELMo")
- [reading] BERT: Pre-training of Deep Bidirectional Transformers... (Devlin et al., 2019)
- [reading] Easy-to-read blog post on transfer learning in NLP
- BERT + Instruction tuning // [video] // [notes]
- [reading] BERT: Pre-training of Deep Bidirectional Transformers... (Devlin et al., 2019)
- [reading] Exploring the Limits of Transfer Learning... (Raffel et al., JMLR 2020, "T5")
- [reading] Instruction tuning (Wei et al., 2022, FLAN)
Week 5 (3/4-6): Tokenization and efficient fine-tuning
- Tokenization & T5 // [video] // [slides] // [notes]
- [reading] Neural Machine Translation... with Subword Units (Sennrich et al., ACL 2016)
- [reading] ByT5: Towards a token-free future... (Xue et al., 2021)
- Parameter-efficient adaptation // [video] // [notes]
- [reading] Power of Scale for Prompt Tuning (Lester et al., EMNLP 2021)
- [reading] LoRA: Low-Rank Adaptation of Large Language Models (Hu et al., 2021)
Week 6 (3/11-13): LLM alignment
Week 7 (3/27): Decoding from language models
- No class Monday 3/25 (Mohit traveling)