CS 685, Fall 2025, UMass Amherst
Assignments
- HW1 - released 9/5, due 9/14 on Gradescope. [colab] [pdf]
- HW2 - released 9/19, due Wed 10/1 on Gradescope.
- HW3 - TBA
- Seminar extra credit: Attend a talk from the 692L seminar series, or another NLP-related research talk at UMass approved by the instructor. Write up the talk in several paragraphs, as described in this Google form, and submit through that form. Talk write-ups count toward the exercise component of your grade.
Schedule
Make sure to reload this page to ensure you're seeing the latest version.
Readings should be done before lecture. Note that Jurafsky and Martin (JM) readings refer to the August 2025 edition of their online SLP3 draft.
Week 1: Introduction and Language Modeling
Week 2: Embeddings and Text Classification
Week 3: Neural Networks: MLPs, RNNs, Learning
Week 4: Large language models
Week 5: Attention and Transformers
- 9/30: Transformers [slides]
- 10/2: More Transformers [slides]
- Reading: JM ch. 8, Transformers
- Optional readings on Transformers and Transformer-based generative LMs:
- Optional readings on attention mechanisms:
- Attention mechanisms in NLP initially focused on machine translation, where they naturally correspond to earlier "alignment"-based methods.
- Blog post on attention (Alammar)
- Video to understand self-attention (3Blue1Brown)
Week 6
- Exercise #2 (due before 10/7 lecture): Load and run GPT-2 as a generator. (For example, follow the HF docs' code snippets on Google Colab with a GPU enabled; use any GPT-2 variant you prefer. A minimal code sketch also appears after this week's list.) Experiment with different prompts (the left context before generation starts). Develop at least two prompts: (1) a factual question where the model makes a mistake or otherwise goes wrong; (2) an open-ended question, or a prompt that starts a story, news or social post, or other narrative text. Generate at least twice from each prompt and comment on the quality of the results. Submit everything (for example, a Colab PDF export, or a separate document if you prefer) to Gradescope before the Tuesday 10/7 lecture. The purpose of this exercise is to try out a (non-instruction-tuned) Transformer LM; come to lecture ready to discuss your experience with it.
- 10/7 lecture: Generative LLMs
- Reading: JM ch. 7, Large Language Models (also assigned earlier, but we'll cover more of its material in lecture, in particular decoding)
- Reading: Holtzman et al., ICLR 2020, "The Curious Case of Neural Text Degeneration" (on nucleus sampling).
- 10/9: Midterm #1, in class. Practice midterm questions available on the Piazza Resources page.
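
For reference, here is a minimal sketch of the kind of code Exercise #2 asks for, using the Hugging Face transformers library to load a GPT-2 variant and sample continuations of a prompt. The model name, prompt, and decoding settings below are illustrative choices, not requirements; the `top_p` flag enables nucleus sampling as in the Holtzman et al. reading.

```python
# Minimal sketch (not an official solution): load a GPT-2 variant and sample
# a few continuations for a prompt. Model, prompt, and decoding settings are
# example choices only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any GPT-2 variant works, e.g. "gpt2-medium"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The city council voted last night to"  # example open-ended prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Generate two samples from the same prompt; top_p < 1 turns on nucleus
# sampling (sample only from the smallest set of tokens whose cumulative
# probability exceeds top_p).
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=60,
        do_sample=True,
        top_p=0.9,
        num_return_sequences=2,
        pad_token_id=tokenizer.eos_token_id,
    )

for i, out in enumerate(outputs):
    print(f"--- sample {i + 1} ---")
    print(tokenizer.decode(out, skip_special_tokens=True))
```

Running this twice per prompt (or setting `num_return_sequences` higher) is enough to see how much the samples vary; comparing `do_sample=False` (greedy decoding) against sampling with different `top_p` values previews the decoding discussion in the 10/7 lecture.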
Other later dates for the course
- Mid-November: Midterm #2, in class
- 12/2, 12/4, 12/9: Final group presentations (possibly a subset of those days)
- 12/17 (end of finals week): Final projects due
Other readings
Some possible readings for later topics in the course, or for topics we may add later.
Misc