CS 685, Fall 2025, UMass Amherst
Assignments
- HW1 [colab] [pdf], released 9/5, due 9/14 on Gradescope.
- HW2 - released 9/19, due Wed 10/1 on Gradescope.
- HW3: Part 1 (colab), Part 2 (PDF); released 11/7. Due Mon 11/17 on Gradescope (both parts). PDF updated 11/10.
- Seminar extra credit: Attend a talk from the 692L seminar series, or another NLP-related research talk at UMass approved by the instructor. Write up the talk in several paragraphs, as described in this Google form, and submit it through that form. Talk write-ups count toward the exercise component of your grade; details are on the Grading page.
- New requirement, added 11/18: A talk writeup must be submitted within 3 days of the talk itself. For talks before 11/18, if you are still prepared to write a high-quality writeup based on your own attendance and viewing of the talk (not just a transcript), writeups are due 11/21.
Schedule
Make sure to reload this page to ensure you're seeing the latest version.
Readings should be done before lecture. Note Jurafsky and Martin (JM) readings refer to the August 2025 edition of their online SLP3 draft.
Week 1: Introduction and Language Modeling
Week 2: Embeddings and Text Classification
Week 3: Neural Networks: MLPs, RNNs, Learning
Week 4: Large language models
Week 5: Attention and Transformers
- 9/30: Transformers [slides]
- 10/2: More Transformers [slides]
- Reading: JM ch. 8, Transformers
- Optional readings on Transformers and Transformer-based generative LMs:
- Cosma Shalizi's notes on Transformers, including the kernel smoothing viewpoint (Tsai et al., EMNLP 2019) and many other references
- Blogpost: Illustrated GPT-2
- OpenAI's GPT line of work consists of a series of non-peer-reviewed tech reports with progressively less technical detail: for example, GPT (Radford et al. 2018), GPT-2 (Radford et al. 2019), GPT-3 (Brown et al. 2020), and InstructGPT (Ouyang et al. 2022), the last of which was used in the original December 2022 version of ChatGPT.
- Blogpost: "Let's reproduce GPT-2", Karpathy, July 2024; short article and full reproduction of GPT-2 (to the extent it's possible), with implementation.
- Optional readings on attention mechanisms (a small self-attention sketch follows this week's readings):
- Attention mechanisms in NLP were initially developed for machine translation, where attention weights naturally correspond to earlier "alignment"-based methods.
- Blog post on attention (Alammar)
- Video to understand self-attention (3Blue1Brown)
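To make these readings concrete, here is a minimal sketch of scaled dot-product self-attention (the core operation in JM ch. 8); it is my own illustration rather than code from any of the readings, and the matrix names and sizes are arbitrary.

```python
# Minimal single-head scaled dot-product self-attention, in NumPy.
# Illustrative only: names (X, W_q, W_k, W_v) and shapes are arbitrary.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_head)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)         # each row is a distribution over positions
    return weights @ V                         # each output is a weighted average of values

# Toy example: 4 tokens, d_model = 8, one head of size 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```

The row-normalized `weights` matrix is what the kernel-smoothing view in Shalizi's notes describes: each output is a smoothed average of the value vectors, weighted by similarity.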
Week 6
- Exercise #2 (due before the 10/7 lecture): Load and run GPT-2 as a generator; for example, follow the HF docs' code snippets on Google Colab with a GPU enabled, using any GPT-2 variant you prefer (a minimal sketch appears at the end of this week's entries). Experiment with different prompts (the left context before generation starts). Develop at least two prompts: (1) a factual question where the model makes a mistake or otherwise does something wrong; (2) an open-ended question, or a prompt to start a story, news or social post, or other narrative text. For each prompt, generate at least twice and comment on the quality of the results. Submit everything (for example, Colab PDF output, or a separate document if you prefer) to Gradescope before the Tuesday 10/7 lecture. The purpose of this exercise is to try out a (non-instruction-tuned) Transformer LM; come to lecture ready to discuss your experience with it.
- 10/7: Generative LLMs [slides]
- Reading: JM ch. 7, Large Language Models (also assigned earlier, but we'll cover more of its material in lecture, in particular decoding)
- Reading: Holtzman et al., ICLR 2020, "The Curious Case of Neural Text Degeneration" (on nucleus sampling).
- 10/9: Midterm #1, in class. Practice midterm questions available on the Piazza Resources page.
- Due next week: Project proposal, following these instructions / template.
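For Exercise #2 above, here is a minimal sketch of one way to load GPT-2 and sample from it with the Hugging Face transformers pipeline; the model variant, prompts, and sampling settings are illustrative choices, not requirements. Setting do_sample=True with top_p corresponds to the nucleus sampling of the Holtzman et al. reading.

```python
# Minimal GPT-2 generation sketch (assumes the Hugging Face transformers
# library, e.g. on Colab). Model, prompts, and settings are illustrative.
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="gpt2")  # or gpt2-medium, gpt2-large, ...
set_seed(685)

prompts = [
    "The capital of Australia is",    # factual prompt: check whether the model slips up
    "Once upon a time in Amherst,",   # open-ended / story prompt
]

for prompt in prompts:
    # do_sample=True with top_p is nucleus sampling (Holtzman et al., 2020);
    # the exercise asks for at least two generations per prompt.
    outputs = generator(prompt, max_new_tokens=60, do_sample=True,
                        top_p=0.9, num_return_sequences=2)
    for out in outputs:
        print(out["generated_text"])
        print("---")
```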
Week 7
Week 8: Instructions and alignment
- 10/21: [slides]
- 10/23: [slides]
- Reading: JM ch. 9, Post-training
- Optional: Lambert (2025), Reinforcement Learning from Human Feedback, online book/tutorial. The author has also produced posts, videos, and slide decks on the topic; for example, their tutorial at ICML, summer 2023.
- Optional: Ivison et al., NeurIPS 2024, Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback (a small sketch of the DPO objective follows this week's readings).
- Optional: "Beyond base models - post-training in 2025" section within "The Smol Training Playbook", Hugging Face, Oct 30, 2025. Overviews more options and the authors' views of current best practices.
Week 9: Factuality and NLG Evaluation
Week 10
Week 11: Reasoning
- 11/11: no class (Veterans Day)
- 11/13: Reasoning [slides]
- Reading: Sprague et al., ICLR 2025, "To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning."
- Optional: Kojima et al., NeurIPS 2022, "Large Language Models are Zero-Shot Reasoners" (the "Let's think step by step" paper mentioned during lecture; a tiny prompt example follows this week's readings)
- Optional: Wei et al., NeurIPS 2022, "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models."
- Optional: DeepSeek-AI, 2025, the DeepSeek-R1 paper.
- Optional: Lambert et al., 2025, the Tulu 3 paper.
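As a tiny illustration of the zero-shot chain-of-thought trick from the Kojima et al. reading: the prompt is unchanged except for an appended instruction. The arithmetic question below is adapted from examples in the chain-of-thought literature, and which model you send it to is up to you.

```python
# Zero-shot chain-of-thought prompting (Kojima et al., 2022) is just an
# appended instruction; the question here is an illustrative arithmetic example.
question = ("A cafeteria had 23 apples. It used 20 for lunch and bought 6 more. "
            "How many apples does it have now?")
plain_prompt = question
cot_prompt = question + "\nLet's think step by step."
# cot_prompt typically elicits intermediate reasoning before the final answer;
# Sprague et al. find this helps mainly on math and symbolic reasoning.
print(cot_prompt)
```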
Week 12
Week 13
Weeks 14 and 15: Final presentations
Final presentations will take place across the three class meetings in these weeks.
After end of classes
- 12/17 (end of finals week): Final projects due. Template/directions to be posted.