CS 685, Fall 2025, UMass Amherst
Assignments
- HW1 [colab] [pdf], released 9/5, due 9/14 on Gradescope.
- HW2 - released 9/19, due Wed 10/1 on Gradescope.
- HW3: Part 1 (colab), Part 2 (PDF); released 11/7. Due Mon 11/17 on Gradescope (both parts). PDF updated 11/10.
- Seminar extra credit: Attend a talk from the 692L seminar series, or another NLP-related research talk at UMass approved by the instructor. Write up the talk in several paragraphs, as described in this Google form, and submit it through that form. Talk write-ups count toward the exercise component of your grade; details are on the Grading page.
- New requirement, added 11/18: A talk writeup must be submitted within 3 days of the talk itself. For talks before 11/18, if you are still prepared to write a high-quality writeup based on your own attendance and viewing of the talk (not just a transcript), writeups are due 11/21.
Schedule
Make sure to reload this page to ensure you're seeing the latest version.
Readings should be done before lecture. Note Jurafsky and Martin (JM) readings refer to the August 2025 edition of their online SLP3 draft.
Week 1: Introduction and Language Modeling
Week 2: Embeddings and Text Classification
Week 3: Neural Networks: MLPs, RNNs, Learning
Week 4: Large language models
Week 5: Attention and Transformers
- 9/30: Transformers [slides]
- 10/2: More Transformers [slides]
- Reading: JM ch. 8, Transformers
- Optional readings on Transformers and Transformer-based generative LMs:
- Cosma Shalizi's notes on Transformers, including the kernel smoothing viewpoint (Tsai et al., EMNLP 2019) and many other references
- Blogpost: Illustrated GPT-2
- OpenAI's GPT line of work consists of a series of non-peer-reviewed tech reports with progressively less technical detail: for example, GPT (Radford et al. 2018), GPT-2 (Radford et al. 2019), GPT-3 (Brown et al. 2020), and InstructGPT (Ouyang et al. 2022), the last of which was used in the original December 2022 version of ChatGPT.
- Blogpost: "Let's reproduce GPT-2", Karpathy, July 2024; short article and full reproduction of GPT-2 (to the extent it's possible), with implementation.
- Optional readings on attention mechanisms (a small self-attention sketch follows this week's readings):
- Attention mechanisms in NLP were initially developed for machine translation, where attention weights naturally correspond to earlier "alignment"-based methods.
- Blog post on attention (Alammar)
- Video to understand self-attention (3Blue1Brown)
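To make these readings concrete, here is a minimal sketch of scaled dot-product self-attention (the core operation in JM ch. 8); it is my own illustration rather than code from any of the readings, and the matrix names and sizes are arbitrary.

```python
# Minimal single-head scaled dot-product self-attention, in NumPy.
# Illustrative only: names (X, W_q, W_k, W_v) and shapes are arbitrary.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_head)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)         # each row is a distribution over positions
    return weights @ V                         # each output is a weighted average of values

# Toy example: 4 tokens, d_model = 8, one head of size 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```

The row-normalized `weights` matrix is what the kernel-smoothing view in Shalizi's notes describes: each output is a smoothed average of the value vectors, weighted by similarity.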
Week 6
- Exercise #2 (due before the 10/7 lecture): Load and run GPT-2 as a generator; for example, follow the HF docs' code snippets on Google Colab with a GPU enabled, using any GPT-2 variant you prefer (a minimal sketch appears at the end of this week's entries). Experiment with different prompts (the left context before generation starts). Develop at least two prompts: (1) a factual question where the model makes a mistake or otherwise does something wrong; (2) an open-ended question, or a prompt to start a story, news or social post, or other narrative text. For each prompt, generate at least twice and comment on the quality of the results. Submit everything (for example, Colab PDF output, or a separate document if you prefer) to Gradescope before the Tuesday 10/7 lecture. The purpose of this exercise is to try out a (non-instruction-tuned) Transformer LM; come to lecture ready to discuss your experience with it.
- 10/7: Generative LLMs [slides]
- Reading: JM ch. 7, Large Language Models (also assigned earlier, but we'll cover more of its material in lecture, in particular decoding)
- Reading: Holtzman et al., ICLR 2020, "The Curious Case of Neural Text Degeneration" (on nucleus sampling).
- 10/9: Midterm #1, in class. Practice midterm questions available on the Piazza Resources page.
- Due next week: Project proposal, following these instructions / template.
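For Exercise #2 above, here is a minimal sketch of one way to load GPT-2 and sample from it with the Hugging Face transformers pipeline; the model variant, prompts, and sampling settings are illustrative choices, not requirements. Setting do_sample=True with top_p corresponds to the nucleus sampling of the Holtzman et al. reading.

```python
# Minimal GPT-2 generation sketch (assumes the Hugging Face transformers
# library, e.g. on Colab). Model, prompts, and settings are illustrative.
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="gpt2")  # or gpt2-medium, gpt2-large, ...
set_seed(685)

prompts = [
    "The capital of Australia is",    # factual prompt: check whether the model slips up
    "Once upon a time in Amherst,",   # open-ended / story prompt
]

for prompt in prompts:
    # do_sample=True with top_p is nucleus sampling (Holtzman et al., 2020);
    # the exercise asks for at least two generations per prompt.
    outputs = generator(prompt, max_new_tokens=60, do_sample=True,
                        top_p=0.9, num_return_sequences=2)
    for out in outputs:
        print(out["generated_text"])
        print("---")
```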
Week 7
Week 8: Instructions and alignment
- 10/21: [slides]
- 10/23: [slides]
- Reading: JM ch. 9, Post-training
- Optional: Lambert (2025), Reinforcement Learning from Human Feedback, online book/tutorial. The author has also produced posts, videos, and slide decks on the topic; for example, their tutorial at ICML, summer 2023.
- Optional: Ivison et al., NeurIPS 2024, Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback (a small sketch of the DPO objective follows this week's readings).
- Optional: "Beyond base models - post-training in 2025" section within "The Smol Training Playbook", Hugging Face, Oct 30, 2025. Overviews more options and the authors' views of current best practices.
Week 9: Factuality and NLG Evaluation
Week 10
Week 11: Reasoning
- 11/11: no class (Veterans Day)
- 11/13: Reasoning [slides]
- Reading: Sprague et al., ICLR 2025, "To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning."
- Optional: Kojima et al., NeurIPS 2022, "Large Language Models are Zero-Shot Reasoners" (the "Let's think step by step" paper mentioned during lecture; a tiny prompt example follows this week's readings)
- Optional: Wei et al., NeurIPS 2022, "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models."
- Optional: DeepSeek-AI, 2025, the DeepSeek-R1 paper.
- Optional: Lambert et al., 2025, the Tulu 3 paper.
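As a tiny illustration of the zero-shot chain-of-thought trick from the Kojima et al. reading: the prompt is unchanged except for an appended instruction. The arithmetic question below is adapted from examples in the chain-of-thought literature, and which model you send it to is up to you.

```python
# Zero-shot chain-of-thought prompting (Kojima et al., 2022) is just an
# appended instruction; the question here is an illustrative arithmetic example.
question = ("A cafeteria had 23 apples. It used 20 for lunch and bought 6 more. "
            "How many apples does it have now?")
plain_prompt = question
cot_prompt = question + "\nLet's think step by step."
# cot_prompt typically elicits intermediate reasoning before the final answer;
# Sprague et al. find this helps mainly on math and symbolic reasoning.
print(cot_prompt)
```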
Week 12
Week 13
Weeks 14 and 15: Final presentations
Final presentations will take place across the three class meetings in these weeks.
After end of classes
- 12/17 (end of finals week): Final projects due. Template/directions to be posted.