Philip S. Thomas · CMPSCI 390A: Introduction to Machine Learning

Spring 2021, University of Massachusetts

Lecture Times: Tuesdays and Thursdays, 11:30am-12:45pm Eastern

Course Information

Lecture: 11:30am-12:45pm Tuesdays and Thursdays

Zoom link: https://umass-amherst.zoom.us/j/94525449604

Description

The course provides an introduction to machine learning algorithms and applications. Machine learning algorithms answer the question: "How can a computer improve its performance based on data and from its own experience?" The course is roughly divided into thirds: supervised learning (learning from labeled data), reinforcement learning (learning via trial and error), and real-world considerations like ethics, safety, and fairness. Specific topics include linear and non-linear regression, (stochastic) gradient descent, neural networks, backpropagation, classification, Markov decision processes, state-value and action-value functions, temporal difference learning, actor-critic algorithms, the reward prediction error hypothesis for dopamine, connectionism for philosophy of mind, and ethics, safety, and fairness considerations when applying machine learning to real-world problems.

Download Course Notes .pdf

TAs and Office Hours

The TAs for this course are Cooper Sigrist (csigrist@umass.edu) and Scott Jordan (sjordan@cs.umass.edu). Scott primarily handles assignments and grading, so direct grading questions to him. Cooper primarily holds office hours.

Office hours will be at the following times:

Day	Time	Person	Link
Monday	8:00am-9:45am	Cooper Sigrist	link
Tuesday	1:00pm-3:00pm	Cooper Sigrist	link
Wednesday	4:00pm-6:00pm	Philip Thomas	link
Thursday	8:00am-9:45am	Cooper Sigrist	link
Friday	1:00pm-3:00pm	Cooper Sigrist	link

Office hours will follow the UMass Academic Calendar [link]. For example, Monday, March 1 will follow a Wednesday schedule, so Philip Thomas will hold office hours that day and Cooper Sigrist will not. Office hours will run up to and including the last day of classes, May 4.

Assignments

Homework 1 was assigned on 2 February 2021 and is due at 11:00am on 4 February 2021. [link (.pdf)]
Homework 2 was assigned on 8 February 2021 and is due at 11:00am on 11 February 2021. [link (.zip)]
Homework 3 was assigned on 11 February 2021 and is due at 11:00am on 18 February 2021. [link (.zip)]
Homework 4 was assigned on 19 February 2021 and is due at 11:00am on 25 February 2021. [link (.zip)]
Homework 5 was assigned on 5 March 2021 and is due at 11:00am on 16 March 2021. [link (.zip)]
Homework 6 was assigned on 8 April 2021 and is due at 11:00am on 20 April 2021. [link (.zip)]
Homework 7 was assigned on 22 April 2021 and is due at 11:00am on 29 April 2021. [link (.zip)]

Schedule

Part I: Supervised Learning

Lecture	Topic	Reading	Whiteboard
1	Introduction	Chapter 1 (course notes)	link
2	Regression, k-Nearest Neighbors, Linear Regression I	Chapter 2 (course notes)	link
3	Linear Regression II	Chapter 3	link
4	Linear Regression III, Gradient Descent	Chapter 4	link
5	Gradient Descent (continued)	Chapter 5	link
6	Basis functions, feature normalization, perceptrons	Chapter 6	link
7	Perceptrons	Chapter 7	link
8	Artificial Neural Networks	Chapter 8	link
9	Backpropagation	Chapter 9	link
10	Supervised Learning - Other Topics	Chapter 10	link

Part II: Reinforcement Learning

Lecture	Topic	Reading	Slides
11	Introduction	Chapter 11	link
12	MENACE, Notation, and Problem Formulation	Chapter 12	link
13	Episodes and Policy Representations	Chapter 13	link
14	Midterm Solutions and Linear Softmax Policies	Chapter 13 (linear softmax content added)	link
15	MENACE-like RL Algorithm	Chapter 14	link
16	Value functions and TD error	Chapter 15	link
17	Review	No readings	No whiteboard
18	Actor-Critics, Options, and Off-Policy Evaluation	Chapter 16	link

Part III: Ethics, Safety, Fairness, and Connections to other Areas

Topic	Reading	Slides
Connections to psychology and neuroscience	Sutton and Barto Chapters 14 and 15 [link]; A Neural Substrate of Prediction and Reward [link]; Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control [link]; Gero Miesenboeck TED Talk [link]	link
Fairness, Accountability, and Transparency	Slides posted on Moodle	No whiteboard
Philosophy of Mind	Slides posted on Moodle	No whiteboard
Ethics	See Moodle for Google Docs	No whiteboard
Ethics and Safety	See Moodle for Google Docs	link
Final Exam Review	See Moodle for Google Docs	link