Machine Learning
CS 689, Spring 2023, UMass Amherst CS
Instructor: Brendan O'Connor, brenocon AT cs.umass.edu
All materials are accessible via the Piazza page for this course.
Syllabus
Course Description: Machine learning is the computational study of artificial systems that can adapt to novel situations, discover patterns from data, and improve performance with practice. This course will cover the mathematical foundation of supervised and unsupervised learning. The course will provide a state-of-the-art overview of the field, with an emphasis on implementing and deriving learning algorithms for a variety of models from first principles.
Overview: The information below is designed to help you decide whether this is the right course for you at this time and how to be successful in the course.
Machine learning at the PhD level aims to prepare students to participate in machine learning research. It requires both strong mathematical foundations and the ability to implement algorithms with a high degree of precision and computational efficiency. Specifically, the course requires a solid undergraduate-level background in linear algebra, vector calculus, multivariate probability, and numerical programming in Python. Students who need to acquire or substantially revise this background material should plan to spend significant additional time on assignments.
The three main steps you can take to succeed in the course are:
 Make sure 689 is the right course for you and this is the right time to take it. MS students with no background in ML are strongly encouraged to take 589 prior to taking 689, unless they have extremely strong backgrounds in all areas (e.g., dual major in Math and CS, undergrad in CS and prior MS in math). MS/PhD students who are interested in applied machine learning are also strongly encouraged to take 589 before (or instead of) taking 689. 589 counts as a 500-level elective for MS/PhD students. MS students who want experience with the mathematical foundations of machine learning and MS/PhD students who plan to conduct research in ML or a related area (vision, NLP, AI, etc.) should take 689.

Set up your schedule to accommodate the course. All students are strongly advised against taking 689 in combination with any other PhD-level core course unless they have extremely strong backgrounds in all areas. You can make up gaps in background at the same time you learn primary course material, but you will need to be prepared to devote extra time to the course to do so.

Start addressing weaknesses in your background now. 689 starts with the assumption that you have sufficient background knowledge of linear algebra, vector calculus, multivariate probability, and Python, and will integrate aspects of these topics together from the outset (e.g., using differential calculus to derive a method for optimizing the parameters of a multivariate probability density over a vector space and then implementing the method in Python). The course does not cover background topics, but to help you prepare we have assembled a reading list that covers what you need to know to get started in the course. Reviewing all of the material below with a focus on weaker areas is a good strategy for all students. The specific sources below may cover material at a deeper level than is included in some undergrad CS programs (for example, computational complexity of linear algebra operations).
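To make the kind of integration described in the last step concrete, here is a minimal sketch (not taken from the course materials) of one such exercise: fitting a multivariate Gaussian by maximum likelihood. Setting the gradient of the log-likelihood to zero gives the closed-form estimates used below, the sample mean and the 1/N sample covariance. The function names, the synthetic data, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def gaussian_mle(X):
    """Closed-form maximum likelihood estimates for a multivariate Gaussian.

    X is an (N, D) array of N observations. Setting the gradient of the
    log-likelihood to zero yields the sample mean and the biased (1/N)
    sample covariance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mu = X.mean(axis=0)
    centered = X - mu
    sigma = centered.T @ centered / n
    return mu, sigma


def avg_log_likelihood(X, mu, sigma):
    """Average Gaussian log-density of the rows of X under (mu, sigma)."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    centered = X - mu
    # Solve a linear system rather than forming the explicit inverse.
    solved = np.linalg.solve(sigma, centered.T).T
    mahalanobis = np.sum(centered * solved, axis=1)
    _, logdet = np.linalg.slogdet(sigma)
    return float(np.mean(-0.5 * (d * np.log(2.0 * np.pi) + logdet + mahalanobis)))


# Hypothetical synthetic data, used only to exercise the functions above.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[1.0, -2.0],
                            cov=[[2.0, 0.5], [0.5, 1.0]],
                            size=500)
mu_hat, sigma_hat = gaussian_mle(X)
print("estimated mean:", mu_hat)
print("estimated covariance:\n", sigma_hat)
print("average log-likelihood:", avg_log_likelihood(X, mu_hat, sigma_hat))
```

If each piece of this sketch (the derivation behind the estimates, the matrix operations, and the NumPy calls) feels familiar, your background is probably in reasonable shape; if not, the reading list is the place to start.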
Suggested Reading List:
Covering the math in the order listed below is likely to be most helpful. For calculus, either Corral or Marsden and Tromba can be used. Marsden and Tromba is more detailed, but Corral is sufficient. All texts are open access or freely available through the UMass Library (links provided), except for Marsden and Tromba.
Students should feel free to discuss background material among themselves on Piazza using the background tag.
 Zico Kolter. Linear Algebra Review and Reference (2008 version), and also videos. Sometimes very brief, but covers most of the necessary linear algebra and multivariate calculus topics in this course.

Stephen Boyd and Lieven Vandenberghe. Introduction to Applied Linear Algebra.
 Chapter 1: Vectors
 Chapter 2.1: Linear Functions
 Chapter 3: Norm and Distance
 Chapter 5: Linear Independence
 Chapter 6: Matrices
 Chapter 8: Linear Equations (Can skip 8.2)
 Chapter 10: Matrix Multiplication
 Chapter 11: Matrix Inverses

Stephen Boyd and Lieven Vandenberghe. Convex Optimization. (Covers additional linear algebra background missing from the Applied text)
 Appendix A.1, A.3, A.4, A.5
 Appendix C.1, C.2, C.3, C.4

Michael Corral. Vector Calculus.
 Chapter 1: Vectors in Euclidean Space (1.1 to 1.6, 1.8)
 Chapter 2: Functions of Several Variables (2.1 to 2.5)
 Chapter 3: Double Integrals (3.1, 3.3, 3.4, 3.7)

Marsden and Tromba. Vector Calculus
 Chapter 1: Geometry of Euclidean Space (1.1, 1.2, 1.3, 1.5)
 Chapter 2: Differentiation (2.1, 2.2, 2.3, 2.5, 2.6)
 Chapter 3: Higher Order Derivatives (3.1, 3.3)
 Chapter 4: Vector Valued Functions (4.1)
 Chapter 5: Double and Triple Integrals (5.1, 5.2, 5.5)

Bishop. Pattern Recognition and Machine Learning (probability from an ML perspective)
 Chapter 1: Introduction (1.2)
 Chapter 2: Probability Distributions (2.1, 2.2, 2.3, 2.4)

Murphy. Machine Learning: A Probabilistic Perspective (more probability from an ML perspective)

Scipy Lecture Notes (Python background)