This outline is subject to change
- What is machine learning?
- Course logistics
- An example: curve fitting
- Motivation: optimization
- Derivatives: intuition and rules
- Convex, concave functions
- Find minimum of convex function by setting derivative to zero
- Paradigm: supervised learning
- Linear regression
- Setup
- Cost function
- Minimize cost function (one parameter—slope only) by setting derivative to zero
- Gradient descent
- Geometry of functions in higher dimensions
- GD for linear regression with two parameters (slope, intercept)
- Intuition of partial derivatives
- Partial derivatives of the linear regression cost function
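The gradient-descent procedure for the two-parameter (slope, intercept) cost function above can be sketched as follows. This is an illustrative Python sketch, not course code (the course uses MATLAB); the function name `fit_1d` and its defaults are my own.

```python
# Fit y = w*x + b by gradient descent on the mean squared error cost.
def fit_1d(xs, ys, lr=0.01, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of (1/2n) * sum((w*x + b - y)^2)
        dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        # Simultaneous update of both parameters
        w, b = w - lr * dw, b - lr * db
    return w, b
```

Note the simultaneous update: both partial derivatives are computed from the old `(w, b)` before either parameter changes.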
- Interactive session to show basic / important MATLAB features
- From Brown tutorial:
- Basics
- Comments
- Suppressing output
- Statements separated by commas, semicolons, or newlines
- help / doc
- Types, assignments, literals
- Entering vectors and matrices
- Accessing entries and submatrices
- Operations on vectors and matrices
      - Elementwise operators and functions
- Other vector and matrix functions
- Control flow (briefly)
- Functions
- Multiple inputs / output
      - Assignment of multiple outputs
- Did not cover
- Debugging
- Plotting
- Load and save
- Formatting strings: disp / sprintf / fprintf
- Advanced topics (to cover in future classes)
- logical indexing
- cell arrays
- structs
- Exercises
- Partial derivatives
- Intuition and geometry
- Problems
- 1D linear regression derivations
- Implement 1D linear regression by gradient descent
- Convergence of gradient descent
- Run 1D linear regression on own data
- Follow-up notes
- Feature normalization
- Gradient descent: simultaneous updates of all parameters
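The feature-normalization note above amounts to rescaling each feature to zero mean and unit standard deviation so gradient descent takes comparably sized steps in every direction. A minimal Python sketch for illustration (the helper name `normalize` is my own):

```python
def normalize(xs):
    # Rescale values to zero mean and unit standard deviation.
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    std = var ** 0.5
    return [(x - mean) / std for x in xs]
```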
- Motivation: want to move to more complicated ML setups
- Many inputs \(x_1, \ldots, x_n\)
- More complex functions, e.g. polynomials
- Linear algebra
- Succinct language for linear expressions of many variables
- Saves coding
- Inspires new ML methods
- Matrices
- Vectors
- Matrix-Matrix multiplication (and special cases)
  - Transpose
- Inverse
- MATLAB pointers
- Concatenation of vectors / matrices
- Subscripted assignment
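The core matrix operations listed above can be sketched in a few lines. This is pure Python for illustration only; in the course these are built-in MATLAB operators (`*`, `'`), and the function names here are my own:

```python
def matmul(A, B):
    # (m x k) times (k x n) -> (m x n): each entry is the dot product
    # of a row of A with a column of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    # Swap rows and columns.
    return [list(row) for row in zip(*A)]
```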
- First multivariate prediction models
- Geometry of linear functions in high dimensions
- “Tilted” planes through the origin
- (Affine function = linear function translated away from origin)
- Contours are parallel lines
- Gradient is vector orthogonal to contours
- Length of gradient = “slope” of plane
- Multivariate linear regression
- Motivation
- Model
- Cost function
- Normal equations
- Gradient descent
- Features
- Normalization
- Feature design
- Non-linearity by feature expansion
- Polynomial regression
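The normal equations give the least-squares parameters in closed form, with no gradient descent. A hedged Python sketch for the simplest case, one feature plus an intercept (the function name and the direct 2x2 solve via Cramer's rule are my own; the general case solves \((X^T X)\theta = X^T y\)):

```python
def normal_equations(xs, ys):
    # Closed-form least squares for y = w*x + b: solve the 2x2 system
    #   [sum(x^2)  sum(x)] [w]   [sum(x*y)]
    #   [sum(x)    n     ] [b] = [sum(y)  ]
    n = len(xs)
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    det = sxx * n - sx * sx
    w = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return w, b
```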
- Linear algebra exercises
- Normal equations
- Features
- Feature engineering
- Feature normalization
- Stochastic gradient descent?
- Polynomial regression
- First classifier
- Widely used “workhorse” of predictive stats and ML
- Examples
- MNIST: 4 vs. 9
- Breast cancer
- Outline
- Classification
- Model
- Cost function
- Gradient descent
- Decision boundaries
- Fit a non-linear function using linear models
- What is Overfitting?
- How to Diagnose Overfitting
- Regularization
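The logistic-regression model and its gradient-descent fit can be sketched as follows. Illustrative Python only, with names of my choosing; the key point is that the gradient of the log loss has the same form as in linear regression, with the prediction passed through a sigmoid:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    # Minimize the log loss for p(y=1|x) = sigmoid(w*x + b).
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        dw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        db = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w, b = w - lr * dw, b - lr * db
    return w, b
```

The decision boundary is where the model outputs 0.5, i.e. where `w*x + b = 0`.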
- Logistic regression
- Feature normalization
- Log loss
- Different evaluation goals
- Estimate performance of deployed system
- Model selection
- Compare algorithms
- Data splits
- Train / validation / test
- Cross-validation
- Classification performance measures
- Accuracy
- Confusion matrix
- Precision
- Recall
- F1
- Precision-recall curve
- Tools
- Grid search
- Training curve
- Precision-recall curve
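The classification measures above are all derived from confusion-matrix counts. A minimal Python sketch for the positive class (label 1); the function name is my own:

```python
def precision_recall_f1(y_true, y_pred):
    # Confusion-matrix counts for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```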
- First inherently non-linear methods
- Different learning paradigm: no training phase
- Example: time-series classification
- Nearest neighbor
- k-NN
- Kernel regression
- Linear separators
- Idea of margin
- Formulate max-margin optimization problem
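The "no training phase" paradigm above is easiest to see in code: k-NN just stores the training set and does all its work at prediction time. A one-dimensional Python sketch for illustration (function name and distance choice are my own):

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    # No training phase: classify by majority vote among the k
    # training points closest (in absolute distance) to the query.
    neighbors = sorted(zip(train_x, train_y),
                       key=lambda pair: abs(pair[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```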
- Guest lecture by Kevin Winner, UMass Ph.D. student
- Learn about Weka
- Comprehensive ML toolkit
- Easy to use
- Java based: both interactive GUI and programmatic API
- What if data is not linearly separable?
- Penalty in optimization problem
- “Hinge-loss”
- Comparison with logistic regression
- SVM dual training and prediction
- Dot-products
- Kernel trick: redefine dot-product
- Feature expansions
- Example kernels
- SVM with RBF kernel visualization
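The kernel trick above replaces the dot product \(\langle \phi(x), \phi(z) \rangle\) with a function computed directly on the raw inputs. A sketch of the RBF kernel in Python for illustration (parameter name `gamma` follows common convention):

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    # exp(-gamma * ||x - z||^2): equivalent to a dot product in an
    # infinite-dimensional feature expansion, but cheap to evaluate.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
```

Note that the kernel is 1 when the points coincide and decays toward 0 as they move apart.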
- Fourth Hour
- Probability spaces and events
- Conditional probability
- Random variables
- Expected value
- Bayes rule
- Generative model
- Naive Bayes
- Text classification
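The Naive Bayes text classifier above can be sketched with word counts, class priors, and Bayes rule in log space. An illustrative Python sketch with add-one (Laplace) smoothing; the function names and data layout are my own:

```python
import math
from collections import Counter

def train_nb(docs, labels):
    # Per-class word counts plus class priors.
    classes = set(labels)
    counts = {c: Counter() for c in classes}
    priors = Counter(labels)
    for doc, c in zip(docs, labels):
        counts[c].update(doc.split())
    vocab = {w for cnt in counts.values() for w in cnt}
    return classes, counts, priors, vocab, len(docs)

def predict_nb(model, doc):
    classes, counts, priors, vocab, n = model
    def score(c):
        # log p(c) + sum of log p(word | c), with add-one smoothing.
        total = sum(counts[c].values())
        s = math.log(priors[c] / n)
        for w in doc.split():
            s += math.log((counts[c][w] + 1) / (total + len(vocab)))
        return s
    return max(classes, key=score)
```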
- This may be too hard… there is no way to get there
- Stochastic gradient descent
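Stochastic gradient descent updates the parameters after each single example rather than after a full pass over the data. A hedged Python sketch for the 1D linear-regression case (function name and hyperparameter defaults are my own):

```python
import random

def sgd_fit(xs, ys, lr=0.01, epochs=50, seed=0):
    # Update (w, b) from one example at a time, in shuffled order.
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            err = w * xs[i] + b - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b
```

Each step uses a noisy single-example gradient, so the path is jittery, but each epoch costs the same as one batch gradient step.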