Overview

How does Netflix learn what movies a person likes? How do computers read handwritten addresses on packages, or detect faces in images? Machine learning is the practice of programming computers to learn and improve through experience, and it is becoming pervasive in technology and science. This course will cover the mathematical underpinnings, algorithms, and practices that enable a computer to learn. Topics will include supervised learning, unsupervised learning, evaluation methodology, and Bayesian probabilistic modeling. Students will learn to program in MATLAB and apply course skills to solve real-world prediction and pattern recognition problems. Programming intensive.

Instructor: Dan Sheldon, sheldon (at) cs (dot) umass (dot) edu
Lecture: Tuesday, Thursday 10:00 am–11:15 am
Fourth Hour: Friday 10:00 am–10:50 am
Location: Clapp 218
Piazza: https://piazza.com/mtholyoke/fall2014/cs335/home
Moodle: https://moodle.mtholyoke.edu/course/view.php?id=6633
Textbook: none
Office Hours: Tuesday 4–5, Thursday 3:30–4:30, Clapp 222B, or by appointment

Prerequisites

The goal of these prerequisites is to ensure that you are comfortable programming in some language, are familiar with basic CS paradigms, know elementary probability and calculus, and are generally comfortable with mathematical tools and reasoning.

Resources

There is no required textbook for this course. Here are some useful resources:

  1. Introduction to Machine Learning by Alpaydin: a very approachable undergraduate machine learning text.
  2. Artificial Intelligence: A Modern Approach by Russell and Norvig: the most widely used AI textbook. Chapters 13–15, 18, and 20 cover material related to machine learning.
  3. Pattern Recognition and Machine Learning by Bishop. Graduate / advanced undergraduate level ML text with a probabilistic / Bayesian focus.
  4. Machine Learning: A Probabilistic Perspective by Murphy. Comprehensive new ML textbook at graduate / advanced undergraduate level.
  5. The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman. Graduate level statistical view of many machine learning topics. Freely downloadable.
  6. Coursera Machine Learning course by Andrew Ng: outstanding free online ML course.
  7. Course handouts from Stanford CS 229 by Andrew Ng

The first three books (Alpaydin, Russell and Norvig, Bishop) are on three-hour reserve at the MHC library. The fourth (Murphy) has been ordered and will be on reserve soon. The others are available online for free.

MATLAB

Programming assignments will use MATLAB, which is installed on lab machines in Clapp 202 and Kendade 307. Here are some resources:

Course Objectives

The goals of the course are

Like many ML courses, this one is organized primarily as a sequence of specific techniques (see the schedule), which comprise a small subset of the available machine learning algorithms. We will learn about details of these specific techniques and also use them to explore cross-cutting concepts:

The skills learned in this class will prepare the student to explore much more widely within the field of machine learning.

Policies

The coursework will consist of:

The grading breakdown is:

Homework

Homework will be assigned and due approximately weekly. Assignments will be a mix of written problems, programming exercises, and experiments, and will tail off somewhat (become shorter and less frequent) toward the end of the semester to accommodate final projects. All code and other digital files should be submitted on Moodle by the date and time indicated on the assignment. Problem solutions may be submitted via Moodle (typed or scanned) or as hard copy. I will announce a procedure for submitting hard copies. Note that homework due dates will typically fall on non-class days, when I am not at Mount Holyoke, so submitting a hard copy will involve leaving the work in a drop box.

Late policy

Collaboration

Collaboration on assignments is encouraged. However, every student must write their own code, run their own experiments, and write their own solutions. Sharing of code or written solutions will be considered a violation of the honor code. Also, I highly encourage each student to first attempt problems on their own, especially for the shorter exercises that are designed to test and reinforce concepts taught in class. Please write the names of all collaborators at the beginning of the written portion of the submission.

Course Project

Students will work as individuals or in small groups on a final project. This can be either a hands-on application of machine learning algorithms learned in class to an interesting data set, or an in-depth exploration of a machine learning topic not covered in this class. Details will be announced later in the course.

Piazza

This term we will be using Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates and the instructor. Rather than emailing questions, I encourage you to post them on Piazza. If you have any problems or feedback for the developers, email team@piazza.com.

Find our class page at: https://piazza.com/mtholyoke/fall2014/cs335/home

Students are encouraged to help answer each other’s questions on Piazza. I will also monitor the discussion and answer questions, so you are likely to get an answer very quickly. The official policy is that the instructor will read and respond (as necessary) to new posts within 24 hours on weekdays and within 48 hours on weekends.

Participation

Participation includes arriving on time to class and required fourth hours, engaging meaningfully in lecture activities (e.g., peer discussions or exercises), giving a project presentation, and contributing to course discussions, during lecture or on the class forum, in whatever way you are most comfortable.

Accommodations

If you have a disability and would like to request accommodations, please contact AccessAbility Services, located in Wilder Hall B4, at (413) 538-2634 or accessability-services@mtholyoke.edu. If you are eligible, they will give you an accommodation letter, which you should bring to me as soon as possible.