Machine Learning, Fall 2017


Course Number: COMPSCI 589
Time: MW / 2:30-3:45 PM
Room: Goessmann Lab. Add rm 64
Instructor: Justin Domke
Staff Email: Please use Piazza
Course Website: Detailed materials for the course will be hosted on Moodle. Syllabus (this page) is at http://people.cs.umass.edu/domke/courses/cs589/
Instructor Office Hours: Monday 3:45pm-4:45pm (CICS 208)
TA Hours
  1. Wednesday 3:45pm-4:4pm (LGRT 220)
  2. Friday 11:00am-noon (LGRT 225). Note: this room was previously listed incorrectly as LGRC; the correct room is LGRT 225.
  3. More hours will be added during high-volume periods

Course Description: This course will introduce core machine learning models and algorithms for classification, regression, clustering, and dimensionality reduction. On the theory side, the course will focus on understanding models and the relationships between them. On the applied side, the course will focus on effectively using machine learning methods to solve real-world problems with an emphasis on model selection, regularization, design of experiments, and presentation and interpretation of results.

Override questions: I receive so many questions about override requests for this course that, regrettably, I cannot answer each one individually. If you'd like to take this course but cannot register, please submit an override request through the online system. Above all, please describe your background in linear algebra, probability theory, and basic multivariate calculus. List any courses you've taken either in those topics or using those topics, along with any other relevant training or experience you might have. Out of fairness, emailing me directly will not result in any preferential access. Note that there will likely be a substantial waitlist for the course, so I can't guarantee anyone entry from the waitlist regardless of preparation.

Textbooks: The course readings will primarily be based on two open textbooks:

Grading Scheme:
Homework: 50%
Final: 30%
Quizzes: 15%
Course Participation: 5%
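
As a rough illustration of how the weights in the grading scheme combine, here is a minimal Python sketch. It assumes each component is scored as a percentage from 0 to 100 and that the components are combined as a plain weighted average; the syllabus does not specify rounding, curving, or how individual homeworks and quizzes are aggregated, so the function name `course_grade` and the example scores are purely hypothetical.

```python
# Weights taken from the grading scheme (they sum to 100%).
WEIGHTS = {
    "homework": 0.50,
    "final": 0.30,
    "quizzes": 0.15,
    "participation": 0.05,
}


def course_grade(scores):
    """Combine per-component percentages (0-100) into one weighted percentage.

    Assumption: a plain weighted average. This is an illustration only,
    not the official grade computation.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one student.
example = {"homework": 90.0, "final": 80.0, "quizzes": 70.0, "participation": 100.0}
print(course_grade(example))
```

With these hypothetical scores the weighted sum is 45 + 24 + 10.5 + 5 = 84.5 percent.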

Homework: There will be five homework assignments.

Quizzes: There will be approximately 5-6 quizzes, each taking place in class.
Course Final: Wed, December 20th, 3:30 - 5:30pm (In normal classroom)

Preliminary Schedule

  1. Introduction and Overview

    (Unit 1: Regression)

  2. Linear Regression, Ridge, and Lasso
  3. KNN Regression, Regression Trees, and Feature Selection
  4. Support Vector and Neural Network Regression
  5. KOLS and Gaussian Process Regression

    (Unit 2: Classification)

  6. K-Nearest Neighbors and Decision Trees
  7. Naive Bayes, LDA, and Logistic Regression
  8. Overfitting, Regularization, and Cross-Validation
  9. Support Vector Machines, Basis Expansion, and Kernels
  10. Neural Networks and Deep Learning
  11. Ensembles and Classification

    (Unit 3: Kernels)

  12. Kernel Ridge Regression
  13. Support Vector Machines

    (Unit 4: Bayesian Methods)

  14. Bayesian Methods 1
  15. Bayesian Methods 2
  16. Markov Chain Monte Carlo
  17. Generative and Discriminative Methods

    (Unit 5: Unsupervised Learning)

  18. Hierarchical Clustering
  19. K-Means
  20. Mixture Models
  21. Linear Dimensionality Reduction and SVD
  22. Principal Components Analysis

    (Final)

  23. Final Review

What is the difference between CMPSCI 589 and CMPSCI 689?: 589 has been designed to focus on understanding and applying core machine learning models and algorithms, while 689 focuses on the mathematical foundations of machine learning. While both courses require a background in multivariate calculus, linear algebra, and probability, 689 is more theoretically focused and will use more of this background material than 589. In particular, 589 will not focus on deriving learning or optimization algorithms.

Should I take CMPSCI 589 or CMPSCI 689?: 589 is appropriate as an introductory machine learning course for senior undergraduate students, masters students, and MS/PhD students interested in applying machine learning in their research. Note that 589 can count for credit for MS/PhD students, but it does not satisfy an AI core requirement. Graduate students who intend to pursue research in machine learning or who need a course to satisfy the AI core requirement should take 689. Note also that students can take 589 followed by 689, but may not take the courses in the reverse order.

Required Background: While this course has an applied focus, it still requires appropriate mathematical background in probability and statistics, calculus, and linear algebra. The official prerequisites for undergrads are CMPSCI 383 and MATH 235 (CMPSCI 240 provides sufficient background in probability and MATH 131/132 provide sufficient background in calculus). Graduate students can check the descriptions for these courses to verify that they have sufficient mathematical background for 589. The course will also use Python as its programming language, including the numpy, scipy, and scikit-learn libraries. Some familiarity with Python will be helpful, but senior CS students should be able to learn Python during the course if needed. Graduate students from outside computer science with sufficient background are also welcome to take the course. The following references can provide a useful review:

Course Policies

  • Homework Submission: Homework assignments will generally consist of developing machine learning systems in Python, evaluating the systems, and producing written reports. Both the code and report must be submitted through Moodle by the due date for a submission to be considered on time.
  • Late Homework: To allow some flexibility to complete assignments given other constraints, you have a total of five free late days. You will be charged one late day for handing in an assignment within 24 hours after it is due, two late days for handing in an assignment within 48 hours after it is due, etc. Your assignment is considered late if either the written or code portions are submitted late. The late homework clock stops when both the written and code portions are submitted. After you have used up your late days, late homework will not count for credit except in special circumstances (e.g., illness documented by a doctor's note). If you do not hand in an assignment at all, this will count as using all five late days.
  • Homework Collaboration: You are encouraged to discuss assignments and course material with other students in person or on the course forums. However, you must show that you fully understand the solution to any homework problem arising from such collaboration by writing your own code, running your own experiments, and producing your own write-up for the problem.
  • Academic Honesty Policy: You are required to list the names of anyone you discuss problems with on the first page of your solutions. Copying any solution materials from external sources (books, web pages, etc.) or other students is considered cheating. To emphasize: no detectable copying is acceptable, even, e.g., copying a single sentence from an outside source. Sharing your code or solutions with other students is also considered cheating. Any detected cheating will result in a grade of -100% on the assignment for all students involved (negative credit), and potentially a grade of F in the course.
  • Re-grading Policy: Errors in grading of assignments and exams can occur despite the best efforts of the course staff. If you believe you've found a grading error, complete the online re-grade request form. Re-grade requests must be submitted no later than one week after the assignment is returned. Note that re-grading may result in your original grade increasing or decreasing as appropriate.
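
The late-day arithmetic in the Late Homework policy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: lateness is measured in hours from the deadline to the moment both the code and the report have been submitted, a submission exactly 24 hours late is treated as falling "within 24 hours", and the names `late_days_charged` and `TOTAL_LATE_DAYS` are hypothetical. How a submission that needs more late days than remain is handled is left to the course staff and is not modeled here.

```python
import math

TOTAL_LATE_DAYS = 5  # free late days per student, per the policy


def late_days_charged(hours_late):
    """Late days charged for a single assignment.

    hours_late: hours between the deadline and the time BOTH the code and
    the written report were submitted; None means never handed in.
    Assumption: exactly 24.0 hours late still counts as one late day.
    """
    if hours_late is None:
        # Not handing in an assignment counts as using all five late days.
        return TOTAL_LATE_DAYS
    if hours_late <= 0:
        return 0  # on time
    # One day per started 24-hour block after the deadline.
    return math.ceil(hours_late / 24)


# Hypothetical semester: on time, 3 hours late, 30 hours late.
charges = [late_days_charged(h) for h in (0, 3, 30)]
print(charges, "remaining:", TOTAL_LATE_DAYS - sum(charges))
```

In this hypothetical semester the three assignments are charged 0, 1, and 2 late days, leaving 2 of the 5 free days.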