CMPSCI 687
Reinforcement Learning
Spring 2006
Course Information
This course will provide a comprehensive introduction to reinforcement learning, a powerful approach to learning from interaction to achieve goals in stochastic and
incompletely-known environments. Reinforcement learning has adapted key ideas from machine learning, operations research, control theory, psychology, and
neuroscience to produce some strikingly successful engineering applications.
The focus is on algorithms for learning what actions to take, and when to take them, so as to
optimize long-term performance. This may involve sacrificing immediate
reward to obtain greater reward in the long term or just to obtain more
information about the environment. The course will cover Markov decision
processes, dynamic programming, temporal-difference learning, Monte Carlo
reinforcement learning methods, eligibility traces, the role of function approximation, and the
integration of learning and planning. We will also introduce policy gradient methods, methods for partially observable problems, hierarchical learning, and connections to the brain's reward systems.
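As a rough illustration of the trade-off between immediate and long-term reward described above, the sketch below compares two hypothetical reward sequences. It assumes that long-term performance is measured by discounted return, which is one common formalization and not something this page specifies. The function name, the reward values, and the discount factor are all made up for the example.

```python
def discounted_return(rewards, gamma):
    """Return the sum of rewards, each weighted by gamma**t for its time step t."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


GAMMA = 0.9  # hypothetical discount factor, chosen only for illustration

greedy = [1.0, 0.0, 0.0, 0.0]   # take a small reward immediately
patient = [0.0, 0.0, 0.0, 5.0]  # forgo it for a larger reward three steps later

greedy_return = discounted_return(greedy, GAMMA)
patient_return = discounted_return(patient, GAMMA)

print(f"greedy:  {greedy_return:.3f}")
print(f"patient: {patient_return:.3f}")
```

Under these assumed numbers the patient sequence has the higher return, even though it earns nothing at first.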
Lecture:
Tuesday & Thursday 9:30-10:45, CMPS 150
Prerequisites:
Interest in learning approaches to artificial
intelligence; basic probability theory; computer programming ability.
If you have passed Math 515 or equivalent, you have enough basic probability theory.
If you have passed a programming course at the level of CMPSCI 287, you have enough programming ability; knowledge of C++ is recommended. Please talk with the instructor if you want to take the course but have doubts about your qualifications.
Credit:
3 units
Instructor:
Andrew Barto, barto [at] cs [dot] umass [dot] edu, 545-2109
- Office hours, CMPS 272: Tuesdays 11:00-12:00 (except 2/14, 3/14, 4/11, 4/18, and 5/9) and Wednesdays 1:30-3:00 (except 4/19). No office hours during Spring Break.
Teaching assistant:
Andrew Stout,
[andrew's last name]@cs.umass.edu
Required book:
We will be using a textbook by R. S. Sutton and A. G. Barto:
Reinforcement Learning:
An Introduction. Cambridge, MA: MIT Press, 1998.
Clicking on the title will take you to a full description of the book, which gives a detailed look at what will be covered in this course. I did not order the book through the Textbook Annex or a local bookstore. The full text of the book is on the web, so you don't really need to buy it (though a printed copy would be more convenient).
The Plan:
The plan is to cover the complete contents of the book, plus supplementary readings that will be made available when needed.
Some of these readings will be assigned, and the course schedule will indicate when you should have finished each one; others are only suggested. See the detailed schedule by clicking here or the Schedule link at the bottom of the page. The schedule is subject to revision!
Required work:
- Written exercises:
Several exercise sets will be assigned. Most, but not all, of the exercises will come from the textbook. The assignments and their due dates are indicated on the schedule. Hand in paper versions at the beginning of class on the day they are due. All exercises will be marked and returned to you. In most cases, an answer sheet for each exercise set will be made available at the end of the class in which the set is due, so you must turn in your exercises on time. You are expected to spend time studying the answers provided. Since we have a large class, you should work in teams of two on these exercise sets; that is, each team will hand in one paper for each exercise set. If a team cannot reach consensus on the answer to a question, individuals may hand in separate answers. You will be asked to inform the TA of the composition of the teams, and we expect that teams will ordinarily remain the same throughout the term.
- Programming exercises:
Each team of two will complete four programming exercises and, for each one, hand in the results of its work on paper at the beginning of class on the day it is due. The programming assignments and their due dates are indicated on the schedule. Assignment details will appear here and can also be reached through the Homework link at the bottom of the page.
- Exams:
There will be a closed-book in-class midterm and a closed-book final exam during the exam period. The midterm date is on the schedule. The final exam date has been announced: Monday May 22, 8:00 am, LGRT323. If you have an unavoidable conflict with the final exam date, you need to e-mail me at least two weeks before the exam.
Grading:
- Midterm (20%), Final (25%)
- Written homeworks (30%)
- Programming homeworks (25%)
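As a quick check on how the weights above combine, here is a minimal sketch that computes an overall score from the four components. The weights come from the list above; the function name, the 0-100 scale, and the example scores are hypothetical, and the page does not say how scores map to letter grades.

```python
# Component weights taken from the grading list above.
WEIGHTS = {
    "midterm": 0.20,
    "final": 0.25,
    "written": 0.30,
    "programming": 0.25,
}


def course_score(scores):
    """Return the weighted average of component scores (each assumed to be 0-100)."""
    # Sanity check: the weights should account for the whole grade.
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one student, for illustration only.
example_scores = {"midterm": 80, "final": 90, "written": 85, "programming": 95}
overall = course_score(example_scores)
print(f"overall score: {overall:.2f}")
```

Note that the two homework components together carry 55% of the total, slightly more than the two exams combined.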
Related Courses Elsewhere