Chapter 9: Planning and Learning

12/8/99

Table of Contents

Chapter 9: Planning and Learning

Models

Planning

Planning Cont.

Learning, Planning, and Acting

Direct vs. Indirect RL

The Dyna Architecture (Sutton 1990)

The Dyna-Q Algorithm

Dyna-Q on a Simple Maze

Dyna-Q Snapshots: Midway in 2nd Episode

When the Model is Wrong: Blocking Maze

Shortcut Maze

What is Dyna-Q?

Prioritized Sweeping

Prioritized Sweeping

Prioritized Sweeping vs. Dyna-Q

Rod Maneuvering (Moore and Atkeson 1993)

Full and Sample (One-Step) Backups

Full vs. Sample Backups

Trajectory Sampling

Trajectory Sampling Experiment

Heuristic Search

Summary

Author: Andy Barto

Email: barto@cs.umass.edu