Machine Learning and Friends Lunch

Solving Large Problem Domains with Goal Regression

Abstract

Although there are many existing techniques for finding optimal
policies in a variety of types of domains, these techniques typically
require enumeration of the underlying state space. This property
limits the size of the domains that can be solved by such techniques.
Some intractably large domains, however, can be described by a compact
set of rules. This work seeks to leverage such domain descriptions to
solve large domains.

This work defines a class of automata that describe
fully-observable, non-stochastic, sequential systems in which one or
more agents interact. An algorithm will be presented that solves for
an optimal policy in such automata without enumerating the underlying
state space. This is done by constructing equivalence classes of
problem states and regressing them through the automaton. This
technique was applied to the domain of Connect Four and generated
orders of magnitude fewer classes than states in the underlying state
space.
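
As a rough illustration of regressing equivalence classes of states
backward from a goal, here is a minimal sketch on a toy single-agent
domain. It is not the algorithm from the talk and has nothing to do with
Connect Four: the domain (a row of switches), the rule format, and every
name in it (Action, regress, subsumes, solve, best_action) are
assumptions made up for this example. A class is written as a partial
assignment and stands for every state that agrees with it.

```python
from collections import deque
from typing import Dict, List, NamedTuple, Optional, Tuple

# A partial assignment stands for the class of all states that agree with it.
Partial = Dict[int, bool]


class Action(NamedTuple):
    name: str
    pre: Partial  # must hold before the action (hypothetical rule format)
    eff: Partial  # holds after the action


def regress(cls: Partial, act: Action) -> Optional[Partial]:
    """Return the class of states from which `act` leads into `cls`, or None."""
    # The action must not undo anything the class requires.
    if any(var in cls and cls[var] != val for var, val in act.eff.items()):
        return None
    # Requirements the action does not establish must already hold beforehand.
    result = {var: val for var, val in cls.items() if var not in act.eff}
    for var, val in act.pre.items():
        if result.get(var, val) != val:
            return None  # precondition contradicts a remaining requirement
        result[var] = val
    return result


def subsumes(general: Partial, specific: Partial) -> bool:
    """True if everything matching `specific` also matches `general`."""
    return all(specific.get(var) == val for var, val in general.items())


def solve(goal: Partial, actions: List[Action]) -> List[Tuple[Partial, Optional[str], int]]:
    """Breadth-first regression from the goal.

    Returns (class, action to take, steps to goal) in order of increasing steps.
    """
    solved: List[Tuple[Partial, Optional[str], int]] = [(goal, None, 0)]
    queue = deque(solved)
    while queue:
        cls, _, steps = queue.popleft()
        for act in actions:
            prev = regress(cls, act)
            # Skip classes already covered by one that is at least as close to the goal.
            if prev is None or any(subsumes(c, prev) for c, _, _ in solved):
                continue
            entry = (prev, act.name, steps + 1)
            solved.append(entry)
            queue.append(entry)
    return solved


def best_action(state: Partial, classes) -> Optional[Tuple[Optional[str], int]]:
    """Look up a fully specified state in the first (closest) class that covers it."""
    for cls, name, steps in classes:
        if subsumes(cls, state):
            return name, steps
    return None


# Toy domain: switch i can be turned on only if switch i - 1 is already on.
N = 10
actions = [Action(f"on_{i}", {i - 1: True} if i > 0 else {}, {i: True}) for i in range(N)]
classes = solve({N - 1: True}, actions)
all_off = {i: False for i in range(N)}
print(len(classes), "classes cover", 2 ** N, "states")
print(best_action(all_off, classes))
```

In this toy domain the regression stops after 11 classes, although the
rules describe 2^10 = 1024 states, and the lookup for the all-off state
reports action on_0 with 10 steps remaining. The gap is this large only
because the toy rules are so regular; it says nothing about the figures
reported for Connect Four.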
