William Dabney

PhD Candidate
Autonomous Learning Lab
Department of Computer Science
University of Massachusetts Amherst

wdabney [at] cs [dot] umass [dot] edu


Abstract

There are a number of academic areas that continually capture my attention: Computer Science, Mathematics, Neuroscience, and Philosophy. At the intersection of them all lies the field of Reinforcement Learning. As a PhD candidate working in the Autonomous Learning Lab at UMass Amherst, I get to meld these fields into (sometimes) coherent research topics, focusing on theoretical and practical algorithms for learning optimal policies for controlling complex systems. Otherwise, I spend my time enjoying the variety of seasonal fun available here in New England and, whenever possible, alternating between dancing and frustrating my neighbors with my drumming!



Related Work
(My Anecdotal CV)

I began my research career in a summer Research Experiences for Undergraduates (REU) program at the University of Oklahoma with Dr. Amy McGovern. We worked on relational reinforcement learning, which views the world in terms of the objects within it and their relationships with each other. We developed an algorithm, Relational U-Tree, which used a combination of decision tree algorithms to learn an abstract state space over relational observations and then learned a control policy over those abstract states. We have since extended this work to the case of independently represented multi-modal observations.
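
To make the idea concrete, here is a minimal sketch of the kind of "utile distinction" test at the heart of U-Tree-style algorithms: a leaf of the tree splits on a candidate relational predicate only when the returns it separates look statistically different. The class names, the Kolmogorov-Smirnov test, and the 0.05 threshold are my illustrative assumptions, not details taken from the Relational U-Tree paper.

    # Sketch of a U-Tree-style "utile distinction" split test.
    from scipy.stats import ks_2samp

    class Leaf:
        def __init__(self):
            self.experiences = []  # (relational observation, sampled return) pairs

        def try_split(self, predicate, p_threshold=0.05):
            """Split on a candidate relational predicate only if the returns
            it separates look statistically different, i.e. the distinction
            is 'utile' for predicting value."""
            true_returns = [g for obs, g in self.experiences if predicate(obs)]
            false_returns = [g for obs, g in self.experiences if not predicate(obs)]
            if len(true_returns) < 2 or len(false_returns) < 2:
                return None  # too little experience to judge this distinction
            _, p_value = ks_2samp(true_returns, false_returns)
            return (Leaf(), Leaf()) if p_value < p_threshold else None

    # Example relational predicate, assuming observations are sets of relation tuples.
    on_table = lambda obs: ("on", "block_a", "table") in obs

The leaves of the grown tree then serve as the abstract states over which an ordinary value-based control policy is learned.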

As a graduate student, I have worked with my advisor, Andrew Barto, on projects such as Mining Joseki Books for Reusable Skills in the Game of Go, Gradient Ascent Critic Optimization, and Eliminating the Step-Size Parameter in Online Temporal Difference Learning. More recently, I've been working on algorithms for learning skill hierarchies in partially observable domains.
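
As a taste of the step-size work, here is a minimal sketch of linear TD(0) with the step size capped so that a single update can never overshoot its own target on the current transition. This is my illustration of the general idea only; the function and variable names are assumptions, and the actual algorithm appears in the AAAI-12 paper cited below.

    # Sketch: linear TD(0) with an overshoot-preventing step-size cap.
    import numpy as np

    def td0_capped_step(w, phi, reward, phi_next, gamma, alpha):
        """One TD(0) update on V(s) = w . phi(s), with alpha shrunk so the
        update cannot flip the sign of the TD error on this transition."""
        delta = reward + gamma * np.dot(w, phi_next) - np.dot(w, phi)
        # After w <- w + alpha * delta * phi, the TD error on this same
        # transition becomes delta * (1 - alpha * phi.(phi - gamma*phi_next)),
        # so alpha <= 1 / phi.(phi - gamma*phi_next) prevents overshooting.
        curvature = np.dot(phi, phi - gamma * phi_next)
        if curvature > 0.0:
            alpha = min(alpha, 1.0 / curvature)
        return w + alpha * delta * phi, alpha

Carrying the shrunken alpha forward from update to update gives a schedule that needs no hand tuning, which is the sense in which the step-size parameter can be "eliminated".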

Since Summer 2011, I've also been working with members of the Center for Intelligent Information Retrieval (CIIR) under the direction of Dr. James Allan and Dr. David Smith. We've developed Proteus, a distributed search system with applications to book search. I've also been helping with named entity disambiguation.



Experimental Results




Conclusion

Awesome.



References


Refereed conference papers

Dabney, W. and A. G. Barto (2012). Adaptive Step-Size for Online Temporal Difference Learning. Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-12). [pdf]

Dabney, W. and A. McGovern (2007). Utile Distinctions for Relational Reinforcement Learning. Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07). [pdf]

Workshop papers

Dabney, W. and A. McGovern (2006). The Thing We Tried That Worked: Utile Distinctions for Relational Reinforcement Learning. Workshop on Advances in Statistical and Relational Learning at the 23rd International Conference on Machine Learning (ICML 2006).

Technical reports

Dabney, W. and A. G. Barto (2010). Gradient Ascent Critic Optimization. Tech. rep. UM-CS-2010-052. University of Massachusetts, Amherst. [pdf]