Philip S. Thomas, M.S.

Graduate Student
Autonomous Learning Laboratory

Department of Computer Science
University of Massachusetts Amherst
140 Governors Drive
Amherst, MA 01003-9264 U.S.A.

pthomas [æt] cs [daat] umass [daat] edu
Adviser: Andrew Barto



Research Interests:

My general area of interest is reinforcement learning (RL): creating machines that can learn when provided rewards relating to their performance. They learn in a way similar to the way dogs are trained: when they do something well, they are given a reward. When they do something wrong, they are punished. My interest in RL stems from these similarities to the ways in which animals (including humans) learn.

One problem with RL is that simple problems can be hard to solve if they are poorly represented. For example, when learning to play tic-tac-toe, you think in terms of where to place x's and o's. Represented this way, the problem is solvable by a common RL agent. However, if the agent is instead given a grayscale picture of the board (taken from a slightly different angle each time), the problem becomes intractable for modern RL agents. To see why, think of the image as a large grid of numbers, each representing how dark one pixel is. If you didn't know beforehand that these numbers represent a board with x's and o's, the task would be almost impossible! You see some numbers, choose one of 9 actions (where to move), get a reward, and then see a completely different set of numbers (one that you've probably never seen before). How do you learn the best action to take in each case?
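To get a rough sense of the gap between the two representations, here is a small back-of-the-envelope count in Python. The image size (64 by 64) and the 256 gray levels are assumptions chosen only for illustration, and the board count is a loose upper bound that ignores which positions are actually reachable in play.

```python
# Upper bound on board configurations: each of the 9 cells is empty, x, or o.
board_states = 3 ** 9

# Hypothetical image dimensions and pixel depth (assumed, not from any real setup).
width, height, gray_levels = 64, 64, 256

# Number of distinct grayscale images of that size.
image_states = gray_levels ** (width * height)

print("board configurations (upper bound):", board_states)
print("decimal digits in the number of possible images:", len(str(image_states)))
```

An agent that keeps one table entry per observation can cope with the first number, but it would almost never see the same image twice.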

To tackle this problem, researchers have worked on feature discovery, which involves methods for extracting features from the representation provided. In our example, this would be similar to trying to extract the 3 by 3 tic-tac-toe board from the grid of numbers representing the image. The RL agent can then learn on this new set of features (the 3 by 3 grid) rather than the old features (the image). My current research involves a novel method for performing feature discovery.
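As a toy illustration of what such a mapping might look like, the Python sketch below reduces a grayscale image to a 3 by 3 grid by hand. It is not a feature discovery method, and it is not the method from my research: feature discovery aims to find mappings like this automatically. The function name extract_board and the threshold are hypothetical, and the sketch assumes a square, axis-aligned image with darkness values between 0 and 1, which sidesteps the changing camera angle. It also only detects whether a cell is marked, not whether the mark is an x or an o.

```python
def extract_board(image, dark_threshold=0.5):
    """Map a square grayscale image to a 3x3 grid of 0/1 occupancy flags.

    image: list of rows of darkness values in [0, 1], where 1 is darkest.
    Assumes the side length is a multiple of 3 and the board fills the image.
    """
    cell = len(image) // 3  # side length of one board cell, in pixels
    board = []
    for r in range(3):
        row = []
        for c in range(3):
            # Collect every pixel that falls inside board cell (r, c).
            pixels = [image[y][x]
                      for y in range(r * cell, (r + 1) * cell)
                      for x in range(c * cell, (c + 1) * cell)]
            mean_darkness = sum(pixels) / len(pixels)
            row.append(1 if mean_darkness > dark_threshold else 0)
        board.append(row)
    return board


# A 6x6 blank image with a dark patch covering the top-left cell.
image = [[0.0] * 6 for _ in range(6)]
for y in range(2):
    for x in range(2):
        image[y][x] = 0.9

board = extract_board(image)
print(board)
```

The agent would then learn from the 9 values in board rather than from the 36 raw pixel values.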

Publications

Resume

Useful C++ Stuff

Logic Puzzles

Links


Created: 6-24-2009. Last Modified: 7-6-2010. © Philip S. Thomas