Multigrid-based Algorithms for Temporal Difference Learning

Prof. Nahum Shimkin

February 16, 2005 at 10:00 AM
Zames Seminar Room - MC437

We introduce a new class of multigrid-based temporal-difference learning algorithms for speeding up policy evaluation, within the context of Reinforcement Learning for discounted-cost Markov Decision Processes with linear cost function approximation. We adapt the well-established multigrid framework to the learning problem, propose two classes of algorithms in which TD(lambda) is applied at various resolution scales, and analyze the convergence of these schemes. We further discuss the utility of algebraic multigrid methods for the automatic construction of basis function hierarchies. Finally, some initial experimental results will be presented.
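
For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of standard TD(lambda) policy evaluation with a linear cost-function approximation, run once with a fine basis and once with a coarse one. It is not the speaker's multigrid algorithm, which the abstract does not specify. The toy model (a random walk on a ring of 8 states with unit cost per step), the piecewise-constant basis, and all names and parameter values (make_features, step, td_lambda, gamma, lam, alpha) are assumptions made for illustration.

```python
import random


def make_features(n_states, block):
    """Hypothetical piecewise-constant basis: one indicator per group of
    `block` neighbouring states (block=1 gives the finest resolution).
    Assumes n_states is divisible by block."""
    n_feat = n_states // block

    def phi(s):
        v = [0.0] * n_feat
        v[s // block] = 1.0
        return v

    return phi


def step(s, n_states, rng):
    """Toy model (an assumption): random walk on a ring, unit cost per step."""
    move = 1 if rng.random() < 0.5 else -1
    return 1.0, (s + move) % n_states


def td_lambda(phi, n_states, n_steps=50000, gamma=0.9, lam=0.5,
              alpha=0.02, seed=0):
    """TD(lambda) with accumulating eligibility traces and linear features."""
    rng = random.Random(seed)
    s = 0
    w = [0.0] * len(phi(s))   # weights of the linear approximation
    z = [0.0] * len(w)        # eligibility trace
    for _ in range(n_steps):
        cost, s_next = step(s, n_states, rng)
        f, f_next = phi(s), phi(s_next)
        v = sum(wi * fi for wi, fi in zip(w, f))
        v_next = sum(wi * fi for wi, fi in zip(w, f_next))
        delta = cost + gamma * v_next - v            # temporal-difference error
        z = [gamma * lam * zi + fi for zi, fi in zip(z, f)]
        w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
        s = s_next
    return w


N = 8
w_fine = td_lambda(make_features(N, 1), N)    # one weight per state
w_coarse = td_lambda(make_features(N, 2), N)  # one weight per pair of states
```

With unit cost per step the exact discounted cost is 1 / (1 - gamma) = 10 in every state, so both resolutions can represent it and the learned weights can be checked against that value. The coarse basis has half as many weights to learn, which is the kind of trade-off between resolution scales that the talk addresses.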