
Informal Systems Seminar (ISS)

Exploiting Stochastic Factorization for Efficiently Solving Markov Decision Processes


Doina Precup, Associate Professor
School of Computer Science, McGill University, Centre for Intelligent Machines (CIM), and Groupe d'Etudes et de Recherche en Analyse des Decisions (GERAD)

October 9, 2015 at 11:30 AM
Zames MC437

When a transition probability matrix is represented as the product of two stochastic matrices, one can swap the factors of the multiplication to obtain another transition matrix that retains some fundamental characteristics of the original. Since the derived matrix can be much smaller than its precursor, this property can be exploited when solving Markov decision processes (MDPs). I will describe how this property can be used to compute approximate solutions for MDPs much faster than classical methods allow. For example, an approximate policy iteration algorithm based on stochastic factorization scales linearly with the number of states in the model. I will also briefly discuss a learning algorithm based on this trick, as well as the trick's relationship to other types of matrix factorization, which we are beginning to uncover.
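
As a rough illustration of the factor swap described above, here is a minimal Python sketch. It is not the algorithm from the talk. The names D, K, n, and m are assumptions introduced for this example: D is an n x m stochastic matrix, K is an m x n stochastic matrix, and both are generated at random. The sketch checks that the product D K is an n x n transition matrix, that the swapped product K D is a smaller m x m transition matrix, and that the two share the same trace, which follows from the standard linear-algebra fact that D K and K D have the same nonzero eigenvalues.

```python
import random


def random_stochastic(rows, cols, rng):
    """Return a rows x cols matrix with nonnegative rows that each sum to 1."""
    matrix = []
    for _ in range(rows):
        weights = [rng.random() + 1e-9 for _ in range(cols)]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def is_stochastic(matrix, tol=1e-9):
    """True if every row is nonnegative and sums to 1 (within tol)."""
    return all(min(row) >= 0 and abs(sum(row) - 1.0) < tol for row in matrix)


def trace(matrix):
    """Sum of the diagonal entries of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# Hypothetical sizes: n states in the original model, m << n in the reduced one.
rng = random.Random(0)
n, m = 6, 2

D = random_stochastic(n, m, rng)  # n x m factor (assumed, randomly generated)
K = random_stochastic(m, n, rng)  # m x n factor (assumed, randomly generated)

P = matmul(D, K)          # original n x n transition matrix
P_swapped = matmul(K, D)  # swapped m x m transition matrix

print(len(P), "x", len(P[0]), "stochastic:", is_stochastic(P))
print(len(P_swapped), "x", len(P_swapped[0]), "stochastic:", is_stochastic(P_swapped))
print("traces match:", abs(trace(P) - trace(P_swapped)) < 1e-9)
```

In this sketch the matrix P is built from the factors, so the factorization is exact by construction. Finding good factors for a given transition matrix is a separate problem that the sketch does not address.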

This is joint work with Andre M.S. Barreto and Joelle Pineau.