#### Informal Systems Seminar (ISS)

### Exploiting Stochastic Factorization for Efficiently Solving Markov Decision Processes

Doina Precup, Associate Professor

School of Computer Science, McGill University, Centre for Intelligent Machines (CIM) and Groupe d'Études et de Recherche en Analyse des Décisions (GERAD)

October 9, 2015 at 11:30 AM

Zames MC437

When a transition probability matrix is represented as the product
of two stochastic matrices, one can swap the factors of the
multiplication to obtain another transition matrix that retains some
fundamental characteristics of the original. Since the derived
matrix can be much smaller than its precursor, this property can be
exploited in the context of solving Markov decision processes
(MDPs). I will describe how we can use this property in order to
provide approximate solutions for MDPs much faster than by using
classical methods. For example, an approximate policy iteration
algorithm based on stochastic factorization has linear dependence on
the number of states in the model. I will also briefly discuss a
learning algorithm based on this idea, and its relationship to other
types of matrix factorization, which we are beginning to uncover.
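The swap property described above can be illustrated with a small numerical sketch. The matrices below are made-up examples (not from the talk): a 4-state transition matrix is factored as P = D K, where D (4×2) and K (2×4) are both row-stochastic, and swapping the factors yields a 2×2 transition matrix K D. The smaller matrix retains fundamental characteristics of the original; for instance, the traces of D K and K D (and of all their matching powers) coincide, which is one way of seeing that the nonzero eigenvalues are shared.

```python
import numpy as np

# Hypothetical factors (illustrative values, not from the talk):
# D is 4x2 and K is 2x4, both row-stochastic.
D = np.array([[1.0, 0.0],
              [0.8, 0.2],
              [0.3, 0.7],
              [0.0, 1.0]])
K = np.array([[0.5, 0.3, 0.2, 0.0],
              [0.0, 0.1, 0.4, 0.5]])

P = D @ K        # 4x4 transition matrix over the original states
P_small = K @ D  # 2x2 transition matrix obtained by swapping the factors

# Both products are row-stochastic transition matrices.
assert np.allclose(P.sum(axis=1), 1.0)
assert np.allclose(P_small.sum(axis=1), 1.0)

# The swapped matrix retains spectral structure of the original:
# tr((DK)^t) = tr((KD)^t) for all t, so the nonzero eigenvalues agree.
assert np.allclose(np.trace(P), np.trace(P_small))
assert np.allclose(np.trace(P @ P), np.trace(P_small @ P_small))
```

Since P_small is 2×2 while P is 4×4, dynamic-programming computations on the swapped matrix are correspondingly cheaper, which is the effect exploited at scale in the algorithms discussed in the talk.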

This is joint work with Andre M. S. Barreto and Joelle Pineau.