Menu

Model-Free Reinforcement Learning as Mixture Learning

calendar icon Aug 26, 2009 3703 views
video thumbnail
Pause
Mute
speed icon
speed icon
0.25
0.5
0.75
1
1.25
1.5
1.75
2

We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the in nite and nite horizon cases. We describe a Stochastic Approximation EM algorithm for likelihood maximization that, in the tabular case, is equivalent to a non-bootstrapping optimistic policy iteration algorithm like Sarsa(1) that can be applied both in MDPs and POMDPs. On the theoretical side, by relating the proposed stochastic EM algorithm to the family of optimistic policy iteration algorithms, we provide new tools that permit the design and analysis of algorithms in that family. On the practical side, preliminary experiments on a POMDP problem demonstrated encouraging results.

RELATED CATEGORIES

MORE VIDEOS FROM THE EVENT

MORE VIDEOS FROM THE SAME CATEGORIES

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.