Reinforcement Learning

Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State

author: Satinder Singh, Public Policy and Sociology, University of Michigan

Description

Exponential Family PSR (EFPSR) models capture stochastic dynamical systems by representing state as the parameters of an exponential family distribution over a short-term window of future observations. They are appealing from a learning perspective because they are fully observed (meaning expressions for maximum likelihood do not involve hidden quantities), yet are still expressive enough both to capture existing models (such as POMDPs and linear dynamical systems) and to predict new models. While learning algorithms based on maximizing the exact likelihood exist, they are not computationally feasible. We present a new, computationally efficient learning algorithm based on an approximate likelihood function. The algorithm can be interpreted as attempting to induce stationary distributions of observations, features, and states that match their empirically observed counterparts. The approximate likelihood, and the idea of matching stationary distributions, may have applications in other models.
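
To make the "fully observed" point concrete, the following is a minimal sketch of a single exponential family distribution over a short window of binary observations. It is a toy illustration, not the algorithm presented in the lecture. The feature map `features`, the window length, and all function names are assumptions chosen for the example. The sketch shows that, for a fixed parameter vector, the gradient of the average log-likelihood is the empirical mean of the features minus their expected value under the model, so no hidden quantities appear. An EFPSR additionally updates the parameter vector (the state) as observations arrive, which this sketch does not cover.

```python
import itertools
import math


def features(window):
    """Hypothetical feature map: each bit, plus products of adjacent bits."""
    singles = list(window)
    pairs = [window[i] * window[i + 1] for i in range(len(window) - 1)]
    return [float(v) for v in singles + pairs]


def log_partition(state, n):
    """Log normalizer, computed by enumerating all 2^n binary windows."""
    scores = [
        sum(s * f for s, f in zip(state, features(w)))
        for w in itertools.product((0, 1), repeat=n)
    ]
    m = max(scores)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(x - m) for x in scores))


def log_prob(state, window):
    """log p(window | state) = state . features(window) - log Z(state)."""
    score = sum(s * f for s, f in zip(state, features(window)))
    return score - log_partition(state, len(window))


def expected_features(state, n):
    """Model expectation of the features under the parameters `state`."""
    total = [0.0] * len(state)
    for w in itertools.product((0, 1), repeat=n):
        p = math.exp(log_prob(state, w))
        for i, f in enumerate(features(w)):
            total[i] += p * f
    return total


def log_likelihood_gradient(state, windows):
    """Gradient of the average log-likelihood: empirical minus expected features."""
    n = len(windows[0])
    columns = zip(*(features(w) for w in windows))
    empirical = [sum(col) / len(windows) for col in columns]
    expected = expected_features(state, n)
    return [e - m for e, m in zip(empirical, expected)]
```

With windows of length 3 this feature map has five entries, so an all-zero parameter vector of length 5 gives the uniform distribution over the eight possible windows. Brute-force enumeration is only feasible for very short windows. This is the same moment-matching structure the abstract alludes to when it speaks of matching empirically observed counterparts.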

Slides
0:00 Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State
0:20 Outline
0:43 The Exponential Family PSR
0:45 Modeling Dynamical Systems (1)
1:30 Modeling Dynamical Systems (2)
2:06 Examples of PSRs
3:26 Distribution of Short-Term Future
4:14 Maintaining State
6:08 Learning an EFPSR
7:03 The Linear-Linear Exponential Family PSR
7:09 Linear-Linear EFPSR
8:22 Exact Likelihood for ML Learning
9:25 Results of Exact ML on POMDPs
10:35 A Tractable Learning Algorithm
10:40 Why is Exact ML Intractable?
12:07 Approximate Likelihood for ML Learning
13:36 Interpretation of Approximate Likelihood (1)
14:34 Interpretation of Approximate Likelihood (2)
14:59 Interpretation of Approximate Likelihood (3)
15:17 Interpretation of Approximate Likelihood (4)
16:13 Interpretation of Approximate Likelihood (5)
16:32 Experiments
16:34 Evaluating with RL
18:28 Example: Bouncing Ball Problem (1)
19:48 Example: Bouncing Ball Problem (2)
20:36 Example: Bouncing Ball Problem (3)
20:46 Bouncing Ball Results
21:16 The Robot Domain
22:14 Robot Domain Results
23:02 Conclusions
