Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State
Description
Exponential Family PSR (EFPSR) models capture stochastic dynamical systems by representing state as the parameters of an exponential family distribution over a short-term window of future observations. They are appealing from a learning perspective because they are fully observed (meaning that the expressions for maximum likelihood do not involve hidden quantities), yet they are still expressive enough to both capture existing models (such as POMDPs and linear dynamical systems) and predict new models. Learning algorithms based on maximizing the exact likelihood exist, but they are not computationally feasible. We present a new, computationally efficient learning algorithm based on an approximate likelihood function. The algorithm can be interpreted as attempting to induce stationary distributions of observations, features and states that match their empirically observed counterparts. The approximate likelihood, and the idea of matching stationary distributions, may also apply to other models.
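To make the phrase "state as the parameters of an exponential family distribution over a short-term window of future observations" concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the method from the talk: observations are assumed binary, the window length `n`, the feature map `features`, and the example `state` vector are all hypothetical choices, and neither the state update nor the approximate-likelihood learning algorithm is reproduced.

```python
import itertools
import math


def features(window):
    """Hypothetical feature map: each observation, plus products of adjacent pairs."""
    singles = list(window)
    pairs = [a * b for a, b in zip(window, window[1:])]
    return singles + pairs


def window_distribution(state, n):
    """Return p(window | state), proportional to exp(state . features(window)).

    The state vector plays the role of the natural parameters of an
    exponential family over all binary windows of length n (an assumption
    made here purely for illustration).
    """
    windows = list(itertools.product([0, 1], repeat=n))
    scores = [sum(s * f for s, f in zip(state, features(w))) for w in windows]
    log_z = math.log(sum(math.exp(v) for v in scores))  # log partition function
    return {w: math.exp(v - log_z) for w, v in zip(windows, scores)}


def expected_features(dist):
    """Expected feature vector under a distribution over windows."""
    dim = len(features(next(iter(dist))))
    out = [0.0] * dim
    for w, p in dist.items():
        for i, f in enumerate(features(w)):
            out[i] += p * f
    return out


n = 3
state = [0.5, -0.2, 0.1, 1.0, -1.0]  # 3 single-observation weights + 2 pair weights
dist = window_distribution(state, n)
uniform = window_distribution([0.0] * 5, n)  # all-zero state gives a uniform distribution
```

In any exponential family, the gradient of the log-likelihood with respect to the natural parameters is the observed feature vector minus the expected feature vector, so `expected_features` is the quantity a likelihood-based learner would compare against data. How the talk's approximate likelihood uses such expectations is not shown here.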
Slides
| Time | Slide |
| 0:00 | Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State |
| 0:20 | Outline |
| 0:43 | The Exponential Family PSR |
| 0:45 | Modeling Dynamical Systems (1) |
| 1:30 | Modeling Dynamical Systems (2) |
| 2:06 | Examples of PSRs |
| 3:26 | Distribution of Short-Term Future |
| 4:14 | Maintaining State |
| 6:08 | Learning an EFPSR |
| 7:03 | The Linear-Linear Exponential Family PSR |
| 7:09 | Linear-Linear EFPSR |
| 8:22 | Exact Likelihood for ML Learning |
| 9:25 | Results of Exact ML on POMDPs |
| 10:35 | A Tractable Learning Algorithm |
| 10:40 | Why is Exact ML Intractable? |
| 12:07 | Approximate Likelihood for ML Learning |
| 13:36 | Interpretation of Approximate Likelihood (1) |
| 14:34 | Interpretation of Approximate Likelihood (2) |
| 14:59 | Interpretation of Approximate Likelihood (3) |
| 15:17 | Interpretation of Approximate Likelihood (4) |
| 16:13 | Interpretation of Approximate Likelihood (5) |
| 16:32 | Experiments |
| 16:34 | Evaluating with RL |
| 18:28 | Example: Bouncing Ball Problem (1) |
| 19:48 | Example: Bouncing Ball Problem (2) |
| 20:36 | Example: Bouncing Ball Problem (3) |
| 20:46 | Bouncing Ball Results |
| 21:16 | The Robot Domain |
| 22:14 | Robot Domain Results |
| 23:02 | Conclusions |