An efficient approach to stochastic optimal control
author: Bert Kappen, Radboud University Nijmegen
Description
Stochastic optimal control theory is a principled approach to computing optimal actions with delayed rewards. Its use in AI and machine learning has been limited by its computational intractability. In this talk, I introduce a class of control problems in which the intractability appears as the computation of a partition sum, as in a statistical mechanical system. This opens the possibility of studying phase transitions and of applying existing approximation methods, such as belief propagation (BP) and the variational method, to optimal control theory. The talk gives a gentle introduction to control theory and illustrates these new phenomena with a number of examples.
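To make the "partition sum" idea concrete, here is a minimal sketch, not taken from the talk itself. It assumes the simplest path integral control setting: one-dimensional dynamics dx = u dt + dξ with noise variance ν, control cost (R/2)u², no state cost, and a quadratic end cost φ(x) = (α/2)x². Under these assumptions the optimal cost-to-go can be written as J = -λ log ψ with λ = νR, where ψ = E[exp(-φ(x_T)/λ)] is an expectation over uncontrolled noisy trajectories and plays the role of the partition sum. The function names and parameter values are illustrative choices.

```python
import numpy as np


def cost_to_go_mc(x0, T, nu, R, alpha, n_samples=200_000, seed=0):
    """Monte Carlo estimate of J = -lam * log E[exp(-phi(x_T) / lam)].

    Assumes uncontrolled dynamics dx = dxi (pure diffusion), so the end
    state is Gaussian: x_T ~ N(x0, nu * T). The end cost is the
    illustrative choice phi(x) = 0.5 * alpha * x**2.
    """
    lam = nu * R  # "temperature" linking noise level and control cost
    rng = np.random.default_rng(seed)
    # Sample end states of the uncontrolled diffusion.
    x_T = x0 + np.sqrt(nu * T) * rng.standard_normal(n_samples)
    # Partition sum: average Boltzmann-like weight of each sampled path.
    psi = np.mean(np.exp(-0.5 * alpha * x_T**2 / lam))
    return -lam * np.log(psi)


def cost_to_go_exact(x0, T, nu, R, alpha):
    """Closed-form J for the same problem (Gaussian integral)."""
    lam = nu * R
    k = 1.0 + alpha * T / R
    return 0.5 * lam * np.log(k) + 0.5 * alpha * x0**2 / k


if __name__ == "__main__":
    args = dict(x0=1.0, T=1.0, nu=1.0, R=1.0, alpha=1.0)
    print("Monte Carlo:", cost_to_go_mc(**args))
    print("Closed form:", cost_to_go_exact(**args))
```

In this toy case the expectation has a closed form, so sampling is unnecessary. The sketch is only meant to show why sampling and other approximate inference methods become relevant once the control problem is expressed as a partition sum.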
| Slides | |
| 0:00 | An efficient approach to stochastic optimal control |
| 1:59 | Examples of control tasks (1) |
| 2:29 | Examples of control tasks (2) |
| 2:37 | Examples of control tasks (3) |
| 2:42 | Stochastic optimal control theory |
| 4:18 | Outline |
| 5:02 | Discrete time control (1) |
| 6:02 | Discrete time control (2) |
| 6:52 | Discrete time control (3) |
| 7:16 | Discrete time control (1) |
| 7:21 | Discrete time control (3) |
| 7:58 | Discrete time control (4) |
| 9:13 | Discrete time control (1) |
| 9:21 | Discrete time control (4) |
| 10:24 | Example: Bang-bang control (1) |
| 11:06 | Example: Bang-bang control (2) |
| 11:33 | Example: Bang-bang control (3) |
| 12:11 | Stochastic optimal control (1) |
| 13:06 | Stochastic optimal control (2) |
| 15:31 | Path integral control |
| 17:18 | Solution (1) |
| 17:26 | Path integral control |
| 17:36 | Solution (1) |
| 20:03 | Solution (2) |
| 21:07 | An example: double slit |
| 22:12 | Solution (1) |
| 22:18 | An example: double slit |
| 23:54 | The delayed choice (1) |
| 24:02 | An example: double slit |
| 24:12 | The delayed choice (1) |
| 24:18 | The delayed choice (2) |
| 25:28 | The delayed choice (1) |
| 25:52 | The delayed choice (2) |
| 26:10 | The delayed choice (1) |
| 26:36 | The delayed choice (2) |
| 27:14 | The delayed choice (3) |
| 28:08 | The diffusion process (1) |
| 28:38 | Path integral control |
| 28:43 | The diffusion process (1) |
| 29:43 | The diffusion process (2) |
| 30:40 | The path integral formulation |
| 31:54 | Gibbs sampling |
| 31:59 | Coordination of agents (1) |
| 33:41 | Coordination of agents (2) |
| 35:58 | Pseudo code |
| 36:45 | Coordination of agents (2) |
| 36:50 | Pseudo code |
| 36:57 | A simple 1d example (1) |
| 37:47 | A simple 1d example (2) |
| 38:26 | A simple 1d example (1) |
| 38:39 | A simple 1d example (2) |
| 38:44 | A simple 1d example (3) |
| 39:27 | Computation Time |
| 39:53 | A simple 1d example (1) |
| 42:48 | A simple 1d example (2) |
| 43:09 | Nonlinear Coordination |
| 44:36 | Coordination of agents (2) |
| 44:42 | A simple 1d example (3) |
| 44:46 | Nonlinear Coordination |
| 44:59 | Computation Time |
| 45:33 | Summary |
| 48:11 | - Questions |
| 48:24 | - Questions |
| 49:00 | - Questions |
| 49:03 | - Questions |
| 49:08 | - Questions |
| 50:24 | - Questions |
| 50:27 | - Questions |