Approximate Inference in Stochastic Processes and Dynamical Systems

An efficient approach to stochastic optimal control

author: Bert Kappen, Radboud University Nijmegen

Description

Stochastic optimal control theory is a principled approach to computing optimal actions when rewards are delayed. Its use in AI and machine learning has been limited by computational intractability. In this talk, I introduce a class of control problems in which the intractability takes the form of computing a partition sum, as in a statistical mechanical system. This opens the possibility of studying phase transitions and of applying existing approximation methods, such as belief propagation (BP) and the variational method, to optimal control theory. The talk gives a gentle introduction to control theory and illustrates these new phenomena with a number of examples.
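
To make the "partition sum" idea concrete, here is a minimal sketch, not taken from the talk itself. It assumes the standard path integral control setting: one-dimensional dynamics dx = u dt + dξ with noise variance nu, control cost (R/2) u², no state cost along the path, and a quadratic end cost phi(x) = (alpha/2) x². Under these assumptions the optimal cost-to-go is J = -lambda log psi, with lambda = nu R and psi the expectation of exp(-phi/lambda) over uncontrolled diffusion paths, so psi plays the role of a partition sum and can be estimated by sampling. All function names and parameter values are hypothetical and chosen for illustration.

```python
import numpy as np


def path_integral_estimate(x0, T, nu, R, alpha, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the cost-to-go J and the control u at state x0.

    Assumed setting (not from the lecture page): dx = u dt + dxi, noise
    variance nu, control cost (R/2) u^2, end cost (alpha/2) x_T^2, horizon T.
    """
    rng = np.random.default_rng(seed)
    lam = nu * R  # temperature-like parameter lambda = nu * R

    # With no state cost along the path, only the end point of each
    # uncontrolled diffusion matters: x_T ~ Normal(x0, nu * T).
    x_T = x0 + np.sqrt(nu * T) * rng.standard_normal(n_samples)

    # Boltzmann-like weight of each sampled path.
    weights = np.exp(-0.5 * alpha * x_T**2 / lam)

    psi = weights.mean()      # sample estimate of the partition sum
    J = -lam * np.log(psi)    # cost-to-go, J = -lambda * log(psi)

    # Control: drift toward the weighted mean end point over the horizon.
    u = (np.sum(weights * x_T) / np.sum(weights) - x0) / T
    return J, u


def closed_form(x0, T, nu, R, alpha):
    """Closed-form J and u for the same linear-quadratic special case."""
    lam = nu * R
    shrink = 1.0 + alpha * T / R
    J = 0.5 * lam * np.log(shrink) + 0.5 * alpha * x0**2 / shrink
    u = -alpha * x0 / (R + alpha * T)
    return J, u


if __name__ == "__main__":
    params = dict(x0=1.0, T=1.0, nu=1.0, R=1.0, alpha=1.0)
    print("sampled:", path_integral_estimate(**params))
    print("exact:  ", closed_form(**params))
```

In this linear-quadratic special case the Gaussian integral can be done by hand, which is why a closed form is available for comparison. For nonlinear end costs, such as the double slit example listed in the slides, no closed form is generally available and the sampling estimate is the part that carries over.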

Slides
0:00 An efficient approach to stochastic optimal control
1:59 Examples of control tasks (1)
2:29 Examples of control tasks (2)
2:37 Examples of control tasks (3)
2:42 Stochastic optimal control theory
4:18 Outline
5:02 Discrete time control (1)
6:02 Discrete time control (2)
6:52 Discrete time control (3)
7:16 Discrete time control (1)
7:21 Discrete time control (3)
7:58 Discrete time control (4)
9:13 Discrete time control (1)
9:21 Discrete time control (4)
10:24 Example: Bang-bang control (1)
11:06 Example: Bang-bang control (2)
11:33 Example: Bang-bang control (3)
12:11 Stochastic optimal control (1)
13:06 Stochastic optimal control (2)
15:31 Path integral control
17:18 Solution (1)
17:26 Path integral control
17:36 Solution (1)
20:03 Solution (2)
21:07 An example: double slit
22:12 Solution (1)
22:18 An example: double slit
23:54 The delayed choice (1)
24:02 An example: double slit
24:12 The delayed choice (1)
24:18 The delayed choice (2)
25:28 The delayed choice (1)
25:52 The delayed choice (2)
26:10 The delayed choice (1)
26:36 The delayed choice (2)
27:14 The delayed choice (3)
28:08 The diffusion process (1)
28:38 Path integral control
28:43 The diffusion process (1)
29:43 The diffusion process (2)
30:40 The path integral formulation
31:54 Gibbs sampling
31:59 Coordination of agents (1)
33:41 Coordination of agents (2)
35:58 Pseudo code
36:45 Coordination of agents (2)
36:50 Pseudo code
36:57 A simple 1d example (1)
37:47 A simple 1d example (2)
38:26 A simple 1d example (1)
38:39 A simple 1d example (2)
38:44 A simple 1d example (3)
39:27 Computation Time
39:53 A simple 1d example (1)
42:48 A simple 1d example (2)
43:09 Nonlinear Coordination
44:36 Coordination of agents (2)
44:42 A simple 1d example (3)
44:46 Nonlinear Coordination
44:59 Computation Time
45:33 Summary
48:11 - Questions
48:24 - Questions
49:00 - Questions
49:03 - Questions
49:08 - Questions
50:24 - Questions
50:27 - Questions
