Reinforcement learning: Tutorial + Rethinking State, Action & Reward

author: Satinder Singh, Electrical Engineering and Computer Science Department, University of Michigan
published: June 15, 2010, recorded: May 2010, views: 1626

Slides
0:00 Reinforcement Learning
1:20 People
1:38 Outline
2:01 RL is Learning from Interaction
2:30 RL Abstractly...
2:42 RL and Machine Learning
3:31 (Partial) List of Applications
3:33 Markov Decision Process (MDP)
4:29 Bellman Optimality Equations
5:58 Planning (Policy Evaluation)
6:49 Planning (Optimal Control)
7:07 Convergence of Value Iteration
8:04 Learning in MDPs
8:46 Indirect Methods for Learning in MDPs
9:22 Direct Method: Q-Learning
9:59 Q-Learning Convergence w.p.1
10:10 So far...
11:01 Exploration‐Exploitation
13:11 So far…
13:39 General Idea
13:56 Gradient Descent
14:08 General Idea
14:19 Gradient Descent
14:33 Sparse Coarse Coding
14:38 FAs & RL
15:16 Sampling Trees Approach
15:47 Sparse Sampling
18:22 So far…
21:41 States?
22:16 Approaches
24:41 In this part…
24:59 POMDPs…
26:12 Predictions / futures
27:00 System Dynamics Vector
28:07 System Dynamics Matrix - 1
29:13 System Dynamics Matrix - 2
30:17 System Dynamics Matrix - 3
31:44 nth‐order Markov Models
32:50 K‐history Markov models…
32:55 POMDPs… - 1
33:55 POMDPs… - 2
35:22 POMDPs… - 3
36:52 POMDPs… - 4
37:00 PSRs - 1
38:25 PSRs - 2
38:43 Updating Linear PSRs
40:31 Linear PSRs
42:43 Actions…
44:33 Actions?
46:00 Options
48:02 Rooms Example
48:28 Options define a Semi‐Markov Decision Process (SMDP)
49:33 What does the SMDP connection give us?
50:32 Models of Options
51:45 Example: Synchronous Value Iteration Generalized to Options
54:11 Landmarks Task
54:51 Termination Improvement for Landmarks Task
55:33 Intra‐Option Learning Methods for Markov Options - 2
56:47 Rewards…
56:58 Rewards?
59:04 Power and generality of RL
60:11 Preferences‐Parameters Confound

Video parts
Part 1 (1:02:20)
Part 2 (22:19)
