Sparse Reinforcement Learning in High Dimensions

author: Mohammad Ghavamzadeh, INRIA Lille - Nord Europe
published: April 25, 2012, recorded: March 2012, views: 133

Slides

0:00 Sparse Reinforcement Learning in High Dimensions
0:06 Project’s Description (1)
0:21 Project’s Description (2)
1:17 Publications (1)
1:29 Publications (2)
1:47 Sequential Decision-Making under Uncertainty
2:43 Reinforcement Learning (RL)
3:50 Markov Decision Process
5:25 Value Function
7:11 Optimal Value Function and Optimal Policy
7:45 Value Function
7:50 Optimal Value Function and Optimal Policy
7:55 Properties of Bellman Operators
8:24 Dynamic Programming Algorithms (1)
9:08 Dynamic Programming Algorithms (2)
10:20 Approximate Dynamic Programming (ADP)
11:12 ADP Algorithms (1)
11:39 ADP Algorithms (2)
12:09 Curse of Dimensionality
12:52 Motivation for Studying RL in High Dimensions
14:48 Value Function Approximation (VFA)
15:42 Feature Selection (1)
18:45 Feature Selection in Value Function Approximation
19:37 Feature Selection (2)
19:52 Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010)
20:06 Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Summary
21:54 Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Feature Selection
24:57 Using random projections in RL and regression
25:11 Compressed Least-Squares Regression (NIPS 2009)
25:17 Compressed Least-Squares Regression: Summary
29:31 LSTD with Random Projections (NIPS 2010)
29:44 LSTD with Random Projections: Problem
30:20 LSTD with Random Projections: Results (1)
31:24 LSTD with Random Projections: Results (2)
33:04 Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit (AISTATS 2012)
33:13 Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit: Summary
34:10 Using sparsity in value function approximation
34:27 Finite-Sample Analysis of Lasso-TD (ICML 2011)
34:33 Finite-Sample Analysis of Lasso-TD: Summary (1)
38:45 Finite-Sample Analysis of Lasso-TD: Summary (2)
40:05 Project’s Achievements
42:20 Towards Adaptive RL Algorithms (1)
43:59 Towards Adaptive RL Algorithms (2)
46:28 Future Work
48:05 Thank You!
