Sparse Reinforcement Learning in High Dimensions
Published on Apr 25, 2012
3701 Views
Chapter list
Sparse Reinforcement Learning in High Dimensions 00:00
Project’s Description (1) 00:06
Project’s Description (2) 00:21
Publications (1) 01:17
Publications (2) 01:29
Sequential Decision-Making under Uncertainty 01:47
Reinforcement Learning (RL) 02:43
Markov Decision Process 03:50
Value Function 05:25
Optimal Value Function and Optimal Policy 07:11
Properties of Bellman Operators 07:55
Dynamic Programming Algorithms (1) 08:24
Dynamic Programming Algorithms (2) 09:08
Approximate Dynamic Programming (ADP) 10:20
ADP Algorithms (1) 11:12
ADP Algorithms (2) 11:39
Curse of Dimensionality 12:09
Motivation for Studying RL in High Dimensions 12:52
Value Function Approximation (VFA) 14:48
Feature Selection (1) 15:42
Feature Selection in Value Function Approximation 18:45
Feature Selection (2) 19:37
Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010) 19:52
Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Summary 20:06
Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Feature Selection 21:54
Using random projections in RL and regression 24:57
Compressed Least-Squares Regression (NIPS 2009) 25:11
Compressed Least-Squares Regression: Summary 25:17
LSTD with Random Projections (NIPS 2010) 29:31
LSTD with Random Projections: Problem 29:44
LSTD with Random Projections: Results (1) 30:20
LSTD with Random Projections: Results (2) 31:24
Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit (AISTATS 2012) 33:04
Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit: Summary 33:13
Using sparsity in value function approximation 34:10
Finite-Sample Analysis of Lasso-TD (ICML 2011) 34:27
Finite-Sample Analysis of Lasso-TD: Summary (1) 34:33
Finite-Sample Analysis of Lasso-TD: Summary (2) 38:45
Project’s Achievements 40:05
Towards Adaptive RL Algorithms (1) 42:20
Towards Adaptive RL Algorithms (2) 43:59
Future Work 46:28
Thank You! 48:05