Sparse Reinforcement Learning in High Dimensions

Published on 2012-04-25
3706 Views

Mohammad Ghavamzadeh

Workshops 2012 - Cumberland Lodge

Presentation

Sparse Reinforcement Learning in High Dimensions 00:00

Project’s Description (1) 00:06

Project’s Description (2) 00:21

Publications (1) 01:17

Publications (2) 01:29

Sequential Decision-Making under Uncertainty 01:47

Reinforcement Learning (RL) 02:43

Markov Decision Process 03:50

Value Function 05:25

Optimal Value Function and Optimal Policy 07:11

Properties of Bellman Operators 07:55

Dynamic Programming Algorithms (1) 08:24

Dynamic Programming Algorithms (2) 09:08

Approximate Dynamic Programming (ADP) 10:20

ADP Algorithms (1) 11:12

ADP Algorithms (2) 11:39

Curse of Dimensionality 12:09

Motivation for Studying RL in High Dimensions 12:52

Value Function Approximation (VFA) 14:48

Feature Selection (1) 15:42

Feature Selection in Value Function Approximation 18:45

Feature Selection (2) 19:37

Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010) 19:52

Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Summary 20:06

Adaptive Bases for Reinforcement Learning (ECML 2010) / Adaptive Bases for Q-learning (CDC 2010): Feature Selection 21:54

Using Random Projections in RL and Regression 24:57

Compressed Least-Squares Regression (NIPS 2009) 25:11

Compressed Least-Squares Regression: Summary 25:17

LSTD with Random Projections (NIPS 2010) 29:31

LSTD with Random Projections: Problem 29:44

LSTD with Random Projections: Results (1) 30:20

LSTD with Random Projections: Results (2) 31:24

Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit (AISTATS 2012) 33:04

Bandit Theory Meets Compressed Sensing for High-Dimensional Stochastic Linear Bandit: Summary 33:13

Using Sparsity in Value Function Approximation 34:10

Finite-Sample Analysis of Lasso-TD (ICML 2011) 34:27

Finite-Sample Analysis of Lasso-TD: Summary (1) 34:33

Finite-Sample Analysis of Lasso-TD: Summary (2) 38:45

Project’s Achievements 40:05

Towards Adaptive RL Algorithms (1) 42:20

Towards Adaptive RL Algorithms (2) 43:59

Future Work 46:28

Thank You! 48:05