Optimal Online Learning Procedures for Model-Free Policy Evaluation

Published on Oct 20, 2009 · 2406 Views

In this study, we extend the framework of semiparametric statistical inference, recently introduced to reinforcement learning (Ueno et al., 2008), to online learning procedures for policy evaluation. T