Optimal Online Learning Procedures for Model-Free Policy Evaluation

Published on 2009-10-20 | 2413 Views

In this study, we extend the framework of semiparametric statistical inference, recently introduced to reinforcement learning (Ueno et al., 2008), to online learning procedures for policy evaluation. T