
Optimal Online Learning Procedures for Model-Free Policy Evaluation
Published on 2009-10-20 | 2413 Views
In this study, we extend the framework of semiparametric statistical inference, recently introduced to reinforcement learning (Ueno et al., 2008), to online learning procedures for policy evaluation. T