Optimal Online Learning Procedures for Model-Free Policy Evaluation
Published on Oct 20, 2009
In this study, we extend the framework of semiparametric statistical inference, recently introduced to reinforcement learning (Ueno et al., 2008), to online learning procedures for policy evaluation. T