
A Semi-parametric Statistical Approach to Model-free Policy Evaluation
Published on Feb 4, 2025 · 2992 views
Reinforcement learning (RL) methods based on least-squares temporal difference (LSTD) have been developed recently and have shown good practical performance. However, the quality of their estimation h