
High Confidence Policy Improvement
Published on 2015-12-05
We present a batch reinforcement learning (RL) algorithm that provides probabilistic guarantees about the quality of each policy that it proposes, and which has no hyper-parameter that requires expert