High Confidence Policy Improvement

Published on 2015-12-05 · 1574 Views

Philip S. Thomas

We present a batch reinforcement learning (RL) algorithm that provides probabilistic guarantees about the quality of each policy that it proposes, and which has no hyper-parameter that requires expert

ICML 2015 - Lille
