On a Connection between Importance Sampling and the Likelihood Ratio Policy Gradient
Published on Mar 25, 2011 (3583 views)
Likelihood ratio policy gradient methods have been some of the most successful reinforcement learning algorithms, especially for learning on physical systems. We describe how the likelihood ratio poli