Separated Trust Regions Policy Optimization Method

Published on Mar 02, 2020 | 21 Views

In this work, we propose a moderate policy update method for reinforcement learning, which encourages the agent to explore more boldly in early episodes but to update the policy more cautiously. Based on
