Off-policy Model-based Learning under Unknown Factored Dynamics

Published on Sep 27, 2015 (1747 views)

Off-policy learning in dynamic decision problems is essential for providing strong evidence that a new policy is better than the one in use. But how can we prove superiority without testing the new po
