
Off-policy Model-based Learning under Unknown Factored Dynamics
Published on 2015-09-27, 1810 Views
Off-policy learning in dynamic decision problems is essential for providing strong evidence that a new policy is better than the one in use. But how can we prove superiority without testing the new po