Exploration Scavenging

Published on 2008-08-06 · 3051 Views

Alexander Strehl

We examine the problem of evaluating a policy in the contextual bandit setting using only observations collected during the execution of another policy. We show that policy evaluation can be impossibl

Reinforcement Learning
