Agnostic KWIK learning and efficient approximate reinforcement learning

author: Csaba Szepesvári, Department of Computing Science, University of Alberta
published: Aug. 2, 2011,   recorded: July 2011,   views: 4002


Related Open Educational Resources

Related content

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Lecture popularity: You need to login to cast your vote.


A popular approach in reinforcement learning is to use a model-based algorithm, i.e., an algorithm that utilizes a model learner to learn an approximate model to the environment. It has been shown such a model-based learner is efficient if the model learner is efficient in the so-called \knows what it knows" (KWIK) framework. A major limitation of the standard KWIK framework is that, by its very definition, it covers only the case when the (model) learner can represent the actual environment with no errors. In this paper, we introduce the agnostic KWIK learning model, where we relax this assumption by allowing nonzero approximation errors. We show that with the new de finition that an efficient model learner still leads to an effcient reinforcement learning algorithm. At the same time, though, we find that learning within the new framework can be substantially slower as compared to the standard framework, even in the case of simple learning problems.

See Also:

Download slides icon Download slides: colt2011_szepesvari_agnostic_01.pdf (1.1 MB)

Help icon Streaming Video Help

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

make sure you have javascript enabled or clear this field: