Reinforcement Learning

Non-Parametric Policy Gradients: A Unified Treatment of Propositional and Relational Domains

author: Kurt Driessens, Catholic University of Leuven

Description

Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains only. Without extensive feature engineering, it is difficult, if not impossible, to apply them within structured domains, in which, for example, the number of objects and the relations among them vary. In this paper, we describe a non-parametric policy gradient approach, called NPPG, that overcomes this limitation. The key idea is to apply Friedman's gradient boosting: policies are represented as a weighted sum of regression models grown in a stage-wise optimization. Employing off-the-shelf regression learners, NPPG can deal with propositional, continuous, and relational domains in a unified way. Our experimental results show that it can even improve on established results.
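The boosting idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a toy one-step problem with a scalar state and two actions, a known action value (the hypothetical `q_value`), a Gibbs policy over a potential function, and regression stumps as the off-the-shelf learner. All names (`BoostedPolicy`, `fit_stump`, `boost`) and the step size are illustrative choices. At each stage a stump is fitted per action to the pointwise gradient pi(a|s) * (Q(s,a) - V(s)) and added to the potential as one more weighted term.

```python
import math

ACTIONS = (0, 1)

def q_value(x, a):
    """Toy one-step return (an assumption): action 1 pays off when x > 0.5, action 0 otherwise."""
    return 1.0 if (x > 0.5) == (a == 1) else 0.0

def fit_stump(xs, ys):
    """Least-squares regression stump on a scalar input: (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right) if right else lm
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]

def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

class BoostedPolicy:
    """Gibbs policy whose potential is a weighted sum of regression stumps."""

    def __init__(self):
        self.models = {a: [] for a in ACTIONS}  # per action: list of (step, stump)

    def potential(self, x, a):
        return sum(step * stump_predict(s, x) for step, s in self.models[a])

    def probs(self, x):
        pots = [self.potential(x, a) for a in ACTIONS]
        m = max(pots)  # subtract the maximum for numerical stability
        exps = [math.exp(p - m) for p in pots]
        z = sum(exps)
        return [e / z for e in exps]

def boost(policy, states, n_stages=20, step=2.0):
    """Stage-wise growth: fit one stump per action to the pointwise gradient, then add it."""
    for _ in range(n_stages):
        targets = {a: [] for a in ACTIONS}
        for x in states:
            p = policy.probs(x)
            v = sum(p[a] * q_value(x, a) for a in ACTIONS)  # state value under the policy
            for a in ACTIONS:
                # Gradient of expected return w.r.t. the potential at (x, a)
                targets[a].append(p[a] * (q_value(x, a) - v))
        for a in ACTIONS:
            policy.models[a].append((step, fit_stump(states, targets[a])))
    return policy

states = [i / 20 for i in range(21)]
policy = boost(BoostedPolicy(), states)
```

After boosting, `policy.probs(0.2)` should put most of its mass on action 0 and `policy.probs(0.8)` on action 1. Replacing `fit_stump` with a relational regression learner would, under the same assumptions, leave the outer loop unchanged, which is the sense in which the representation is independent of the domain type.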

Slides
0:00 Non-Parametric Policy Gradient
0:36 Take Away Message
1:28 Overview
2:04 Reinforcement Learning
2:56 World Value
3:48 (steady) State Distribution
4:24 Value Functions
5:00 Direct Policy Learning
6:08 Policy Gradients with Function Approximation
7:15 Non-Parametric Policy Gradient
8:42 Functional Gradient Boosting
9:49 Functional Gradient Boosting (2)
11:35 In Practice
12:08 Local Evaluation
12:59 Gradient Tree Boosting
14:09 Some Results
20:21 Future Work
21:15 Summary
21:58 The End!
22:35 Questions
