Direct Policy Ranking with Robot Data Streams

Published on 2011-11-30 · 3473 Views

Many machine learning approaches in robotics, based on reinforcement learning, inverse optimal control or direct policy learning, critically rely on robot simulators. This paper investigates a simulat

Presentation

Preference-based Policy Learning 00:00
Setting 00:28
Motivations 01:33
State of art - 1 02:08
Issues in RL 03:02
State of art - 2 03:49
Issues in IRL 04:20
Outline - 1 05:15
Policy Return Estimate - 1 05:37
Policy Return Estimate - 2 06:21
Behavioral representation 07:23
Exploration/Exploitation 08:27
Self-training 09:24
Preference-based Policy Learning PPL Algorithm 10:38
Outline - 2 11:16
Experimental goal and setting 11:20
The maze problem 13:05
Synchronized exploration 14:07
Outline - 3 14:56
Preference Policy Learning 14:59
Future work 15:44
References 16:52