Direct Policy Ranking with Robot Data Streams
Published on Nov 30, 2011
3468 Views
Many machine learning approaches in robotics, based on reinforcement learning, inverse optimal control or direct policy learning, critically rely on robot simulators. This paper investigates a simulat
Chapter list
Preference-based Policy Learning 00:00
Setting 00:28
Motivations 01:33
State of art - 1 02:08
Issues in RL 03:02
State of art - 2 03:49
Issues in IRL 04:20
Preference-based Policy Learning 04:32
Outline - 1 05:15
Policy Return Estimate - 1 05:37
Policy Return Estimate - 2 06:21
Behavioral representation 07:23
Exploration/Exploitation 08:27
Self-training 09:24
Preference-based Policy Learning PPL Algorithm 10:38
Outline - 2 11:16
Experimental goal and setting 11:20
The maze problem 13:05
Synchronized exploration 14:07
Outline - 3 14:56
Preference Policy Learning 14:59
Future work 15:44
References 16:52