Preference-based policy iteration: Leveraging preference learning for reinforcement learning


Published on Nov 30, 2011 · 3076 Views

This paper makes a first step toward the integration of two subfields of machine learning, namely preference learning and reinforcement learning (RL). An important motivation for a "preference-based" …


Chapter list

Preference-Based Policy Iteration (00:00)
Classical Reinforcement Learning (00:44)
Policy learning (01:57)
Vision: Preference-Based Reinforcement Learning (03:00)
Example: Annotated Chess Games - 1 (03:35)
Example: Annotated Chess Games - 2 (04:18)
Example: Annotated Chess Games - 3 (05:23)
Approximate Policy Iteration with Roll-Outs - 1 (06:32)
Approximate Policy Iteration with Roll-Outs - 2 (07:51)
Label Ranking (08:58)
Preference-Based Policy Iteration (10:22)
Advantages of a preference-based framework (11:31)
Case Study 1 (13:30)
Results: Inverted Pendulum (15:15)
Results: Mountain Car (17:43)
Complete vs. Partial State Evaluation (17:47)
Case Study 2: Learning from Qualitative Feedback - 1 (17:48)
Case Study 2: Learning from Qualitative Feedback - 2 (19:04)
Conclusions (19:37)
Open Questions (19:44)
While you ask questions... (19:44)