Preference-based policy iteration: Leveraging preference learning for reinforcement learning


Published on Nov 30, 2011 · 3076 Views

This paper makes a first step toward the integration of two subfields of machine learning, namely preference learning and reinforcement learning (RL). An important motivation for a "preference-based" …


Chapter list

Preference-Based Policy Iteration (00:00)
Classical Reinforcement Learning (00:44)
Policy learning (01:57)
Vision: Preference-Based Reinforcement Learning (03:00)
Example: Annotated Chess Games - 1 (03:35)
Example: Annotated Chess Games - 2 (04:18)
Example: Annotated Chess Games - 3 (05:23)
Approximate Policy Iteration with Roll-Outs - 1 (06:32)
Approximate Policy Iteration with Roll-Outs - 2 (07:51)
Label Ranking (08:58)
Preference-Based Policy Iteration (10:22)
Advantages of a preference-based framework (11:31)
Case Study 1 (13:30)
Results: Inverted Pendulum (15:15)
Results: Mountain Car (17:43)
Complete vs. Partial State Evaluation (17:47)
Case Study 2: Learning from Qualitative Feedback - 1 (17:48)
Case Study 2: Learning from Qualitative Feedback - 2 (19:04)
Conclusions (19:37)
Open Questions (19:44)
While you ask questions... (19:44)