From Bandits to Experts: On the Value of More Information

Published on 2011-07-25, 3089 Views

Ohad Shamir

Learning from Experts and Multi-armed Bandits are two of the most common settings studied in online learning. Whereas the first setting assumes that the performance of all k actions is revealed at th

On‐line Trading of Exploration and Exploitation 2011 - Washington

Related categories

Exploration vs. Exploitation

Presentation

From Bandits to Experts: On the Value of More Information 00:00

Experts / Multi-armed Bandits (1) 00:04

Experts / Multi-armed Bandits (2) 01:06

Model 02:09

Examples (1) 02:53

Examples (2) 03:13

Examples (3) 03:22

Motivation 03:42

First attempt: the ExpBan Algorithm (1) 05:01

First attempt: the ExpBan Algorithm (2) 05:21

First attempt: the ExpBan Algorithm (3) 05:27

First attempt: the ExpBan Algorithm (4) 05:48

First attempt: the ExpBan Algorithm (5) 06:17

Lower Bound 07:37

Proof Intuition 08:26

A Better Algorithm 09:02

Regret (1) 09:51

Proof Ideas 10:54

Regret (2) 11:49

Experiments 12:37

Conclusions 13:52

arXiv Tech Report 14:47