Contextual Bandits with Similarity Information
Published on Aug 02, 2011 · 3729 Views
In a multi-armed bandit (MAB) problem, an online algorithm makes a sequence of choices. In each round it chooses from a time-invariant set of alternatives and receives the payoff associated with this alternative.
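The talk's contextual zooming algorithm is not spelled out in this description, but the basic MAB setting it builds on can be sketched in a few lines. Below is a minimal, hypothetical illustration using an epsilon-greedy strategy (my choice of strategy, not the paper's) with Bernoulli payoffs: each round the learner picks an arm, observes a stochastic payoff, and updates a running mean estimate for that arm.

```python
import random

def epsilon_greedy_bandit(true_means, rounds=10000, epsilon=0.1, seed=0):
    """Minimal multi-armed bandit loop (illustrative only):
    each round pick an arm, observe a Bernoulli payoff drawn with
    that arm's true mean, and update the arm's empirical estimate."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k          # number of pulls per arm
    estimates = [0.0] * k     # running mean payoff per arm
    total = 0.0
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                            # explore
        else:
            arm = max(range(k), key=lambda a: estimates[a])   # exploit
        payoff = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (payoff - estimates[arm]) / counts[arm]
        total += payoff
    return estimates, total

estimates, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

The contextual variant discussed in the talk additionally observes a context each round and exploits similarity information between contexts and arms; this plain sketch only illustrates the round structure described above.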
Chapter list
Contextual Bandits with Similarity Information (00:00)
Running example (00:00)
Multi-Armed Bandits (00:18)
Contextual Bandits (01:12)
Similarity info - 1 (02:32)
Similarity info - 2 (03:55)
Problem formulation (04:51)
Some background (06:23)
Prior work: uniform partitions - 1 (07:20)
Prior work: uniform partitions - 2 (09:21)
This paper: adaptive partitions (11:04)
Rest of the talk (12:22)
Algorithm: contextual zooming - 1 (12:49)
Algorithm: contextual zooming - 2 (14:10)
Unexpected application: slowly changing payoffs (16:00)
Unexpected application: sleeping bandits (17:52)
Other results (18:53)