Beating Bandits in Gradually Evolving Worlds

Published on 2013-08-09 (3210 views)

Chia-Jung Lee

Consider the online convex optimization problem, in which a player has to choose actions iteratively and suffers corresponding losses according to some convex loss functions, and the goal is to minimi
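The protocol described above (choose an action, suffer a convex loss, repeat) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk: the one-dimensional action set [-1, 1], the quadratic losses f_t(x) = (x - c_t)^2, the step size 1/sqrt(t), and the function names. The sketch also uses full-information gradient feedback with plain projected gradient descent, not the bandit feedback the talk is about.

```python
import math


def project(x, lo=-1.0, hi=1.0):
    """Project x onto the assumed action set [lo, hi]."""
    return max(lo, min(hi, x))


def online_gradient_descent(centers):
    """Play against quadratic losses f_t(x) = (x - c_t)^2 (an assumed loss family).

    Returns the actions played and the losses suffered in each round.
    """
    x = 0.0  # initial action
    actions, losses = [], []
    for t, c in enumerate(centers, start=1):
        actions.append(x)                # commit to an action
        losses.append((x - c) ** 2)      # suffer the loss f_t(x)
        grad = 2.0 * (x - c)             # full-information gradient feedback
        x = project(x - grad / math.sqrt(t))  # projected gradient step
    return actions, losses


def regret(centers, losses):
    """Total loss minus the loss of the best fixed action in hindsight.

    For these 1-D quadratics the best fixed action is the mean of the
    centers, clipped to the action set.
    """
    best = project(sum(centers) / len(centers))
    best_loss = sum((best - c) ** 2 for c in centers)
    return sum(losses) - best_loss


if __name__ == "__main__":
    # A slowly drifting sequence of loss centers (a "gradually evolving" toy input).
    centers = [0.5 + 0.001 * t for t in range(100)]
    actions, losses = online_gradient_descent(centers)
    print("regret:", regret(centers, losses))
```

The quantity returned by `regret` is the usual benchmark in this setting: the player's cumulative loss compared with the best single action chosen in hindsight.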

COLT 2013 - Princeton

Presentation

00:00 Beating Bandits in Gradually Evolving Worlds

00:06 Outline

00:23 Online Learning: Routing (Example) - 1

00:34 Online Learning: Routing (Example) - 2

00:43 Online Learning: Model - 1

01:53 Online Learning: Model - 2

02:19 Online Learning: Model - 3

02:29 Online Learning: Model - 4

02:53 Full Information Setting

02:59 Online Problems

03:17 Previous Results - 1

04:02 Previous Result: Deviation

04:37 Previous Results - 2

04:45 Bandit Setting

04:49 Previous Results - 3

04:57 Previous Results - 4

05:12 Previous Results - 5

05:43 Previous Results - 6

05:53 Difficulties in Bandit Setting

05:58 Approach for Bandit - 1

06:23 Approach for Bandit - 2

06:43 Approach for Bandit - 3

07:23 Estimation - 1

08:26 Estimation - 2

08:33 Estimation - 3

08:37 Estimation - 4

08:48 Estimation - 5

09:30 Estimation - 6

09:48 Another Issue: Exploration

11:07 Two-Point Bandit Setting

11:17 Two-Point Bandit [ADX10]

11:38 Motivation

11:59 Previous Results - 7

12:10 Previous Results - 8

12:30 Our Results

12:45 Main Algorithm - 1

12:59 Gradient Descent

13:43 Full information Algorithm’s idea - 1

14:40 Approach for Bandit

15:12 Observation

16:03 First Try

16:13 Main Algorithm - 2

16:22 Full information Algorithm’s idea - 2

16:49 Algorithm’s Idea - 1

17:42 Algorithm - 1

18:36 Algorithm’s Idea - 2

18:55 Algorithm - 2

19:10 Analysis

19:36 Results

19:50 Thank you !!