Principled methods of trading exploration and exploitation

Multiarmed Bandits and Partial Monitoring Exploration and Exploitation using Upper Confidence Bounds

author: Nicolò Cesa-Bianchi, Università degli Studi di Milano
Slides
0:00 MULTIARMED BANDITS
1:36 THE BANDIT PROBLEM
5:07 FINITE-TIME REGRET
13:02 HORIZON-DEPENDENT REWARD DISTRIBUTIONS
16:55 HORIZON-DEP. REWARD DISTRIBUTIONS (CONT.)
22:21 THE NONSTOCHASTIC BANDIT PROBLEM
27:12 THE NONSTOCHASTIC BANDIT PROBLEM
29:33 A NEARLY OPTIMAL RANDOMIZED POLICY
30:15 THE NONSTOCHASTIC BANDIT PROBLEM
31:00 A NEARLY OPTIMAL RANDOMIZED POLICY
37:42 PROOF 1/2
39:16 A NEARLY OPTIMAL RANDOMIZED POLICY
39:20 PROOF 1/2
39:35 PROOF 2/2
39:41 PROOF 1/2
39:47 PROOF 2/2
40:15 A NEARLY OPTIMAL RANDOMIZED POLICY
40:21 PROOF 2/2
40:56 A NEARLY OPTIMAL RANDOMIZED POLICY
41:18 PROOF 2/2
42:03 PROOF 1/2
42:05 A POINTWISE BOUND
43:20 REGRET BOUNDS
47:12 VARIANCE PROBLEM
47:23 REGRET BOUNDS
47:28 VARIANCE PROBLEM
48:17 REGRET BOUNDS THAT HOLD W.H.P.
49:43 COMPETING AGAINST ARBITRARY POLICIES
51:33 TRACKING REGRET
51:51 A BOUND ON THE TRACKING REGRET
57:48 PARTIAL MONITORING
57:53 FORECASTING A SEQUENCE
60:25 PREDICTION WITH EXPERT ADVICE
61:20 MULTIARMED BANDIT
62:23 PARTIAL MONITORING
64:12 EXAMPLES: APPLE TASTING
66:26 EXAMPLES: LABEL EFFICIENT FORECASTING
67:53 EXAMPLES: DYNAMIC PRICING
70:23 CONTROLLING THE REGRET
72:55 THE GENERAL FORECASTER FOR PARTIAL MONITORING
73:10 CONTROLLING THE REGRET
73:16 REGRET BOUNDS
74:03 LOWER BOUNDS
74:24 EXAMPLES: APPLE TASTING
74:38 EXAMPLES: LABEL EFFICIENT FORECASTING
74:47 EXAMPLES: DYNAMIC PRICING
75:44 EXAMPLES: LABEL EFFICIENT FORECASTING
77:01 CONTROLLING THE REGRET
77:24 LOWER BOUNDS
78:14 A STRATEGY FOR REVEALING ACTIONS



