Learning through Exploration

author: John Langford, Yahoo! Research
author: Alina Beygelzimer, IBM Watson Research Center
published: Oct. 1, 2010, recorded: July 2010, views: 1723

Slides
0:00 A Tutorial on Learning through Exploration
0:08 Example of Learning through Exploration
0:13 Another Example: Clinical Decision Making
0:55 The Contextual Bandit Setting (1)
1:53 The Contextual Bandit Setting (2)
2:51 The Contextual Bandit Setting (3)
2:55 The Contextual Bandit Setting (4)
7:53 The Contextual Bandit Setting (5)
8:38 Basic Observation #1
9:53 Basic Observation #2
10:43 Outline - online, stochastic
11:43 Idea 1: Follow the Leader (1)
15:21 Idea 1: Follow the Leader (2)
15:51 Idea 1: Follow the Leader (3)
18:16 Idea 2: Explore τ then Follow the Leader (EFTL-τ) (1)
19:00 Idea 2: Explore τ then Follow the Leader (EFTL-τ) (2)
20:09 Theorem, Proof
23:40 Unknown T
25:32 Idea 3: Exponential Weight Algorithm for Exploration and Exploitation with Experts
26:27 Idea 2: Explore τ then Follow the Leader (EFTL-τ) (2)
26:41 Unknown T
27:13 Idea 3: Exponential Weight Algorithm for Exploration and Exploitation with Experts
27:17 Theorem, Proof
28:36 Idea 3: Exponential Weight Algorithm for Exploration and Exploitation with Experts
34:13 Theorem: [Auer et al. '95] (1)
34:17 Theorem: [Auer et al. '95] (2)
34:33 EXP4 can be modified to succeed with high probability
35:18 Theorem, Proof
35:27 Summary so far
38:10 Outline, Argmax Regression
38:17 kdd2010_beygelzimer_langford_lte_Page_25
38:55 kdd2010_beygelzimer_langford_lte_Page_26
39:14 Approach 1: The Regression Approach (1)
40:34 Idea 3: Exponential Weight Algorithm for Exploration and Exploitation with Experts
40:51 Approach 1: The Regression Approach (1)
41:42 Approach 1: The Regression Approach (2)
44:38 Proof sketch: Fix x (graph)
48:04 Approach 2: Importance-Weighted Classification Approach (Zadrozny'03) (1)
49:50 Approach 2: Importance-Weighted Classification Approach (Zadrozny'03) (2)
55:29 Proof sketch: Fix x (graph)
55:50 Approach 2: Importance-Weighted Classification Approach (Zadrozny'03) (1)
55:55 Approach 3: The Offset Trick for K = 2 (two actions) (1)
56:01 Approach 2: Importance-Weighted Classification Approach (Zadrozny'03) (2)
56:11 Approach 3: The Offset Trick for K = 2 (two actions) (1)
58:43 Approach 3: The Offset Trick for K = 2 (two actions) (2)
61:05 Induced binary distribution D (1)
62:20 Induced binary distribution D - Example 1 (1)
63:31 Induced binary distribution D - Example 1 (2)
69:42 Induced binary distribution D - Example 2 (1)
70:03 Induced binary distribution D - Example 2 (2)
71:14 Induced binary distribution D - Example 3 (1)
71:27 Induced binary distribution D - Example 3 (2)
72:40 Analysis for K = 2
73:07 Denoising for K > 2 arms
76:04 Training on example (x, 3, 0.75, 0.5) (1)
77:30 Training on example (x, 3, 0.75, 0.5) (2)
78:35 Training on example (x, 3, 0.75, 0.5) (3)
80:03 Denoising with K arms: Analysis
80:29 A Comparison of Approaches

Videos

Part 1 1:22:31
Part 2 1:02:18

Description

This tutorial is about learning through exploration. The goal is to learn how to make decisions in partial feedback settings, where an agent repeatedly observes some information, chooses an action, and then learns how that action paid off (but does not get to see how the other actions would have paid off). We plan to cover all aspects of this general problem: learning, evaluation, the limits of what can be learned in this setting, and the relationship to traditional supervised learning.
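
The partial feedback loop described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not code from the tutorial: the environment (an integer context, with reward 1 for the action that matches it), the function names, and the uniform exploration policy are all hypothetical. The last few lines also show one standard way to evaluate a different policy from such a log, inverse propensity weighting, which may or may not match how the lecture treats evaluation.

```python
import random

NUM_ACTIONS = 3  # hypothetical number of actions


def uniform_policy(context, rng):
    """Exploration policy: ignore the context, pick an action uniformly."""
    return rng.randrange(NUM_ACTIONS)


def run_partial_feedback(policy, rounds=1000, seed=0):
    """Simulate the observe / choose / see-one-reward loop."""
    rng = random.Random(seed)
    log = []
    for _ in range(rounds):
        # Hypothetical context: an integer in [0, NUM_ACTIONS).
        context = rng.randrange(NUM_ACTIONS)
        # Hypothetical environment: reward 1 only for the matching action.
        rewards = [1.0 if a == context else 0.0 for a in range(NUM_ACTIONS)]
        action = policy(context, rng)
        # Only the chosen action's reward is revealed and logged.
        log.append((context, action, rewards[action]))
    return log


log = run_partial_feedback(uniform_policy)
average_reward = sum(r for _, _, r in log) / len(log)

# Inverse propensity estimate of the value of a target policy that plays
# action == context, using only the uniformly explored log. Each logged
# action had probability 1 / NUM_ACTIONS of being chosen.
propensity = 1.0 / NUM_ACTIONS
ips_estimate = sum(
    r / propensity for x, a, r in log if a == x
) / len(log)

print(f"average reward under uniform exploration: {average_reward:.3f}")
print(f"estimated value of the matching policy:   {ips_estimate:.3f}")
```

In this toy environment the matching policy always earns reward 1, so the estimate should land near 1 even though the logged policy itself earns only about one third on average.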
