event thumbnail image
Principled methods of trading exploration and exploitation

The exploration and exploitation tradeoff: Strategy learning and queries

author: Colin de la Higuera, University Jean Monnet, St Etienne
You might be experiencing some problems with Your Video player.
Slides
0:00 About exploration and exploitation, in the context of automata learning and active learning
0:01 References and outline
0:04 Outline
0:15 My points
0:57 1 Strategy learning?
1:09 The problem:
1:54 Hypothesis: the opponent follows a rational strategy (given by a DFA/Moore machine):
2:42 Typical example: the prisoner’s dilemma
2:54 B
3:02 Here an iterated version against an opponent that follows a rational strategy. Gain Function: limit of means. [here same as regret?] A game is a string in (His_moves  My_moves)*!
3:55 The general problem
4:13 Suppose we know the opponent’s strategy:
4:25 a...
4:52 Then
5:13 a...
5:23 Question
5:35 Question1
5:59 Question2
6:13 Question3
6:40 Question4
6:52 Question5
7:03 Question6
7:25 Question7
7:46 Question8
8:22 Question9
8:46 Question10
8:50 Question11
8:54 Question12
8:58 Question13
9:02 Question14
9:06 Question15
9:10 Question16
9:30 An “open” problem
10:28 How do we get hold of the learning data?
11:08 Tit for Tat
11:29 Grim
12:40 But perhaps
13:10 Question
13:17 Mixed models
14:44 Then use a prior (?) to determine if it is worth taking the risk to explore or not.
19:09 My conclusion
20:47 2 Active Learning: learning DFA from membership and equivalence queries: the L* algorithm
21:14 2.1 About learning with queries
21:49 The Oracle
22:03 Membership queries.
22:27 Equivalence (strong) queries.
22:47 Correct learning
24:04 2.2 Algorithm L*
24:40 The Minimal Adequate Teacher
24:48 General idea of L*
25:36 An observation table
25:58 The experiments (E)
26:02 Meaning 0
26:17 Meaning 1
26:24 Equivalent prefixes
27:08 Building a DFA from a table
27:18 b
27:42 Some rules
28:05 An incomplete table
28:51 Good idea
29:11 A table is
30:05 And a table that is not closed
30:12 What do we do when we have a table that is not closed?
30:41 An inconsistent table
31:25 A table is consistent if
31:54 What do we do when we have an inconsistent table?
31:59 What do we do when we have a closed and consistent table ?
32:10 What do we do if we get a counter-example?
32:46 Run of the algorithm
33:26 An equivalence query is made!
33:37 Becouse of
34:45 ... a
35:58 Some remarks and open questions
38:00 What if the strategy is probabilistic?
39:07 Why do all my queries cost the same?
39:41 If membership queries have a price…
39:52 Conclusion
40:14 Classification results

Lecture rating

People found this lecture:
Worth seeing
because it is:
 Valuable and informative
Well presented
Easily understandable
Acceptably recorded
You need to login to cast your vote.

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

make sure you have javascript enabled or clear this field: