The exploration and exploitation tradeoff: Strategy learning and queries thumbnail
slide-image
Pause
Mute
Subtitles not available
Playback speed
0.25
0.5
0.75
1
1.25
1.5
1.75
2
Full screen

The exploration and exploitation tradeoff: Strategy learning and queries

Published on Feb 25, 20073698 Views

Chapter list

About exploration and exploitation, in the context of automata learning and active learning00:00
References and outline00:01
Outline00:04
My points00:15
1 Strategy learning?00:57
The problem:01:09
Hypothesis: the opponent follows a rational strategy (given by a DFA/Moore machine): 01:54
Typical example: the prisoner’s dilemma02:42
B02:54
Here an iterated version against an opponent that follows a rational strategy. Gain Function: limit of means. [here same as regret?] A game is a string in (His_moves  My_moves)*!03:02
The general problem03:55
Suppose we know the opponent’s strategy:04:13
a...04:25
Then04:52
a... 05:13
Question05:23
Question105:35
Question205:59
Question306:13
Question406:40
Question506:52
Question607:03
Question707:25
Question807:46
Question908:22
Question1008:46
Question1108:50
Question1208:54
Question1308:58
Question1409:02
Question1509:06
Question1609:10
An “open” problem09:30
How do we get hold of the learning data?10:28
Tit for Tat11:08
Grim11:29
But perhaps12:40
Question13:10
Mixed models13:17
Then use a prior (?) to determine if it is worth taking the risk to explore or not.14:44
My conclusion19:09
2 Active Learning: learning DFA from membership and equivalence queries: the L* algorithm20:47
2.1 About learning with queries21:14
The Oracle21:49
Membership queries.22:03
Equivalence (strong) queries.22:27
Correct learning22:47
2.2 Algorithm L*24:04
The Minimal Adequate Teacher24:40
General idea of L*24:48
An observation table25:36
The experiments (E)25:58
Meaning 026:02
Meaning 126:17
Equivalent prefixes26:24
Building a DFA from a table27:08
b27:18
Some rules27:42
An incomplete table28:05
Good idea28:51
A table is29:11
And a table that is not closed30:05
What do we do when we have a table that is not closed?30:12
An inconsistent table30:41
A table is consistent if31:25
What do we do when we have an inconsistent table?31:54
What do we do when we have a closed and consistent table ?31:59
What do we do if we get a counter-example?32:10
Run of the algorithm32:46
An equivalence query is made!33:26
Becouse of33:37
... a34:45
Some remarks and open questions35:58
What if the strategy is probabilistic?38:00
Why do all my queries cost the same?39:07
If membership queries have a price…39:41
Conclusion39:52
Classification results40:14