A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

Published on 2011-05-06 · 6152 views

Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning.

Presentation

Reduction of Imitation Learning to No-Regret Online Learning 00:00
Imitation Learning (1) 00:14
Imitation Learning (2) 00:54
Example Scenario 01:20
Supervised Training Procedure 02:04
Poor Performance in Practice 02:45
# Mistakes Grows Quadratically in T! 04:09
Reduction-Based Approach & Analysis 05:29
Previous Work: Forward Training 06:37
Previous Work: SMILe 08:27
DAgger: Dataset Aggregation (1) 09:53
DAgger: Dataset Aggregation (2) 10:06
DAgger: Dataset Aggregation (3) 10:09
DAgger: Dataset Aggregation (4) 10:18
DAgger: Dataset Aggregation (5) 10:50
DAgger: Dataset Aggregation (6) 10:56
DAgger: Dataset Aggregation (7) 11:01
DAgger: Dataset Aggregation (8) 11:13
DAgger: Dataset Aggregation (9) 11:22
Online Learning (1) 11:55
Online Learning (2) 12:14
Online Learning (3) 12:21
Online Learning (4) 12:24
Online Learning (5) 12:28
Online Learning (6) 12:54
DAgger as Online Learning (1) 13:37
DAgger as Online Learning (2) 14:08
DAgger as Online Learning (3) 14:45
Theoretical Guarantees of DAgger (1) 14:58
Theoretical Guarantees of DAgger (2) 15:25
Theoretical Guarantees of DAgger (3) 16:01
Theoretical Guarantees of DAgger (4) 16:44
Experiments: 3D Racing Game 17:11
DAgger Test-Time Execution 17:26
Average Falls/Lap 18:08
Experiments: Super Mario Bros 18:30
Test-Time Execution (missing video) 19:23
Average Distance/Stage 20:18
Conclusion 20:57
Questions 21:53