A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

Published on May 06, 2011, 6141 Views

Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. ...

Chapter list

Reduction of Imitation Learning to No-Regret Online Learning 00:00
Imitation Learning (1) 00:14
Imitation Learning (2) 00:54
Example Scenario 01:20
Supervised Training Procedure 02:04
Poor Performance in Practice 02:45
# Mistakes Grows Quadratically in T! 04:09
Reduction-Based Approach & Analysis 05:29
Previous Work: Forward Training 06:37
Previous Work: SMILe 08:27
DAgger: Dataset Aggregation (1) 09:53
DAgger: Dataset Aggregation (2) 10:06
DAgger: Dataset Aggregation (3) 10:09
DAgger: Dataset Aggregation (4) 10:18
DAgger: Dataset Aggregation (5) 10:50
DAgger: Dataset Aggregation (6) 10:56
DAgger: Dataset Aggregation (7) 11:01
DAgger: Dataset Aggregation (8) 11:13
DAgger: Dataset Aggregation (9) 11:22
Online Learning (1) 11:55
Online Learning (2) 12:14
Online Learning (3) 12:21
Online Learning (4) 12:24
Online Learning (5) 12:28
Online Learning (6) 12:54
DAgger as Online Learning (1) 13:37
DAgger as Online Learning (2) 14:08
DAgger as Online Learning (3) 14:45
Theoretical Guarantees of DAgger (1) 14:58
Theoretical Guarantees of DAgger (2) 15:25
Theoretical Guarantees of DAgger (3) 16:01
Theoretical Guarantees of DAgger (4) 16:44
Experiments: 3D Racing Game 17:11
DAgger Test-Time Execution 17:26
Average Falls/Lap 18:08
Experiments: Super Mario Bros 18:30
Test-Time Execution (missing video) 19:23
Average Distance/Stage 20:18
Conclusion 20:57
Questions 21:53