A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Published on May 06, 2011
6137 Views
Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. T
Chapter list
Reduction of Imitation Learning to No-Regret Online Learning 00:00
Imitation Learning (1) 00:14
Imitation Learning (2) 00:54
Example Scenario 01:20
Supervised Training Procedure 02:04
Poor Performance in Practice 02:45
# Mistakes Grows Quadratically in T! 04:09
Reduction-Based Approach & Analysis 05:29
Previous Work: Forward Training 06:37
Previous Work: SMILe 08:27
DAgger: Dataset Aggregation (1) 09:53
DAgger: Dataset Aggregation (2) 10:06
DAgger: Dataset Aggregation (3) 10:09
DAgger: Dataset Aggregation (4) 10:18
DAgger: Dataset Aggregation (5) 10:50
DAgger: Dataset Aggregation (6) 10:56
DAgger: Dataset Aggregation (7) 11:01
DAgger: Dataset Aggregation (8) 11:13
DAgger: Dataset Aggregation (9) 11:22
Online Learning (1) 11:55
Online Learning (2) 12:14
Online Learning (3) 12:21
Online Learning (4) 12:24
Online Learning (5) 12:28
Online Learning (6) 12:54
DAgger as Online Learning (1) 13:37
DAgger as Online Learning (2) 14:08
DAgger as Online Learning (3) 14:45
Theoretical Guarantees of DAgger (1) 14:58
Theoretical Guarantees of DAgger (2) 15:25
Theoretical Guarantees of DAgger (3) 16:01
Theoretical Guarantees of DAgger (4) 16:44
Experiments: 3D Racing Game 17:11
DAgger Test-Time Execution 17:26
Average Falls/Lap 18:08
Experiments: Super Mario Bros 18:30
Test-Time Execution (missing video) 19:23
Average Distance/Stage 20:18
Conclusion 20:57
Questions 21:53