Regret Bounds for the Adaptive Control of Linear Quadratic Systems

Published on 2011-08-02

3818 Views

Csaba Szepesvári

We study the average cost Linear Quadratic (LQ) problem with unknown model parameters, also known as the adaptive control problem in the control community. We design an algorithm and prove that its

COLT 2011 - Budapest

Related categories

Electrical Engineering

Presentation

Regret Bounds for the Adaptive Control of Linear Quadratic Systems (00:00)

Outline - 1 (01:01)

Control problems (01:24)

Learning to control (02:20)

Measure of performance of the learner (02:48)

This talk: Linear Quadratic Regulation (04:00)

The goal and why should we care? (05:30)

Previous works (06:20)

Outline - 2 (07:46)

The main ideas of the algorithm (07:47)

Estimation (08:11)

Optimism principle (09:01)

Avoiding frequent changes (11:19)

Outline - 3 (11:36)

How to choose the confidence set? (11:47)

Construction of confidence sets (12:47)

The algorithm (13:10)

Proof sketch (13:21)

Regret decomposition (13:51)

Term R1 (14:34)

Term R3 (14:51)

Term R2 (15:04)

Change the policy only when the determinant of confidence ellipsoid doubles. (15:54)

Theorem (16:25)

Conclusions (16:39)

References (19:19)