Practical RL: Representation, interaction, synthesis, and mortality (PRISM)

Published on Jul 28, 2015 · 2797 views

When scaling up Reinforcement Learning (RL) to large continuous domains with imperfect representations and hierarchical structure, we often try applying algorithms that are proven to converge in small …
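For context, below is a minimal sketch of the kind of small-domain method the abstract alludes to: tabular Q-learning, whose convergence guarantees assume a finite, enumerable state-action space. This is illustrative only, not the talk's own algorithm (TEXPLORE is model-based and built for sample efficiency); the `env` interface (`reset`, `step`, `actions`) is an assumption for the sake of a runnable example.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning sketch. Assumes env exposes:
    reset() -> state, step(action) -> (next_state, reward, done),
    and a finite env.actions list (hashable states, discrete actions)."""
    Q = defaultdict(float)  # Q[(state, action)] -> value estimate, default 0.0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # one-step temporal-difference update (zero bootstrap at terminal states)
            best_next = max(Q[(next_state, a)] for a in env.actions)
            target = reward + gamma * (0.0 if done else best_next)
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state = next_state
    return Q
```

The lookup table `Q` is exactly what breaks down in the large continuous domains the talk targets: the state space can no longer be enumerated, which motivates the learned models and function approximation discussed in the chapters below.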

Chapter list

Practical RL: Representation, Interaction, Synthesis and Mortality (PRISM) (00:00)
RL as a Tool - 1 (00:42)
RL as a Tool - 2 (01:01)
RL as a Tool - 3 (01:18)
RL as a Tool - 4 (01:25)
RL as a Tool - 5 (01:31)
RL as a Tool - 6 (01:32)
Practical RL - 1 (01:42)
Practical RL - 2 (01:46)
Practical RL - 3 (02:06)
Practical RL - 4 (03:08)
Practical RL - 5 (03:40)
Practical RL - 6 (04:09)
Practical RL - 7 (04:17)
Practical RL - 8 (04:26)
TEXPLORE: Real-Time Sample-Efficient Reinforcement Learning for Robots (04:29)
Reinforcement Learning - 1 (04:42)
Reinforcement Learning - 2 (05:01)
Mortality - 1 (05:45)
Mortality - 2 (06:50)
Mortality - 3 (06:53)
Fuel World (06:59)
Fuel World Behavior (07:39)
Velocity Control: Real-Time Need (08:28)
Velocity Control (08:56)
Desiderata - 1 (09:35)
Desiderata - 2 (09:57)
Common Approaches - 1 (10:02)
Common Approaches - 2 (10:06)
The TEXPLORE Algorithm (10:13)
Challenge 1: Sample Efficiency - 1 (10:22)
Challenge 1: Sample Efficiency - 2 (10:34)
Challenge 1: Sample Efficiency - 3 (10:39)
Challenge 1: Sample Efficiency - 4 (10:47)
Random Forest Model - 1 (10:51)
Random Forest Model - 2 (11:19)
Challenge 2: Real-Time Action Selection (12:47)
Real-Time Model Based Architecture (RTMBA) - 1 (13:03)
Real-Time Model Based Architecture (RTMBA) - 2 (13:25)
Real-Time Model Based Architecture (RTMBA) - 3 (13:25)
Real-Time Model Based Architecture (RTMBA) - 4 (13:30)
Challenge 3: Continuous State - 1 (13:53)
Challenge 3: Continuous State - 2 (14:00)
Challenge 4: Actuator Delays - 1 (14:10)
Challenge 4: Actuator Delays - 2 (14:13)
Challenge 4: Actuator Delays - 3 (14:34)
Autonomous Vehicle (14:38)
Uses ROS [Quigley et al 2009] (14:46)
Simulation Experiments - 1 (14:54)
Simulation Experiments - 2 (15:12)
Challenge 1: Sample Efficiency (15:14)
Challenge 2: Real-Time Action Selection (15:22)
Challenge 3: Modeling Continuous Domains (15:23)
Challenge 4: Handling Delayed Actions (15:23)
On the physical vehicle - 1 (15:24)
On the physical vehicle - 2 (15:30)
TEXPLORE Summary (16:05)
Practical RL (16:51)
UT Austin Villa 2014 (17:27)
UT Austin Villa 2014 - 1 (17:40)
UT Austin Villa 2014 - 2 (18:23)
UT Austin Villa 2014 - 3 (18:28)
UT Austin Villa 2014 - 4 (18:32)
Layered Learning in Practice - 1 (18:45)
Layered Learning in Practice - 2 (20:05)
Layered Learning Paradigms - 1 (21:05)
Layered Learning Paradigms - 2 (21:41)
Layered Learning Paradigms - 3 (22:00)
Overlapping Layered Learning - 1 (22:18)
Overlapping Layered Learning - 2 (22:22)
Overlapping Layered Learning - 3 (22:34)
Overlapping Layered Learning - 4 (22:42)
RoboCup 3D Simulation Domain (22:52)
RoboCup Champions 2011, 2012 - 1 (23:23)
RoboCup Champions 2011, 2012 - 2 (23:52)
RoboCup Champions 2011, 2012 - 3 (24:38)
RoboCup Champions 2011, 2012 - 4 (24:50)
RoboCup Champions 2011, 2012 - 5 (28:02)
Learned Layers (28:08)
Dribbling and Kicking the Ball in the Goal (29:01)
Scoring on a Kickoff (30:21)
Impact of Overlapping Layered Learning - 1 (31:36)
Impact of Overlapping Layered Learning - 2 (31:46)
Repetition on Different Robot Types - 1 (32:07)
Repetition on Different Robot Types - 2 (32:24)
Repetition on Different Robot Types - 3 (32:30)
RoboCup 2014 - 1 (33:03)
RoboCup 2014 - 2 (34:01)
RoboCup 2014 - 3 (34:02)
RoboCup 2014 - 4 (34:03)
Practical RL - 1 (34:04)
Making Friends on the Fly: Advances in Ad Hoc Teamwork (34:19)
Ad Hoc Teamwork - 1 (34:23)
Ad Hoc Teamwork - 2 (34:35)
PLASTIC (34:37)
Testbed Domains (35:16)
Practical RL - 2 (35:50)
Practical RL - 4 (35:57)
Practical RL - 5 (36:15)