
How to discount Information: Information flow in sensing-acting systems and the emergence of hierarchies

Published on 2012-10-16 · 4036 views

We argue that a consistent formulation of optimal sensing and control must include information terms, yielding an extension of the standard POMDP setting. To make the standard reward/cost terms consistent …
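The kind of value–information trade-off the outline refers to ("Bellman meets Shannon", "Trading Value and Information", "Free Energy minimization") can be sketched as a soft value iteration in which the policy pays an information cost for deviating from an action prior. This is a minimal illustrative sketch, not the speaker's actual algorithm; the MDP (states, transitions, rewards) and the parameters `gamma` and `beta` are invented for the example.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (all numbers invented for illustration).
n_s, n_a = 2, 2
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # P[s, a, s'] transition probabilities
              [[0.8, 0.2], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],                 # r(s, a) rewards
              [0.0, 1.0]])
gamma, beta = 0.9, 2.0  # discount factor; beta trades reward against information

V = np.zeros(n_s)        # free-energy value function
rho = np.full(n_a, 0.5)  # marginal action prior

for _ in range(200):
    Q = R + gamma * P @ V                      # Q[s, a] backup
    Z = (rho * np.exp(beta * Q)).sum(axis=1)   # per-state partition function
    pi = rho * np.exp(beta * Q) / Z[:, None]   # soft-optimal policy pi(a|s)
    V = np.log(Z) / beta                       # value = E[reward] - (1/beta) KL(pi || rho)
    rho = pi.mean(axis=0)                      # prior update (uniform state weights assumed)
```

As `beta` grows, the information cost vanishes and `pi` approaches the greedy deterministic policy; as `beta` shrinks, `pi` collapses toward the prior `rho`, gathering and using less information.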


Presentation

How to discount Information (00:00)
The Perception-Action Cycle (01:08)
Perception-Action Cycles (07:16)
The essence of the cycle (08:44)
The brain's primary task: making valuable predictions (12:23)
Hierarchies and reverse hierarchies (13:27)
We see what we expect to see (17:55)
Key assumptions (18:28)
Outline (18:29)
Part 1 (22:29)
Markov Decision Processes (22:36)
Reinforcement Learning revisited (24:26)
The Agent Learns a Policy (24:52)
Policy Iteration (26:33)
Information gathering (26:49)
Bellman meets Shannon (26:53)
Graphical model for the perception-action cycle (1) (27:04)
Graphical model for the perception-action cycle (2) (31:41)
Decision sequences and information (1) (34:35)
Decision sequences and information (2) (36:08)
Proof idea (37:42)
Application (44:15)
Simple example: The 3-coin problem (45:12)
Huffman coding and Bellman optimality (46:34)
How much information is needed? (46:35)
Value (extrinsic) and Information (intrinsic)… (46:36)
Some Control-Information Dualities (48:08)
Combining (future) Value and Information (48:10)
Trading Value and (future) Information (48:13)
Free Energy minimization (50:19)
Simple sum-rules (50:20)
Graphs (54:45)
Global convergence theorem (56:52)
PAC-Bayes Generalization Theorem (57:31)
PAC-Bayes Robustness Theorem for I-RL (1) (57:33)
PAC-Bayes Robustness Theorem for I-RL (2) (57:46)
Moving through a mind field (1) (57:47)
Moving through a mind field (2) (57:50)