How to discount Information: Information flow in sensing-acting systems and the emergence of hierarchies
Published on Oct 16, 2012
4029 Views
We argue that a consistent formulation of optimal sensing and control must include information terms, yielding an extension of the standard POMDP setting. To make the standard reward/cost terms consistent…
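As a rough sketch of the trade-off the abstract alludes to (the notation here is assumed for illustration, not taken from the talk's slides): the expected discounted reward of a policy \pi can be augmented with an information cost charged against a default action distribution \rho, giving a free-energy-like value function

F^{\pi}(s) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\left(r(s_t, a_t) \;-\; \frac{1}{\beta}\,\log\frac{\pi(a_t \mid s_t)}{\rho(a_t)}\right)\right],

where \beta sets the exchange rate between extrinsic value and intrinsic information; \beta \to \infty recovers the standard discounted MDP objective.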
Chapter list
How to discount Information (00:00)
The Perception-Action Cycle (01:08)
Perception-Action Cycles (07:16)
The essence of the cycle (08:44)
The brain's primary task: making valuable predictions (12:23)
Hierarchies and reverse hierarchies (13:27)
We see what we expect to see (17:55)
Key assumptions (18:28)
Outline (18:29)
Part 1 (22:29)
Markov Decision Processes (22:36)
Reinforcement Learning revisited (24:26)
The Agent Learns a Policy (24:52)
Policy Iteration (26:33)
Information gathering (26:49)
Bellman meets Shannon (26:53)
Graphical model for the perception-action cycle (1) (27:04)
Graphical model for the perception-action cycle (2) (31:41)
Decision-sequences and information (1) (34:35)
Decision-sequences and information (2) (36:08)
Proof idea (37:42)
Application (44:15)
Simple example: the 3-coin problem (45:12)
Huffman coding and Bellman optimality (46:34)
How much information is needed? (46:35)
Value (extrinsic) and Information (intrinsic)… (46:36)
Some Control-Information Dualities (48:08)
Combining (future) Value and Information (48:10)
Trading Value and (future) Information (48:13) (see the sketch after this list)
Free Energy minimization (50:19)
Simple sum-rules (50:20)
Graphs (54:45)
Global convergence theorem (56:52)
PAC-Bayes Generalization Theorem (57:31)
PAC-Bayes Robustness Theorem for I-RL (1) (57:33)
PAC-Bayes Robustness Theorem for I-RL (2) (57:46)
Moving through a mind field (1) (57:47)
Moving through a mind field (2) (57:50)
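To make the trade-off concrete, here is a minimal, hypothetical Python sketch of the kind of KL-regularized ("value vs. information") Bellman backup named by the chapters "Bellman meets Shannon", "Trading Value and (future) Information", and "Free Energy minimization". The two-state MDP, the uniform default policy rho, and the parameters gamma and beta are all assumptions for illustration; this is not the talk's 3-coin example or its exact formulation.

import numpy as np

# Hypothetical toy MDP (2 states, 2 actions): transitions P[s, a, s'] and rewards r[s, a].
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.8, 0.2], [0.2, 0.8]]])
r = np.array([[1.0, 0.0],
              [0.0, 0.5]])
rho = np.array([0.5, 0.5])     # default (prior) action distribution
gamma, beta = 0.9, 5.0         # discount; beta trades value against information

F = np.zeros(2)                # free-energy value function
for _ in range(500):
    # Soft Bellman backup: F(s) = (1/beta) log sum_a rho(a) exp(beta * (r + gamma E[F]))
    q = r + gamma * P @ F                    # q[s, a] = r(s, a) + gamma * E[F(s')]
    F_new = np.log(np.exp(beta * q) @ rho) / beta
    if np.max(np.abs(F_new - F)) < 1e-10:    # fixed point reached
        F = F_new
        break
    F = F_new

# Optimal trade-off policy: pi(a|s) proportional to rho(a) exp(beta * q(s, a))
pi = rho * np.exp(beta * (q - F[:, None]))
print("free energy:", F)
print("policy:\n", pi / pi.sum(axis=1, keepdims=True))

For large beta the resulting policy approaches the greedy reward-maximizing policy; for small beta it stays close to rho, spending little information on control. The soft backup is a gamma-contraction in the sup norm (log-sum-exp is 1-Lipschitz), which is the kind of property a global convergence theorem for such iterations rests on.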