How to discount Information: Information flow in sensing-acting systems and the emergence of hierarchies

We argue that a consistent formulation of optimal sensing and control must include information terms, yielding an extension of the standard POMDP setting. To make the standard reward/cost terms consistent with the information terms while still allowing tractable computation, the standard uniformity of time must be altered. We argue that this can be done by successive refinement of the information-value tradeoff, which also leads to the emergence of hierarchies and reverse hierarchies for both perception and planning.