Bellman Error
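The Bellman error at a state $s$ measures how far the current value estimate $v_{\mathbf{w}}$ is from satisfying the Bellman Equation at that state. It is the expected TD error when following the policy $\pi$ from $s$:
$$\bar{\delta}_{\mathbf{w}}(s) = \mathbb{E}_\pi\left[ R_{t+1} + \gamma\, v_{\mathbf{w}}(S_{t+1}) - v_{\mathbf{w}}(S_t) \mid S_t = s \right]$$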
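Mean Squared Bellman Error ($\overline{\text{BE}}$)
The $\overline{\text{BE}}$ weights the squared Bellman error at each state $s$, written $\bar{\delta}_{\mathbf{w}}(s)$, by the state distribution $\mu$:
$$\overline{\text{BE}}(\mathbf{w}) = \sum_{s} \mu(s)\, \bar{\delta}_{\mathbf{w}}(s)^2$$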
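As a rough illustration of the two quantities, the sketch below computes the per-state Bellman error and its weighted mean square for a tabular chain whose dynamics are fully known. The two-state chain, its rewards, the discount, and the weighting `mu` are all made-up assumptions, and the function names are hypothetical rather than taken from any library.

```python
def bellman_error(P, r, gamma, v):
    """Per-state Bellman error: r(s) + gamma * sum_t P[s][t] * v(t) - v(s)."""
    n = len(v)
    return [
        r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) - v[s]
        for s in range(n)
    ]


def mean_squared_bellman_error(P, r, gamma, v, mu):
    """Squared per-state Bellman errors, averaged under the weighting mu."""
    return sum(m * d * d for m, d in zip(mu, bellman_error(P, r, gamma, v)))


# Hypothetical two-state chain under a fixed policy (all numbers are made up).
P = [[0.9, 0.1], [0.2, 0.8]]   # P[s][t] = probability of moving from s to t
r = [1.0, 0.0]                 # expected one-step reward from each state
gamma = 0.9
mu = [0.5, 0.5]                # assumed state weighting

# A deliberately poor estimate: every state valued at zero.
v_zero = [0.0, 0.0]
be_zero = mean_squared_bellman_error(P, r, gamma, v_zero, mu)

# Approximate the true values by repeatedly applying the Bellman operator
# (v <- r + gamma * P v), which is the same as adding the Bellman error to v.
v_true = [0.0, 0.0]
for _ in range(2000):
    v_true = [v + d for v, d in zip(v_true, bellman_error(P, r, gamma, v_true))]
be_true = mean_squared_bellman_error(P, r, gamma, v_true, mu)

print(be_zero)  # 0.5
print(be_true)  # essentially zero: the true values satisfy the Bellman Equation
```

Note that this computation reads the transition matrix `P` directly, which is exactly the information a learner working only from sampled experience does not have.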
Bellman Error Is Not Learnable
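A key result from Ch 11.6: the Mean Squared Bellman Error ($\overline{\text{BE}}$) cannot be learned from data alone. Different MDPs can generate identical data distributions yet have different $\overline{\text{BE}}$ values for the same parameters, so no amount of observed experience pins it down. This is why gradient methods that minimize the $\overline{\text{BE}}$ directly are problematic: they need more than sampled trajectories, such as a model of the underlying MDP.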
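The sketch below illustrates the idea with two small Markov reward processes built in the spirit of the Ch 11.6 counterexample. The transition probabilities and rewards are a reconstruction and should be treated as assumptions. In the second process, states B and B' share one parameter, so an observer sees the same stream from both processes: an A followed by reward 0, then a run of apparent Bs each followed by -1, except the last, which is followed by +1. The helper names are hypothetical.

```python
def bellman_error(P, r, gamma, v):
    """Per-state Bellman error: r(s) + gamma * sum_t P[s][t] * v(t) - v(s)."""
    n = len(v)
    return [
        r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) - v[s]
        for s in range(n)
    ]


def mean_squared_bellman_error(P, r, gamma, v, mu):
    """Squared per-state Bellman errors, averaged under the weighting mu."""
    return sum(m * d * d for m, d in zip(mu, bellman_error(P, r, gamma, v)))


gamma = 0.9      # assumed discount; the result at w = 0 does not depend on it
w = [0.0, 0.0]   # shared parameters: w[0] values A, w[1] values every "B-like" state

# Process 1 (states A, B): A -> B with reward 0; from B, with probability 1/2
# stay in B with reward -1, otherwise return to A with reward +1.
P1 = [[0.0, 1.0], [0.5, 0.5]]
r1 = [0.0, 0.5 * (-1.0) + 0.5 * 1.0]   # expected one-step rewards
mu1 = [1 / 3, 2 / 3]                   # stationary distribution of P1
v1 = [w[0], w[1]]

# Process 2 (states A, B, B'): A -> B or B' with probability 1/2 each, reward 0;
# B -> B or B' with probability 1/2 each, reward -1; B' -> A with reward +1.
P2 = [[0.0, 0.5, 0.5], [0.0, 0.5, 0.5], [1.0, 0.0, 0.0]]
r2 = [0.0, -1.0, 1.0]
mu2 = [1 / 3, 1 / 3, 1 / 3]            # stationary distribution of P2
v2 = [w[0], w[1], w[1]]                # B and B' are forced to share a value

be_first = mean_squared_bellman_error(P1, r1, gamma, v1, mu1)
be_second = mean_squared_bellman_error(P2, r2, gamma, v2, mu2)

print(be_first)   # 0.0
print(be_second)  # about 0.667, i.e. 2/3
```

The same parameters and the same observable data give a $\overline{\text{BE}}$ of 0 in one process and 2/3 in the other, so the $\overline{\text{BE}}$ cannot be a function of the data distribution alone.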
Alternative objectives: the Projected Bellman Error (PBE) and the Mean Squared TD Error are both learnable from data. Gradient-TD Methods perform stochastic gradient descent on the PBE.
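For reference, the two learnable objectives can be written as follows. This is a sketch in assumed notation: $\Pi$ is the $\mu$-weighted projection onto the space of representable value functions, $\bar{\delta}_{\mathbf{w}}$ is the vector of per-state Bellman errors, and $\delta_t$ is the sampled TD error.

$$\overline{\text{PBE}}(\mathbf{w}) = \left\lVert \Pi\, \bar{\delta}_{\mathbf{w}} \right\rVert_\mu^2, \qquad \overline{\text{TDE}}(\mathbf{w}) = \sum_{s} \mu(s)\, \mathbb{E}_\pi\left[ \delta_t^2 \mid S_t = s \right], \qquad \delta_t = R_{t+1} + \gamma\, v_{\mathbf{w}}(S_{t+1}) - v_{\mathbf{w}}(S_t)$$

The Mean Squared TD Error squares before taking the expectation, while the $\overline{\text{BE}}$ squares the expectation itself. That ordering is why the former can be estimated from single sampled transitions and the latter cannot.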