Partial Observability
A setting where the agent cannot directly observe the true state of the environment. Instead, it receives observations that provide incomplete or noisy information about the underlying state. The standard MDP assumption of full state access is violated.
Intuition
Seeing Through a Keyhole
Full observability is like playing chess — you can see the entire board. Partial observability is like playing poker — you can only see your own cards. Your decisions must account for uncertainty about what you can’t see.
Handling Partial Observability
Three approaches (from least to most approximate):
| Approach | Idea | Requirements | Limitations |
|---|---|---|---|
| Belief State | Probability distribution over hidden states | Full model (transitions, observations) | Requires a known model; exact updates tractable only for small discrete state spaces |
| Predictive State Representation | Predictions about future observations | Core tests | Mostly limited to tabular settings |
| Approximate | Use recent observations as state | Nothing extra | Not optimal, no guarantees |
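The belief-state row of the table can be made concrete with a Bayes filter: after taking action $a$ and seeing observation $o$, the new belief is $b'(s') \propto O(o \mid s') \sum_s T(s' \mid s, a)\, b(s)$. Below is a minimal sketch for a hypothetical two-state POMDP; the matrices `T` and `O` are made-up numbers, not from any real environment.

```python
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP.
# T[a, s, s'] = P(s' | s, a), O[s, o] = P(o | s).
T = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.4, 0.6]],   # action 1
])
O = np.array([[0.8, 0.2],       # state 0 mostly emits observation 0
              [0.3, 0.7]])      # state 1 mostly emits observation 1

def belief_update(b, a, o):
    """One Bayes-filter step: predict through T, correct with O, renormalize."""
    predicted = b @ T[a]                 # sum_s b(s) * T(s' | s, a)
    unnormalized = predicted * O[:, o]   # weight by observation likelihood
    return unnormalized / unnormalized.sum()

b = np.array([0.5, 0.5])        # start maximally uncertain
b = belief_update(b, a=0, o=1)  # observing o=1 shifts belief toward state 1
```

The resulting belief is itself a Markov state, which is why a POMDP can be solved as an MDP over beliefs, at the cost of a continuous state space.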
Approximate Methods in Practice
- Single observation: use $S_t = O_t$ — simplest, often “good enough”
- Frame stacking: use $S_t = (O_{t-3}, O_{t-2}, O_{t-1}, O_t)$ — used in Atari DQN (4 frames)
- Recurrent networks: Deep Recurrent Q-Learning — LSTM maintains internal memory
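Frame stacking from the list above can be sketched as a fixed-length buffer: the “state” handed to the agent is the last $k$ observations concatenated, which restores short-term information such as motion. A minimal sketch (the `FrameStack` class and frame shape are illustrative, not the DQN codebase's actual implementation):

```python
from collections import deque
import numpy as np

class FrameStack:
    """Maintain the last k observations and expose their stack as the state."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)  # oldest frame dropped automatically

    def reset(self, obs):
        # At episode start, fill the buffer with copies of the first observation.
        for _ in range(self.k):
            self.frames.append(obs)
        return self._state()

    def step(self, obs):
        self.frames.append(obs)
        return self._state()

    def _state(self):
        return np.stack(self.frames, axis=0)  # shape: (k, *obs.shape)

stack = FrameStack(k=4)
s = stack.reset(np.zeros((84, 84)))  # 84x84 is the preprocessed Atari frame size
s = stack.step(np.ones((84, 84)))    # s now holds three zero frames and one new frame
```

Recurrent networks generalize this: instead of a fixed window of $k$ frames, an LSTM's hidden state summarizes the entire observation history.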
Practical Reality
In practice, many successful RL systems simply treat observations as states ($S_t = O_t$). With function approximation, there’s typically no guarantee that the features define a Markov state anyway. As long as the system is “close enough” to Markov, this can work well enough.
Connections
- Formalized by POMDP
- Generalizes Markov Decision Process (an MDP is the special case with $O_t = S_t$)
- Belief State and Predictive State Representation are exact approaches
- Deep Recurrent Q-Learning is an approximate deep learning approach