Mean Squared Value Error

Definition

Mean Squared Value Error (MSVE)

In RL with function approximation, we cannot represent the true value function $v_{π}$ exactly for all states. The Mean Squared Value Error (MSVE) is the standard objective function used to measure how well our approximate value function $\hat{v} (s, w)$ matches the true value function $v_{π} (s)$ .

Mathematical Formulation

MSVE

$\overline{V E} (w) = \sum_{s \in S} μ (s) [v_{π} (s) - \hat{v} (s, w)]^{2}$

where:

$w$ — Learnable weights of the function approximator

$v_{π} (s)$ — True value of state $s$ under policy $π$

$\hat{v} (s, w)$ — Approximate value (e.g., from a neural network or linear combination)

$μ (s)$ — State distribution, usually the on-policy distribution (stationary distribution under $π$ ). It weights the error by how often the agent actually visits state $s$ .
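As a minimal sketch of the formula above, the following computes the MSVE for a hypothetical 3-state problem. The function name `msve` and all numbers are assumptions made up for illustration; in a real RL problem $v_{π}$ is not known and would have to be estimated.

```python
def msve(mu, v_true, v_hat):
    """Weighted squared error: sum over s of mu(s) * (v_pi(s) - v_hat(s, w))^2."""
    return sum(m * (vt - vh) ** 2 for m, vt, vh in zip(mu, v_true, v_hat))


# Hypothetical 3-state example (values chosen only for illustration).
mu = [0.5, 0.25, 0.25]      # state distribution mu(s), sums to 1
v_true = [1.0, 2.0, 4.0]    # true values v_pi(s)
v_hat = [1.0, 3.0, 2.0]     # approximate values v_hat(s, w)

error = msve(mu, v_true, v_hat)
print(error)  # 0.5*0 + 0.25*1 + 0.25*4 = 1.25
```

The second and third states contribute equally to the weight but not to the error: the larger miss in the third state dominates the total.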

Why We Need This

Trade-offs in Approximation

With function approximation, we have fewer parameters than states ( $d ≪ ∣ S ∣$ ). This means improving the accuracy in one state usually makes it worse in another. The MSVE tells us which states are more important to get “right” based on the distribution $μ (s)$ . We accept more error in rarely visited states to achieve lower error in frequently visited ones.
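The trade-off can be illustrated with the most extreme case: a single weight shared by every state, so that $\hat{v}(s, w) = w$. Setting the derivative of the MSVE with respect to $w$ to zero gives $w^{*} = \sum_{s} μ(s) v_{π}(s)$, the $μ$-weighted mean of the true values. The sketch below uses a hypothetical two-state problem; the helper name `best_constant` and the numbers are assumptions for illustration.

```python
def best_constant(mu, v_true):
    """MSVE-minimizing weight when v_hat(s, w) = w for every state:
    the mu-weighted mean of the true values."""
    return sum(m * v for m, v in zip(mu, v_true)) / sum(mu)


v_true = [0.0, 10.0]  # hypothetical true values of two states

# State 0 is visited 90% of the time, state 1 only 10%.
w_skewed = best_constant([0.9, 0.1], v_true)
# Both states are visited equally often.
w_uniform = best_constant([0.5, 0.5], v_true)

# Absolute error per state under each distribution.
errors_skewed = [abs(v - w_skewed) for v in v_true]
errors_uniform = [abs(v - w_uniform) for v in v_true]

print(w_skewed, errors_skewed)
print(w_uniform, errors_uniform)
```

Under the skewed distribution the shared weight sits close to the value of the frequently visited state, leaving a large error in the rare one. Under the uniform distribution the error is split evenly.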

Key Properties

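Objective: Gradient-based methods such as stochastic gradient descent reduce this error by stepping $w$ in the direction of $-\nabla \overline{V E} (w)$ .
The Ideal Goal: In the tabular case, $\overline{V E}$ can be driven to 0 because every state has its own parameter. In approximation, we seek a global minimum where one can be found (e.g., with linear approximators) and otherwise settle for a local minimum.
Challenge: In RL, we don’t actually know the true $v_{π} (s)$ (the “target”). Algorithms like TD replace it with a bootstrapped estimate that itself depends on $w$ , so the update is no longer a true gradient of the MSVE and may converge to a different solution than its minimum.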
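The difference between the two kinds of target can be sketched for a linear approximator, $\hat{v}(s, w) = w^{\top} x(s)$, whose gradient with respect to $w$ is the feature vector $x(s)$. The function names, feature vectors, and step sizes below are assumptions for illustration; the constant factor from differentiating the square is absorbed into the step size `alpha`.

```python
def v_hat(x, w):
    """Linear value estimate: dot product of features and weights."""
    return sum(xi * wi for xi, wi in zip(x, w))


def gradient_mc_update(w, x, G, alpha):
    """One SGD step using a sampled return G as the target.
    G does not depend on w, so this follows a sample gradient of the MSVE."""
    err = G - v_hat(x, w)
    return [wi + alpha * err * xi for wi, xi in zip(w, x)]


def semi_gradient_td0_update(w, x, r, x_next, gamma, alpha):
    """One TD(0) step. The target r + gamma * v_hat(s', w) depends on w,
    but its gradient is ignored, hence 'semi-gradient'."""
    target = r + gamma * v_hat(x_next, w)
    err = target - v_hat(x, w)
    return [wi + alpha * err * xi for wi, xi in zip(w, x)]


# Hypothetical one-hot features for two states.
x_s, x_next = [1.0, 0.0], [0.0, 1.0]

w_mc = gradient_mc_update([0.0, 0.0], x_s, G=2.0, alpha=0.5)
w_td = semi_gradient_td0_update([0.0, 1.0], x_s, r=1.0, x_next=x_next,
                                gamma=0.5, alpha=0.5)
print(w_mc)  # [1.0, 0.0]
print(w_td)  # [0.75, 1.0]
```

In both updates only the weight of the visited state's feature moves. In the TD case the size of the step depends on the current estimate for the next state, which is where the bootstrapping enters.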

Connections

Used for: Function Approximation
Optimized via: Stochastic Gradient Descent (SGD)
Relies on: State Space distribution $μ$
