Least-Squares TD (LSTD)
LSTD
LSTD computes the TD Fixed Point directly in closed form rather than iteratively. For Linear Function Approximation, the TD fixed point satisfies , and LSTD solves for exactly.
LSTD Solution
where:
Updated incrementally using Sherman-Morrison:
Trade-offs
| Property | LSTD | Semi-gradient TD |
|---|---|---|
| Step-size ? | No (direct solution) | Yes (sensitive to tuning) |
| Convergence speed | Faster (data-efficient) | Slower (iterative) |
| Computation per step | ||
| Memory | (stores ) |
When to Use LSTD
LSTD is ideal when: data is expensive (few samples), feature dimension is moderate, and you want to avoid tuning . Not practical for very high-dimensional feature spaces.
Connections
- Solves: TD Fixed Point
- For: Linear Function Approximation only
- Alternative to: Semi-gradient TD
- Extended by: LSTD(λ), LSPI (for control)