Study Notes

Bellman Optimality Equation

Feb 24, 2026 · 1 min read

  • foundations
  • key-formula

See Bellman Optimality Equations for the full treatment.

For $v_*$

$$v_*(s) = \max_a \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_*(s')\bigr]$$

For $q_*$

$$q_*(s, a) = \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma \max_{a'} q_*(s', a')\bigr]$$
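
The two forms are linked by the standard identity below: the optimal state value is the best optimal action value, and substituting it into the equation for q∗ recovers the equation for v∗.

```latex
v_*(s) = \max_a q_*(s, a),
\qquad
q_*(s, a) = \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_*(s')\bigr]
```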

Solved by: Value Iteration and Policy Iteration (both need the dynamics p), Q-Learning (sample-based, no model of p required)
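
As a rough illustration of the first of these methods, here is a minimal value iteration sketch that applies the v∗ equation as an update rule until the values stop changing. The two-state MDP, the names `P`, `GAMMA`, `backup`, and `value_iteration`, and the stopping tolerance are all assumptions made up for this example, not part of the notes.

```python
# Hypothetical toy MDP (not from the notes): P[s][a] is a list of
# (probability, next_state, reward) triples, standing in for p(s', r | s, a).
P = {
    0: {0: [(1.0, 0, 0.0)],                  # stay in state 0, no reward
        1: [(0.5, 1, 1.0), (0.5, 0, 0.0)]},  # try to move to state 1
    1: {0: [(1.0, 1, 2.0)],                  # stay in state 1, reward 2
        1: [(1.0, 0, 0.0)]},                 # move back to state 0
}
GAMMA = 0.9  # discount factor (assumed value)


def backup(v, s, a):
    """Expected one-step return: sum over (s', r) of p * [r + gamma * v(s')]."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-10):
    """Repeat v(s) <- max_a backup(v, s, a) until the largest change is below tol."""
    v = {s: 0.0 for s in P}
    while True:
        new_v = {s: max(backup(v, s, a) for a in P[s]) for s in P}
        delta = max(abs(new_v[s] - v[s]) for s in P)
        v = new_v
        if delta < tol:
            return v


v_star = value_iteration()

# q* follows from v* with one more backup; the greedy policy takes the argmax.
q_star = {(s, a): backup(v_star, s, a) for s in P for a in P[s]}
greedy = {s: max(P[s], key=lambda a: q_star[(s, a)]) for s in P}

print(v_star)
print(greedy)
```

Each sweep here rebuilds the whole value table from the previous one (a synchronous update); in-place updates are a common alternative and should converge to the same fixed point.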

Appears In

  • RL-L01 - Intro, MDPs & Bandits
  • RL-L02 - Dynamic Programming

Backlinks

  • Bellman Equation
  • Dynamic Programming
  • Markov Decision Process
  • Value Iteration
