Deep Reinforcement Learning

Definition

Deep Reinforcement Learning (Deep RL)

Deep Reinforcement Learning is the subfield of RL that uses deep neural networks as function approximators for value functions ($V$ or $Q$) or policies ($\pi$). This lets RL scale to high-dimensional state spaces (such as raw pixels), where Tabular RL is infeasible because of the curse of dimensionality: a table needs one entry per state, and the number of states grows exponentially with the number of state variables.
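To give a rough sense of scale, the sketch below compares the number of distinct states a table would have to cover with the number of parameters in a small network. The 84x84 grayscale frame with 256 intensity levels, the 4 actions, and the single 256-unit hidden layer are assumptions chosen for illustration; they do not come from these notes.

```python
import math

def tabular_state_digits(height, width, levels):
    """Decimal digits in levels ** (height * width), the number of distinct images.

    Uses logarithms so the huge integer never has to be built or printed.
    """
    return math.floor(height * width * math.log10(levels)) + 1

def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Assumed example: 84x84 grayscale frames, 256 intensity levels, 4 actions.
digits = tabular_state_digits(84, 84, 256)
params = mlp_parameter_count([84 * 84, 256, 4])

print(f"distinct states: a number with {digits} digits")
print(f"parameters in a small MLP Q-network: {params}")
```

The table would need one row per distinct image, while the network shares its parameters across all of them, which is what makes generalization to unseen states possible.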

Challenges

Using deep learning in RL is notoriously unstable due to three main factors:

Non-stationarity: The regression targets depend on the network's own estimates and on the current policy, so they keep shifting as learning proceeds.
Correlated Samples: Successive states in an episode are highly correlated, violating the i.i.d. assumption behind SGD.
The Deadly Triad: Combining Function Approximation, Bootstrapping (TD learning), and Off-Policy learning can cause instability and divergence.
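One way to see where these factors enter is the one-step Q-learning target and its squared error, written here for a single transition $(s, a, r, s')$. The symbols $\theta$ (network parameters), $\gamma$ (discount factor), and $y$ (target) are notation introduced for this sketch:

$$
y = r + \gamma \max_{a'} Q(s', a'; \theta), \qquad L(\theta) = \big(y - Q(s, a; \theta)\big)^2
$$

The target $y$ is built from the same parameters $\theta$ that are being updated (bootstrapping with function approximation), so every gradient step also moves the target. The $\max$ evaluates the greedy policy regardless of which policy collected the transition, which is the off-policy ingredient.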

Key Techniques & Solutions

Challenge	Solution	Algorithm Example
Correlated Samples	Experience Replay	DQN
Non-stationarity (moving targets)	Target Networks	DQN
High Variance (policy gradients)	Baselines / Critics	A3C, PPO
Q-value Overestimation	Double Learning	Double DQN
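The value-based rows of the table can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: `ReplayBuffer`, `dqn_target`, and `double_dqn_target` are hypothetical names, Q-values are passed in as plain lists, and the "target network" is simply whatever frozen copy of the network produced `next_q_target`.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions; uniform sampling breaks temporal correlation."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement from everything currently stored.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

def dqn_target(reward, done, next_q_target, gamma=0.99):
    """DQN target: bootstrap from the frozen target network's Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)

def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target: the online network picks the action,
    the target network evaluates it, which reduces overestimation."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]

# Usage with made-up numbers.
buf = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buf.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
batch = buf.sample(2)

y_dqn = dqn_target(1.0, False, next_q_target=[0.5, 2.0], gamma=0.5)
y_ddqn = double_dqn_target(1.0, False, next_q_online=[3.0, 1.0],
                           next_q_target=[0.5, 2.0], gamma=0.5)
```

In the usage lines, the plain DQN target takes the largest target-network value, while the Double DQN target uses the target network's value for the action the online network prefers, so the two can differ for the same transition.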

Primary Methods

Value-Based: Deep Q-Network (DQN) and its variants.
Policy-Based: REINFORCE with deep network policies.
Actor-Critic: A3C, A2C, PPO, SAC (Soft Actor-Critic).
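The difference between the policy-based and actor-critic families shows up in the weight attached to each action's log-probability gradient. The sketch below assumes a single finished episode and hypothetical helper names: REINFORCE weights each step by its discounted return, while a baseline or critic subtracts a state-value estimate to reduce variance. The baseline values used here are made-up numbers standing in for a learned critic.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):  # accumulate from the end of the episode
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns

def advantages(returns, baselines):
    """Subtract a per-step baseline (e.g. a critic's value estimate) from the returns."""
    return [g - b for g, b in zip(returns, baselines)]

# Usage with made-up numbers.
episode_rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(episode_rewards, gamma=0.5)
adv = advantages(returns, baselines=[1.0, 1.0, 1.0])
```

With plain REINFORCE the update would use `returns` directly; with a baseline or critic it would use `adv`, which leaves the expected gradient unchanged when the baseline depends only on the state.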

Connections

Foundation: Reinforcement Learning, Neural Networks
Departure from: Tabular RL
Overcomes: Curse of Dimensionality
Risk: Deadly Triad
