Deep Reinforcement Learning

Definition

Deep Reinforcement Learning (Deep RL)

Deep Reinforcement Learning is the field of RL that uses Deep Neural Networks as function approximators for value functions ( or ) or policies (). This allows RL to scale to high-dimensional state spaces (like raw pixels) where Tabular RL is impossible due to the curse of dimensionality.

Challenges

Using deep learning in RL is notoriously unstable due to three main factors:

  1. Non-stationarity: The target values for the neural network change as the policy improves.
  2. Correlated Samples: Successive states in an episode are highly correlated, violating the IID assumption of SGD.
  3. The Deadly Triad: The combination of Function Approximation, Bootstrapping (TD learning), and Off-Policy learning often leads to divergence.

Key Techniques & Solutions

ChallengeSolutionAlgorithm Example
Correlated SamplesExperience ReplayDQN
Moving TargetsTarget NetworksDQN
High VarianceBaselines / CriticsA3C, PPO
OverestimationDouble LearningDouble DQN

Primary Methods

Connections

Appears In