Regularization

Definition

Regularization refers to a set of techniques used to prevent overfitting in machine learning models by adding a penalty to the loss function or modifying the learning process. The goal is to constrain the model’s complexity so that it generalizes better to unseen data.

Key Techniques

1. Norm Penalties (Weight Decay)

Add a term to the loss function $L$ that penalizes large weights:

L2 Regularization (Ridge): $L_{\text{reg}} = L + \lambda \sum_i w_i^2$. Encourages small weights across all features.
L1 Regularization (Lasso): $L_{\text{reg}} = L + \lambda \sum_i |w_i|$. Encourages sparsity (some weights become exactly zero).
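
The two penalties can be sketched in a few lines of plain Python. This is a minimal illustration, not a library API: the function names, the `kind` argument, and the example weights are all assumptions made for the sketch, and `lam` stands in for λ.

```python
def l2_penalty(weights, lam):
    """Ridge term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def l1_penalty(weights, lam):
    """Lasso term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def regularized_loss(data_loss, weights, lam, kind="l2"):
    """Return the data loss plus the chosen norm penalty."""
    if kind == "l2":
        return data_loss + l2_penalty(weights, lam)
    if kind == "l1":
        return data_loss + l1_penalty(weights, lam)
    raise ValueError(f"unknown penalty kind: {kind!r}")


# Hypothetical weights and data loss, chosen only for illustration.
weights = [0.5, -2.0, 0.0, 1.5]
base_loss = 1.0
print(regularized_loss(base_loss, weights, lam=0.1, kind="l2"))
print(regularized_loss(base_loss, weights, lam=0.1, kind="l1"))
```

Note how the squared penalty is dominated by the largest weight (-2.0), while the absolute-value penalty charges every nonzero weight in proportion to its size.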

2. Dropout

Randomly “dropping out” (setting to zero) a fraction of neurons during each training step. At inference time, no neurons are dropped.

Purpose: Prevents neurons from co-adapting too much; forces the network to learn redundant representations.
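
A minimal sketch of dropout on a list of activations follows. It assumes the common "inverted dropout" variant, in which surviving activations are rescaled by 1 / (1 - drop_prob) during training so that nothing needs to change at inference time. The function name and parameters are hypothetical.

```python
import random


def dropout(activations, drop_prob, training=True, rng=None):
    """Inverted dropout on a list of activations (illustrative sketch)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        # At inference time every neuron is kept unchanged.
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    # Each unit is kept with probability keep_prob; survivors are scaled
    # by 1 / keep_prob so the expected activation stays the same.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


hidden = [0.2, 1.3, 0.7, 0.9]
print(dropout(hidden, drop_prob=0.5, rng=random.Random(0)))
print(dropout(hidden, drop_prob=0.5, training=False))
```

Because a fresh random mask is drawn on every call, no single neuron can be relied upon during training, which is what discourages co-adaptation.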

3. Early Stopping

Monitoring performance on a validation set and stopping training once validation error starts to increase, even if training error is still decreasing.
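
The stopping rule can be sketched as a function over a recorded sequence of validation losses. The `patience` parameter (how many non-improving epochs to tolerate before stopping) is an assumption added here, since a single noisy epoch is often not a reliable signal. The loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=1):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Never triggered: training ran through every recorded epoch.
    return best_epoch, len(val_losses) - 1


# Hypothetical validation curve: improves, then starts to rise.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.61]
best, stopped = early_stopping_epoch(val_losses, patience=2)
print(best, stopped)  # prints "2 4"
```

In practice the weights saved at `best_epoch` would be restored, not those from the epoch at which training stopped.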

4. Data Augmentation

Increasing the diversity of the training set by applying transformations (rotation, noise, cropping) to the data.
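
As a rough sketch, the three transformations named above can be applied to a tiny grayscale image stored as a list of rows. The helper names, the 50% rotation probability, the noise level, and the crop size are all illustrative assumptions. Real pipelines usually rely on an image library instead.

```python
import random


def rotate90(image):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[px + rng.gauss(0.0, sigma) for px in row] for row in image]


def random_crop(image, size, rng):
    """Cut a random size x size window out of the grid."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image, rng, sigma=0.05, crop_size=2):
    """Produce one randomly transformed copy; the label stays unchanged."""
    out = image
    if rng.random() < 0.5:
        out = rotate90(out)
    out = random_crop(out, crop_size, rng)
    return add_noise(out, sigma, rng)


# Hypothetical 3x3 image; each call to augment yields a different variant.
image = [[0.0, 0.1, 0.2],
         [0.3, 0.4, 0.5],
         [0.6, 0.7, 0.8]]
rng = random.Random(0)
augmented = [augment(image, rng) for _ in range(4)]
print(len(augmented))
```

Each variant keeps the original label, so the model sees more distinct inputs per class without any new data being collected.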

Why it works

Occam's Razor

In machine learning, a simpler model that explains the data is usually better than a complex one that fits the noise. Regularization effectively “pushes” the model towards simpler solutions (smaller weights, fewer active neurons) unless the data provides overwhelming evidence that complexity is necessary.

Connections

Prevents: Overfitting
Used in: Neural Networks, Linear Regression, SVMs
Interaction with: Gradient Descent
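
The interaction with gradient descent can be made concrete with a short derivation. It assumes plain gradient descent with a learning rate $\eta$ (a symbol introduced here, not in the notes above) applied to the L2-regularized loss $L + \lambda \sum_i w_i^2$:

$$
w \leftarrow w - \eta \left( \nabla_w L + 2 \lambda w \right) = (1 - 2 \eta \lambda)\, w - \eta \nabla_w L
$$

Each step first shrinks the weights by the factor $(1 - 2 \eta \lambda)$ and then applies the usual gradient update, which is where the name "weight decay" comes from. For the L1 penalty, the extra term is instead $\eta \lambda \, \operatorname{sign}(w)$ (a subgradient), a constant-size pull toward zero.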
