Convolutional Neural Networks (CNNs)

Convolutional Neural Network

A neural network architecture that uses convolutional layers with learnable filters to automatically extract local spatial features from grid-structured data (images, 1D sequences). Weight sharing across spatial positions makes convolutional layers translation-equivariant and parameter-efficient.

Core Components

  1. Convolutional layer: Applies learnable filters (kernels) across the input via a sliding window → produces feature maps
  2. Pooling layer: Downsamples feature maps (max-pool, average-pool) → reduces spatial dimensions, adds local invariance
  3. Fully connected layer: Maps the extracted features to the final classification/regression output

Key Properties

  • Translation equivariance: same filter applied everywhere → detects features regardless of position (pooling adds a degree of invariance)
  • Weight sharing: far fewer parameters than fully connected equivalent
  • Hierarchical features: early layers detect edges/textures, deeper layers detect complex patterns
  • Local connectivity: each neuron connects to a small receptive field, not the full input
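The weight-sharing point can be made concrete with a quick parameter count. The sizes below (32×32 grayscale input, sixteen 3×3 filters) are illustrative assumptions, not from the source:

```python
# Parameter count: conv layer vs. a fully connected layer producing
# the same number of output activations (illustrative sizes).
in_h, in_w = 32, 32          # 32x32 grayscale input
k = 3                        # 3x3 kernel
n_filters = 16

# Convolution: each filter's weights are shared across all positions.
conv_params = n_filters * (k * k * 1 + 1)        # weights + bias per filter

# Fully connected equivalent: one weight per (input pixel, output unit).
out_h, out_w = in_h - k + 1, in_w - k + 1        # 30x30 valid convolution
n_out = n_filters * out_h * out_w                # 14,400 output units
fc_params = (in_h * in_w) * n_out + n_out        # weights + biases

print(conv_params)           # 160
print(fc_params)             # 14,760,000
```

Even at this toy scale the fully connected version needs roughly 90,000× more parameters, which is why local connectivity plus weight sharing makes CNNs tractable on images.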

In RL Context

CNNs serve as the feature-extraction backbone for Deep Reinforcement Learning on visual inputs:

  • Deep Q-Network (DQN) uses CNNs to process raw Atari game frames
  • The CNN maps pixels → learned state representation → fed to value/policy heads
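The pixels-to-representation pipeline is easy to trace with the standard output-size formula for a convolution without padding, out = (in − kernel) // stride + 1, applied to the conv stack of the Nature DQN (Mnih et al., 2015) on stacked 84×84×4 Atari frames:

```python
def conv_out(size, kernel, stride):
    """Spatial output size of a valid (no-padding) convolution."""
    return (size - kernel) // stride + 1

# Nature DQN conv stack: (kernel, stride, filters) per layer.
layers = [(8, 4, 32), (4, 2, 64), (3, 1, 64)]

size, channels = 84, 4           # 84x84 frames, 4 stacked for motion
for k, s, f in layers:
    size, channels = conv_out(size, k, s), f
    print(f"{channels} x {size} x {size}")   # 32x20x20 -> 64x9x9 -> 64x7x7

flat = channels * size * size    # flattened features fed to the value head
print(flat)                      # 3136, then FC -> 512 -> one Q-value per action
```

The final 3136-dimensional vector is the learned state representation; the value/policy heads mentioned above are just fully connected layers on top of it.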

In IR Context

Connections

Appears In