Convolutional Neural Networks (CNNs)

Convolutional Neural Network

A neural network architecture that uses convolutional layers with learnable filters to automatically extract local spatial features from grid-structured data (images, 1D sequences). Weight sharing across spatial positions makes convolutional layers translation-equivariant (a shifted input produces a correspondingly shifted feature map) and parameter-efficient; pooling then adds a degree of translation invariance.

Core Components

Convolutional layer: Applies learnable filters (kernels) across input via sliding window → produces feature maps
Pooling layer: Downsamples feature maps (max-pool, average-pool) → reduces spatial dimensions, adds invariance to small shifts
Fully connected layer: Final classification/regression after feature extraction
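
The three components above can be sketched in plain Python. This is a minimal illustration, not a practical implementation: the toy image, the hand-picked edge kernel, and the dense-layer weights are all hypothetical (in a real CNN the kernel and weights are learned), and the function names are chosen here for readability.

```python
def conv2d(image, kernel, stride=1):
    """Valid (no padding) 2D cross-correlation of one channel with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    return [
        [
            sum(
                image[i * stride + di][j * stride + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(
                fmap[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def fully_connected(features, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, features)) + b for row, b in zip(weights, bias)]


# Toy 5x5 image with a vertical edge between columns 1 and 2 (hypothetical data).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked 2x2 vertical-edge kernel; a trained CNN would learn these values.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)        # 4x4 feature map, peaks at the edge
pooled = max_pool2d(feature_map)           # 2x2 after pooling
flat = [v for row in pooled for v in row]  # flatten for the dense layer
scores = fully_connected(flat, weights=[[1, 1, 1, 1], [-1, -1, -1, -1]], bias=[0, 0])
```

The feature map is nonzero only in the column where the edge sits, pooling halves each spatial dimension while keeping that response, and the dense layer turns the flattened features into two class scores.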

Key Properties

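Translation equivariance: same filter applied everywhere → detects features regardless of position (a shifted input gives a shifted feature map; pooling turns this into approximate invariance)
Weight sharing: far fewer parameters than an equivalent fully connected layer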
Hierarchical features: early layers detect edges/textures, deeper layers detect complex patterns
Local connectivity: each neuron connects to a small receptive field, not the full input
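
Two of the properties above can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical 32x32 RGB input mapped to 16 output channels of the same spatial size, and a small 1D signal to show how a shift in the input carries through to the output; all names and sizes are illustrative.

```python
def conv_params(kernel, c_in, c_out):
    """Parameters of a conv layer: one kernel x kernel x c_in filter per output channel, plus biases."""
    return kernel * kernel * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Parameters of a fully connected layer: one weight per input-output pair, plus biases."""
    return n_in * n_out + n_out


# Weight sharing: 3x3 conv vs a dense layer producing the same 32x32x16 output.
conv_count = conv_params(kernel=3, c_in=3, c_out=16)
dense_count = dense_params(n_in=32 * 32 * 3, n_out=32 * 32 * 16)


def conv1d(signal, kernel):
    """Valid 1D cross-correlation with a single shared kernel."""
    k = len(kernel)
    return [
        sum(signal[i + d] * kernel[d] for d in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Equivariance: the same pattern, moved two steps to the right.
signal = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0]
kernel = [1, 2, 1]

out = conv1d(signal, kernel)
out_shifted = conv1d(shifted, kernel)

# The response moves with the input (equivariance) ...
responses_shift_together = out_shifted[2:] == out[:4]
# ... and a global max-pool over positions no longer depends on where it was (invariance).
pooled_equal = max(out) == max(out_shifted)
```

Under these assumed sizes the conv layer needs 448 parameters where the dense layer needs about 50 million, which is the practical meaning of weight sharing combined with local connectivity.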

In RL Context

CNNs serve as the feature extraction backbone for Deep Reinforcement Learning on visual inputs:

Deep Q-Network (DQN) uses CNNs to process raw Atari game frames
The CNN maps pixels → learned state representation → fed to value/policy heads
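
To make the pixels → state representation step concrete, the sketch below traces feature-map shapes through a DQN-style convolutional stack. The layer sizes (84x84 preprocessed frames, three unpadded conv layers) are an assumption taken from the commonly cited 2015 DQN architecture; they are not stated in these notes, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded convolution along one dimension."""
    return (size - kernel) // stride + 1


# (out_channels, kernel, stride) per layer; assumed values, see the note above.
LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]


def feature_shapes(height=84, width=84, layers=LAYERS):
    """Return (channels, height, width) after each conv layer."""
    shapes = []
    for channels, kernel, stride in layers:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        shapes.append((channels, height, width))
    return shapes


shapes = feature_shapes()
channels, h, w = shapes[-1]
# Length of the flattened vector passed on to the value/policy heads.
state_dim = channels * h * w

for layer, shape in enumerate(shapes, start=1):
    print(f"conv{layer}: {shape}")
print(f"flattened state representation: {state_dim} values")
```

The flattened vector is what the notes call the learned state representation; in a value-based agent it would typically pass through a dense layer that outputs one Q-value per action.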

In IR Context

Used in some early neural IR models for learning text representations from character/word n-grams
Largely superseded by Transformers in modern Neural Reranking and Dense Retrieval
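
For the n-gram based models mentioned above, the core idea can be sketched as a 1D convolution over consecutive token embeddings followed by max-over-time pooling. Everything here is hypothetical: the two-dimensional embeddings, the hand-set bigram filter, and the function name are made up for illustration and do not correspond to any specific published IR model.

```python
# Toy 2-dimensional word embeddings (hypothetical values).
EMBED = {
    "deep": [1.0, 0.0],
    "learning": [0.0, 1.0],
    "for": [0.0, 0.0],
    "search": [1.0, 1.0],
}


def ngram_features(tokens, filt):
    """Slide one filter over consecutive n-grams of token embeddings.

    filt has shape [n][embedding_dim]; each output is the filter's
    response to one window of n adjacent tokens.
    """
    n = len(filt)
    vecs = [EMBED[t] for t in tokens]
    return [
        sum(
            vecs[i + d][e] * filt[d][e]
            for d in range(n)
            for e in range(len(filt[d]))
        )
        for i in range(len(vecs) - n + 1)
    ]


# Hand-set bigram filter that responds most strongly to "deep learning";
# in a trained model the filter values would be learned.
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]

feats = ngram_features(["deep", "learning", "for", "search"], bigram_filter)
# Max-over-time pooling: one feature per filter, independent of text length.
text_feature = max(feats)
```

A model of this kind would use many such filters of several widths, giving a fixed-length text representation regardless of input length.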

Connections

Component of Deep Q-Network (DQN) and Deep Reinforcement Learning
A form of Neural Network Function Approximation in RL
Compared with Transformers (self-attention vs local convolution)

Appears In

RL-L08 - Deep RL Value-Based (DQN architecture)
RL-Book Ch16 - Applications and Case Studies (Atari)
