Neural Networks
Definition
Neural Network
A Neural Network is a parameterized function approximator composed of layers of linear transformations, each followed by a non-linear activation function. It is designed to learn complex, non-linear mappings from inputs to outputs via gradient-based optimization.
Mathematical Formulation
A network with a single hidden layer can be represented as:

$$\hat{y} = f_2\left(W_2\, f_1(W_1 x + b_1) + b_2\right)$$
where:
- $W_l, b_l$ — Weight matrix and bias vector of layer $l$
- $f_l$ — Non-linear activation function (e.g., ReLU, Sigmoid, Tanh)
- $x$ — Input vector
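The formulation above can be sketched directly in NumPy. The dimensions (3 inputs, 4 hidden units, 2 outputs) and the choice of ReLU for the hidden activation are illustrative assumptions, not fixed by the definition:

```python
import numpy as np

# Hypothetical dimensions: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)  # layer 1 parameters
W2, b2 = rng.standard_normal((2, 4)), np.zeros(2)  # layer 2 parameters

def relu(z):
    return np.maximum(0.0, z)

def forward(x):
    """Single-hidden-layer forward pass: W2 @ f(W1 @ x + b1) + b2."""
    h = relu(W1 @ x + b1)  # hidden activations
    return W2 @ h + b2     # linear output layer

y = forward(np.array([1.0, -2.0, 0.5]))
print(y.shape)  # (2,)
```

In practice the weights are initialized randomly (as here) and then updated by training; the forward pass itself is unchanged throughout.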
Universal Approximation Theorem
Universal Approximation
A feedforward network with a single hidden layer and a finite number of neurons can approximate any continuous function on compact subsets of $\mathbb{R}^n$, provided the activation function is non-constant, bounded, and monotonically increasing.
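As a toy illustration of this expressiveness (using ReLU rather than the bounded activation in the classical statement), a two-neuron hidden layer represents the non-smooth function $|x|$ exactly, since $|x| = \mathrm{ReLU}(x) + \mathrm{ReLU}(-x)$:

```python
import numpy as np

# Hand-constructed 2-neuron ReLU network computing |x| exactly:
# |x| = ReLU(x) + ReLU(-x).
W1 = np.array([[1.0], [-1.0]])  # hidden weights
W2 = np.array([[1.0, 1.0]])     # output weights

def net(x):
    h = np.maximum(0.0, W1 @ x)  # hidden layer
    return W2 @ h                # linear output layer

xs = np.array([[-3.0, -0.5, 0.0, 2.0]])
print(net(xs))  # [[3.  0.5 0.  2. ]]
```

More neurons yield more ReLU "kinks," which is the intuition behind approximating arbitrary continuous functions with piecewise-linear combinations.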
Training (Backpropagation)
Neural networks are trained using Stochastic Gradient Descent (SGD) or variants like Adam. The gradients are calculated using Backpropagation, which is an application of the Chain Rule of calculus to compute the partial derivatives of a loss function with respect to every weight in the network.
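The chain-rule computation can be written out by hand for the single-hidden-layer network above. This sketch assumes a tanh hidden layer and mean-squared-error loss (illustrative choices), and checks one analytic gradient against a finite-difference estimate:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical tiny network: 2 inputs -> 3 hidden (tanh) -> 1 output, MSE loss.
W1, b1 = rng.standard_normal((3, 2)), np.zeros(3)
W2, b2 = rng.standard_normal((1, 3)), np.zeros(1)
x, y_true = np.array([0.5, -1.0]), np.array([0.3])

# Forward pass, caching intermediates needed by the backward pass.
z1 = W1 @ x + b1
h = np.tanh(z1)
y = W2 @ h + b2
loss = 0.5 * np.sum((y - y_true) ** 2)

# Backward pass: chain rule applied layer by layer.
dy = y - y_true            # dL/dy
dW2 = np.outer(dy, h)      # dL/dW2
db2 = dy                   # dL/db2
dh = W2.T @ dy             # propagate gradient through W2
dz1 = dh * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
dW1 = np.outer(dz1, x)     # dL/dW1
db1 = dz1                  # dL/db1

# Sanity check: compare one entry of dW1 to a finite-difference estimate.
eps = 1e-6
W1[0, 0] += eps
loss_plus = 0.5 * np.sum((W2 @ np.tanh(W1 @ x + b1) + b2 - y_true) ** 2)
W1[0, 0] -= eps
numeric = (loss_plus - loss) / eps
print(abs(numeric - dW1[0, 0]) < 1e-4)  # True
```

An SGD step then subtracts the learning rate times each gradient from the corresponding parameter; autodiff frameworks automate exactly this backward pass.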
Key Concepts
- Activation Functions: Introduce non-linearity (e.g., $\mathrm{ReLU}(x) = \max(0, x)$, $\sigma(x) = \frac{1}{1 + e^{-x}}$).
- Layers: Layers between the input and output are “hidden layers”; deep networks stack many of them.
- Weights: The parameters that the model “learns.”
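The first point above is worth making concrete: without a non-linear activation, stacked layers collapse into a single linear map, so depth adds no expressive power. A minimal numerical check (with arbitrary assumed shapes):

```python
import numpy as np

rng = np.random.default_rng(2)
W1, W2 = rng.standard_normal((4, 3)), rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Without an activation between them, two layers equal one linear layer W2 @ W1.
two_layer = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
print(np.allclose(two_layer, collapsed))  # True
```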
Connections
- Used in: Deep Reinforcement Learning
- Optimization: Gradient Descent, Adam, RMSProp
- Prevention of Overfitting: Regularization
- Variants: Convolutional Neural Networks (for vision), Transformers (for sequences)