Training Neural Networks

Training a neural network is the process of finding values for its Weights and Biases that make its predictions as accurate as possible.

The Training Loop

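Training happens in repeated iterations. One full pass over the training data is called an Epoch, and a single epoch can contain many iterations when the data is split into batches. Every iteration involves three major steps: Forward Propagation, measuring the Loss Function, and Backpropagation.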

1. Forward Propagation

The raw data is fed into the input layer. It passes forward through the hidden layers, where each layer multiplies its inputs by weights, adds biases, and applies an activation function, until a final prediction comes out of the output layer.
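
The forward pass can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the 2-2-1 layer sizes, the sigmoid activation, and every weight and bias value below are assumptions chosen for the example, and the helper names `dense` and `forward` are hypothetical.

```python
import math

def sigmoid(z):
    """Activation function (assumed here): squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Pass x through each (weights, biases) pair in order."""
    activation = x
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation

# Hypothetical 2-2-1 network with hand-picked parameters.
layers = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1]),  # hidden layer: 2 neurons
    ([[1.0, -1.0]], [0.0]),                   # output layer: 1 neuron
]
prediction = forward([1.0, 2.0], layers)  # a list holding one value in (0, 1)
```

Each row of a weight matrix belongs to one neuron, so the output of one layer has as many values as that layer has rows.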

2. The Loss Function

Initially, the network's weights are usually set to random values, so its first predictions tend to be poor. The Loss Function measures how wrong a prediction is compared to the correct answer (the label) and expresses that error as a single number.

If the model predicted 0.2 and the actual answer was 1.0, the Loss is high; a prediction of 0.9 would give a much lower Loss.
The goal of training is to push this Loss as close to 0 as practical.
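
To make "high" and "low" concrete, here is a small sketch using squared error. The text does not say which loss function is meant, so squared error is an assumption; it is one common choice for numeric targets, and the 0.9 prediction is added only for contrast.

```python
def squared_error(prediction, label):
    """Loss for one example: the squared distance between prediction and label."""
    return (prediction - label) ** 2

def mean_squared_error(predictions, labels):
    """Average squared error over a batch of examples."""
    total = sum(squared_error(p, y) for p, y in zip(predictions, labels))
    return total / len(labels)

far = squared_error(0.2, 1.0)   # roughly 0.64: the prediction is far from the label
near = squared_error(0.9, 1.0)  # roughly 0.01: the prediction is close to the label
```

A perfect prediction gives a loss of exactly 0, which is why training aims to push the value toward 0.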

3. Backpropagation and Gradient Descent

This is the step where the network actually learns.

Backpropagation: The algorithm works backward through the network, using the Chain Rule from calculus to compute the gradient of the Loss with respect to each weight and bias, that is, how much a small change in that parameter would change the Loss.
Gradient Descent: The optimizer then moves every weight and bias a small step in the direction opposite to its gradient, with the step size set by the learning rate.
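
The two steps can be traced by hand for a single neuron. The sketch below assumes a sigmoid activation and a squared-error loss, and the starting values, training example, and `learning_rate` of 0.1 are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_for(w, b, x, y):
    """Squared error of a single sigmoid neuron on one example."""
    return (sigmoid(w * x + b) - y) ** 2

def gradients(w, b, x, y):
    """Chain rule: dL/dw = dL/da * da/dz * dz/dw, and likewise for b."""
    a = sigmoid(w * x + b)
    dL_da = 2.0 * (a - y)    # derivative of the squared error
    da_dz = a * (1.0 - a)    # derivative of the sigmoid
    dL_dz = dL_da * da_dz
    return dL_dz * x, dL_dz  # dz/dw = x and dz/db = 1

# Hypothetical starting parameters and a single training example.
w, b, x, y = 0.5, 0.0, 1.5, 1.0
learning_rate = 0.1

loss_before = loss_for(w, b, x, y)
grad_w, grad_b = gradients(w, b, x, y)

# Gradient descent: step against the gradient, scaled by the learning rate.
w_new = w - learning_rate * grad_w
b_new = b - learning_rate * grad_b
loss_after = loss_for(w_new, b_new, x, y)  # smaller than loss_before
```

In a multi-layer network the same chain-rule product is extended one factor per layer, which is why the computation naturally runs from the output back toward the input.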

This loop repeats thousands or millions of times until the Loss stops improving meaningfully. This is called convergence, and it does not guarantee the lowest possible Loss. At that point, the model is considered "Trained".
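
Putting the three steps together, the whole loop might look like the sketch below. It is deliberately tiny: a linear model `y = w * x + b` stands in for a full network, the data, `epochs`, and `learning_rate` values are assumptions, and every epoch processes the whole dataset at once, so here one epoch equals one update.

```python
def train(data, epochs=2000, learning_rate=0.05):
    """Fit y = w * x + b with plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # fixed start for reproducibility; real networks usually start random
    history = []
    n = len(data)
    for _ in range(epochs):
        # 1. Forward propagation: compute a prediction for every example.
        predictions = [w * x + b for x, _ in data]
        # 2. Loss: mean squared error between predictions and labels.
        errors = [p - y for p, (_, y) in zip(predictions, data)]
        history.append(sum(e * e for e in errors) / n)
        # 3. Backpropagation: gradient of the loss for each parameter.
        grad_w = sum(2.0 * e * x for e, (x, _) in zip(errors, data)) / n
        grad_b = sum(2.0 * e for e in errors) / n
        # Gradient descent: move each parameter against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history

# Hypothetical training set generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b, history = train(data)  # w approaches 2 and b approaches 1
```

The `history` list records the loss at every epoch, so plotting or printing it is a simple way to check whether training has flattened out.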
