Neural Network Architecture

Deep Learning is powered by Artificial Neural Networks (ANNs), which are loosely inspired by the way biological neurons in the human brain connect and signal to one another.

The Neuron (Perceptron)

The fundamental building block of a neural network is the Neuron (or Node). A single neuron takes in numerical inputs, multiplies each one by its own Weight, sums the results, adds a Bias, and passes that sum through an Activation Function.

  • Weights: Determine how much each input contributes to the neuron's output.
  • Biases: Shift the point at which the neuron activates, giving the network extra flexibility to fit the data.
  • Activation Function: Introduces non-linearity to the network. Without a non-linear activation function such as ReLU or Sigmoid, any stack of layers would collapse into a single linear transformation, no more expressive than a linear model, and the network could not learn complex patterns such as recognizing a cat in an image.
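
The computation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (neuron, relu, sigmoid, identity, stacked, single) and all numeric values are assumptions chosen for the example. The second half shows, for a scalar input, why two stacked neurons without a non-linear activation are equivalent to one linear neuron.

```python
import math


def relu(z):
    """ReLU activation: passes positive values, clamps negatives to 0."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    """No activation at all (used only to show the linear collapse)."""
    return z


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs, plus the bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Illustrative values (assumptions, not taken from any dataset).
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.25]
bias = 0.5

relu_out = neuron(inputs, weights, bias, relu)        # weighted sum is -0.25
sigmoid_out = neuron(inputs, weights, bias, sigmoid)

# Two stacked neurons with no non-linearity...
w1, b1, w2, b2 = 2.0, 1.0, -3.0, 0.5


def stacked(x):
    return neuron([neuron([x], [w1], b1, identity)], [w2], b2, identity)


# ...compute the same function as one neuron with merged weight and bias.
def single(x):
    return neuron([x], [w2 * w1], w2 * b1 + b2, identity)
```

Because the weighted sum here is negative, ReLU outputs zero while Sigmoid outputs a value a little below 0.5. Swapping identity for relu inside stacked breaks the equivalence with single, which is the extra expressive power a non-linear activation provides.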

The Architectures

When you connect thousands of neurons together, you get a neural network.

Layers

  1. Input Layer: Receives the raw data (e.g., the raw pixels of an image).
  2. Hidden Layers: Intermediate processing layers between the input and the output, where the actual "deep" learning occurs. A network is generally considered "Deep" if it has more than one Hidden Layer.
  3. Output Layer: Produces the final prediction (e.g., "95% probability that this image is a dog").
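
As a rough sketch of how data flows through the three kinds of layers, the following plain-Python example pushes one input through a single hidden layer and an output layer. The helper dense_layer, the layer sizes, and every weight and bias value are hypothetical, picked only so the arithmetic is easy to follow; a trained network would learn these values from data.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """Compute one layer's outputs.

    weights[j] holds the incoming weights of neuron j, so each neuron
    sees every value in `inputs`.
    """
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


# Input layer: three raw feature values (e.g. three pixel intensities).
pixels = [0.0, 0.5, 1.0]

# Hidden layer: two neurons with ReLU (hypothetical weights).
hidden_w = [[1.0, 1.0, -2.0],
            [0.5, 0.5, 0.5]]
hidden_b = [0.0, -0.25]

# Output layer: one neuron with Sigmoid, read here as a probability.
output_w = [[1.0, 2.0]]
output_b = [-0.5]

hidden = dense_layer(pixels, hidden_w, hidden_b, relu)
output = dense_layer(hidden, output_w, output_b, sigmoid)

print("hidden activations:", hidden)
print("predicted probability:", round(output[0], 4))
```

Adding more hidden layers would simply mean calling dense_layer again on the previous layer's output before the final output layer.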

Fully Connected Networks (Dense Networks)

In a standard dense network, every neuron in one layer connects to every neuron in the next layer. The number of weights between two layers is therefore the product of their sizes, so the parameter count grows quadratically (not exponentially) as layers get wider. This is one reason Deep Learning is so computationally expensive.
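
To make the cost concrete, the parameters of a dense network can be counted directly: each pair of adjacent layers contributes one weight per connection plus one bias per receiving neuron. The sketch below assumes a hypothetical helper dense_param_count and example layer sizes (784 inputs would correspond to a 28x28 image); neither comes from a specific library or model.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons per layer, input first.
    Each adjacent pair (n_in, n_out) adds n_in * n_out weights
    and n_out biases.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total


# A small image classifier: 784 inputs, 128 hidden neurons, 10 outputs.
small = dense_param_count([784, 128, 10])

# Doubling the width of every layer roughly quadruples the parameters.
base = dense_param_count([1000, 1000, 1000])
doubled = dense_param_count([2000, 2000, 2000])

print(small)    # 101770
print(base)     # 2002000
print(doubled)  # 8004000
```

Even the small example needs over one hundred thousand parameters, and every one of them is touched on each forward and backward pass during training.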