Backpropagation is a fundamental algorithm used to train deep neural networks. It allows the network to learn by adjusting weights based on the error between predicted and actual outputs. Understanding its theory and implementation is essential for developing effective machine learning models.
Theory of Backpropagation
Backpropagation involves computing the gradient of the loss function with respect to each weight in the network. This process uses the chain rule of calculus to propagate errors backward from the output layer to the input layer. The gradients are then used to update the weights to minimize the loss.
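As a concrete instance of the chain rule, consider a single weight $w$ feeding a sigmoid output unit trained with binary cross-entropy (the same setting as the code example later in this section). The gradient factors as:

```latex
z = w x, \qquad a = \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
L = -\bigl[\, y \log a + (1 - y)\log(1 - a) \,\bigr]

\frac{\partial L}{\partial w}
  = \frac{\partial L}{\partial a}\,\frac{\partial a}{\partial z}\,\frac{\partial z}{\partial w}
  = \frac{a - y}{a(1 - a)} \cdot a(1 - a) \cdot x
  = (a - y)\,x
```

The $a(1 - a)$ factors cancel, which is why the backward pass for this loss-activation pair reduces to the simple error term $a - y$.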
Steps in Backpropagation
- Forward pass: Calculate the output of the network for a given input.
- Compute loss: Measure the difference between predicted and actual values.
- Backward pass: Propagate the error backward to compute gradients.
- Update weights: Adjust weights using the gradients and a learning rate.
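The four steps above can be traced by hand for a single weight and a single training example. The numbers below (`x = 0.5`, `w = 0.4`, `y = 1`) are illustrative values chosen for this sketch, not taken from any dataset:

```python
import math

# One training example with a single weight (illustrative values)
x, w, y = 0.5, 0.4, 1.0
learning_rate = 0.1

# 1. Forward pass
z = w * x                    # weighted input: 0.2
a = 1 / (1 + math.exp(-z))   # sigmoid activation, about 0.55

# 2. Compute loss (binary cross-entropy)
loss = -(y * math.log(a) + (1 - y) * math.log(1 - a))

# 3. Backward pass: dL/dw = (a - y) * x for a sigmoid unit
#    trained with cross-entropy loss
dw = (a - y) * x

# 4. Update the weight; since y = 1 and a < 1, dw is negative,
#    so the update increases w and pushes a toward 1
w -= learning_rate * dw
```

Repeating these four steps over many examples and epochs is exactly what the training loop in the next section does, just vectorized with NumPy.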
Implementing Backpropagation in Code
Implementing backpropagation requires defining functions for forward propagation, loss calculation, and gradient computation. Typically, frameworks like TensorFlow or PyTorch automate much of this process, but understanding the underlying code helps in customizing training routines.
Sample Code Snippet
Below is a simplified example of backpropagation in Python for a single-layer neural network (a single sigmoid unit trained by gradient descent):
Note: This example is for educational purposes and omits many practical considerations.
```python
import numpy as np

# Initialize weights and data
np.random.seed(0)  # for reproducibility
w = np.random.randn(2, 1)
X = np.array([[0.5, -0.2], [0.3, 0.8]])
y = np.array([[1], [0]])
learning_rate = 0.1

for epoch in range(1000):
    # Forward pass
    z = np.dot(X, w)
    a = 1 / (1 + np.exp(-z))  # Sigmoid activation
    # Compute loss (binary cross-entropy)
    loss = -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))
    # Backward pass: gradient of the loss with respect to w
    dz = a - y
    dw = np.dot(X.T, dz) / X.shape[0]
    # Update weights
    w -= learning_rate * dw
```
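A useful habit when hand-writing gradients like `dw` above is to verify them against a numerical estimate. The sketch below (an addition for illustration, not part of the original example) compares the analytic gradient with central finite differences:

```python
import numpy as np

def bce_loss(w, X, y):
    """Binary cross-entropy of a sigmoid unit with weights w."""
    a = 1 / (1 + np.exp(-np.dot(X, w)))
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

np.random.seed(1)
w = np.random.randn(2, 1)
X = np.array([[0.5, -0.2], [0.3, 0.8]])
y = np.array([[1], [0]])

# Analytic gradient (same formula as in the training loop)
a = 1 / (1 + np.exp(-np.dot(X, w)))
dw = np.dot(X.T, a - y) / X.shape[0]

# Numerical gradient via central differences
eps = 1e-6
num_dw = np.zeros_like(w)
for i in range(w.shape[0]):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[i] += eps
    w_minus[i] -= eps
    num_dw[i] = (bce_loss(w_plus, X, y) - bce_loss(w_minus, X, y)) / (2 * eps)

print(np.max(np.abs(dw - num_dw)))  # maximum discrepancy; should be tiny
```

If the two gradients disagree by more than a small tolerance, the backward pass has a bug. Frameworks such as PyTorch ship similar utilities for this purpose.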