Backpropagation in Neural Network (original) (raw)

Last Updated : 12 May, 2026

Backpropagation is an algorithm that trains neural networks by reducing prediction error. It works by propagating errors backward, computing gradients using the chain rule, and updating weights and biases to improve performance.

Backpropagation-in-Neural-Network-1

Fig(a) A simple illustration of how the backpropagation works by adjustments of weights

Working of Back Propagation Algorithm

The Back Propagation algorithm involves two main steps:

1. Forward Pass Work

In forward pass, input data moves through the network to generate an output.

Backpropagation-in-Neural-Network-2

The forward pass using weights and biases

2. Backward Pass

In this step, the error between predicted and actual output is propagated backward to update weights and biases. Error is calculated (e.g., using Mean Squared Error)

\text{MSE} = (\text{Predicted Output} - \text{Actual Output})^2

Example

Let’s walk through an example of Back Propagation in machine learning. Assume the neurons use the sigmoid activation function for the forward and backward pass. The target output is 0.5 and the learning rate is 1.

Backpropagation-in-Neural-Network-3

Example (1) of backpropagation sum

Forward Propagation

1. Initial Calculation

The weighted sum at each node is calculated using:

a j​ =∑(w i​ ,j∗x i​ )

Where,

**O (output): After applying the activation function to a, we get the output of the neuron:

o_j = activation function(a_j )

2. Sigmoid Function

The sigmoid function returns a value between 0 and 1, introducing non-linearity into the model.

y_j = \frac{1}{1+e^{-a_j}}

Backpropagation-in-Neural-Network-4

To find the outputs of y3, y4 and y5

3. Computing Outputs

At h1 node

\begin {aligned}a_1 &= (w_{1,1} x_1) + (w_{2,1} x_2) \\& = (0.2 * 0.35) + (0.2* 0.7)\\&= 0.21\end {aligned}

Once we calculated the a1 value, we can now proceed to find the y3 value:

y_j= F(a_j) = \frac 1 {1+e^{-a_1}}

y_3 = F(0.21) = \frac 1 {1+e^{-0.21}}

y_3 = 0.56

Similarly find the values of y4 at **h 2 and y5 at O3

a_2 = (w_{1,2} * x_1) + (w_{2,2} * x_2) = (0.3*0.35)+(0.3*0.7)=0.315

y_4 = F(0.315) = \frac 1{1+e^{-0.315}}

a3 = (w_{1,3}*y_3)+(w_{2,3}*y_4) =(0.3*0.57)+(0.9*0.59) =0.702

y_5 = F(0.702) = \frac 1 {1+e^{-0.702} } = 0.67

Backpropagation-in-Neural-Network-5

Values of y3, y4 and y5

4. Error Calculation

Our actual output is 0.5 but we obtained 0.67__._ To calculate the error we can use the below formula:

Error_j= y_{target} - y_5

\text{=>} \space 0.5 - 0.67 = -0.17

Using this error value we will be backpropagating.

**Back Propagation

**1. Calculating Gradients

The change in each weight is calculated as:

\Delta w_{ij} = \eta \times \delta_j \times O_j

Where:

**2. Output Unit Error

For O3:

\delta_5 = y_5(1-y_5) (y_{target} - y_5)

= 0.67(1-0.67)(-0.17) = -0.0376

**3. Hidden Unit Error

For h1:

\delta_3 = y_3 (1-y_3)(w_{1,3} \times \delta_5)

= 0.56(1-0.56)(0.3 \times -0.0376) = -0.0027

For h2:

\delta_4 = y_4(1-y_4)(w_{2,3} \times \delta_5)

=0.59 (1-0.59)(0.9 \times -0.0376) = -0.00819

4. Weight Updates

For the weights from hidden to output layer:

\Delta w_{2,3} = 1 \times (-0.0376) \times 0.59 = -0.022184

New weight:

w_{2,3}(\text{new}) = -0.022184 + 0.9 = 0.877816

For weights from input to hidden layer:

\Delta w_{1,1} = 1 \times (-0.0027) \times 0.35 = -0.000945

New weight:

w_{1,1}(\text{new}) = - 0.000945 + 0.2 = 0.199055

Similarly other weights are updated:

The updated weights are illustrated below

backpropagation_in_neural_network_11

Through backward pass the weights are updated

After updating the weights the forward pass is repeated hence giving:

Since y_5 = 0.61 is still not the target output the process of calculating the error and backpropagating continues until the desired output is reached.

This process demonstrates how Back Propagation iteratively updates weights by minimizing errors until the network accurately predicts the output.

Error = y_{target} - y_5

= 0.5 - 0.61 = -0.11

This process is said to be continued until the actual output is gained by the neural network.

Back Propagation Implementation in Python for XOR Problem

This code demonstrates how Back Propagation is used in a neural network to solve the XOR problem. The neural network consists of:

1. Defining Neural Network

The network consists of 2 input neurons, 4 hidden neurons and 1 output neuron, using the sigmoid activation function.

import numpy as np

class NeuralNetwork: def init(self, input_size, hidden_size, output_size): self.input_size = input_size self.hidden_size = hidden_size self.output_size = output_size

    self.weights_input_hidden = np.random.randn(
        self.input_size, self.hidden_size)
    self.weights_hidden_output = np.random.randn(
        self.hidden_size, self.output_size)

    self.bias_hidden = np.zeros((1, self.hidden_size))
    self.bias_output = np.zeros((1, self.output_size))

def sigmoid(self, x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(self, x):
    return x * (1 - x)

`

2. Defining Feed Forward Network

In Forward pass inputs are passed through the network activating the hidden and output layers using the sigmoid function.

def feedforward(self, X): self.hidden_activation = np.dot( X, self.weights_input_hidden) + self.bias_hidden self.hidden_output = self.sigmoid(self.hidden_activation)

self.output_activation = np.dot(
    self.hidden_output, self.weights_hidden_output) + self.bias_output
self.predicted_output = self.sigmoid(self.output_activation)

return self.predicted_output

`

3. Defining Backward Network

In the backward pass, the network calculates the error and updates weights using gradients from the sigmoid function.

def backward(self, X, y, learning_rate): output_error = y - self.predicted_output output_delta = output_error *
self.sigmoid_derivative(self.predicted_output)

hidden_error = np.dot(output_delta, self.weights_hidden_output.T)
hidden_delta = hidden_error * self.sigmoid_derivative(self.hidden_output)

self.weights_hidden_output += np.dot(self.hidden_output.T,
                                     output_delta) * learning_rate
self.bias_output += np.sum(output_delta, axis=0,
                           keepdims=True) * learning_rate
self.weights_input_hidden += np.dot(X.T, hidden_delta) * learning_rate
self.bias_hidden += np.sum(hidden_delta, axis=0,
                           keepdims=True) * learning_rate

`

4. Training Network

The network is trained over 10,000 epochs using the Back Propagation algorithm with a learning rate of 0.1 progressively reducing the error.

def train(self, X, y, epochs, learning_rate): for epoch in range(epochs): output = self.feedforward(X) self.backward(X, y, learning_rate) if epoch % 4000 == 0: loss = np.mean(np.square(y - output)) print(f"Epoch {epoch}, Loss:{loss}")

`

5. Testing Neural Network

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]]) y = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1) nn.train(X, y, epochs=10000, learning_rate=0.1)

output = nn.feedforward(X) print("Predictions after training:") print(output)

`

**Output:

Screenshot-2025-03-07-130223

Trained Model

Download full code from here

Advantages

Challenges