Training and Validation Loss in Deep Learning

Last Updated : 27 Nov, 2025

Training loss measures how closely the model's predictions fit the data it is trained on. Validation loss measures how well the model performs on data held out from training, which helps detect overfitting.

Training Loss

Training Loss is a metric that measures how well a deep learning model is performing on the training dataset. During training the model makes predictions and compares them with the actual target values. The loss function then calculates the error between these predicted outputs and the true labels.

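Training loss is computed during the forward pass of each training step, and the backward pass then uses it to compute the gradients that update the weights. Averaged over N training samples, the training loss can be expressed as: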

\text{Loss} = \frac{1}{N} \sum_{i=1}^{N} L(y_i, \hat{y}_i)

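Where N is the number of training samples, L is the per-sample loss function, y_i is the true label of sample i and \hat{y}_i is the model's prediction for it.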

A low training loss means the model fits the training data closely, though on its own it does not show that the model generalizes. A training loss that stays high often indicates underfitting or difficulty in learning the patterns in the data.
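The averaging in the formula above can be sketched in a few lines of NumPy. This is only an illustration: the helper names `mean_loss` and `squared_error` are hypothetical, and squared error is used as one possible choice of L.

```python
import numpy as np

def mean_loss(y_true, y_pred, per_sample_loss):
    """Average a per-sample loss L(y_i, y_hat_i) over all N samples."""
    losses = [per_sample_loss(y, y_hat) for y, y_hat in zip(y_true, y_pred)]
    return float(np.mean(losses))

def squared_error(y, y_hat):
    # Hypothetical choice of L; any per-sample loss could be substituted.
    return (y - y_hat) ** 2

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

# Per-sample losses are 0.25, 0.0 and 1.0, so the mean is 1.25 / 3.
training_loss = mean_loss(y_true, y_pred, squared_error)
print(training_loss)
```

Frameworks such as Keras compute the same kind of average per batch and report a running mean over the epoch, so the number printed in a training log is an aggregate rather than a single-sample value.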

Validation Loss

Validation loss is a metric that evaluates a deep learning model's performance on a validation dataset, a set of data that is not used to update the model's weights during training. It is typically computed at the end of each epoch and can be expressed as:

\text{Validation Loss} = \frac{1}{M} \sum_{i=1}^{M} L(y_i^{\text{val}}, \hat{y}_i^{\text{val}})

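where M is the number of validation samples, y_i^{\text{val}} is the true label of validation sample i and \hat{y}_i^{\text{val}} is the model's prediction for it. L is the same loss function used for the training loss.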
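To make the distinction concrete, here is a minimal sketch that fits a straight line on one part of a synthetic dataset and evaluates the same loss on a held-out part. The data, the 80/20 split and the helper `mse` are assumptions made for illustration; they are not part of the CNN example in this article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = 3x + noise (illustrative only).
X = rng.uniform(-1.0, 1.0, size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(0.0, 0.1, size=100)

# Hold out the last 20 samples; they play no part in fitting.
X_train, y_train = X[:80], y[:80]
X_val, y_val = X[80:], y[80:]

# Fit slope and intercept by least squares on the training split only.
A_train = np.hstack([X_train, np.ones((80, 1))])
weights, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

def mse(X_part, y_part):
    """Mean squared error of the fitted line on one split."""
    A = np.hstack([X_part, np.ones((len(X_part), 1))])
    return float(np.mean((A @ weights - y_part) ** 2))

train_loss = mse(X_train, y_train)  # averaged over N = 80 samples
val_loss = mse(X_val, y_val)        # averaged over M = 20 samples
print(f"train={train_loss:.4f}  val={val_loss:.4f}")
```

Both numbers come from the same loss function; only the data differs. A validation loss far above the training loss would suggest the fitted model has memorized the training split.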

Importance of Monitoring Both Losses

Step-By-Step Implementation

Here we train a simple CNN on the Fashion MNIST dataset, monitor training and validation loss and plot the loss curves.

Step 1: Import Libraries

Here we will import TensorFlow, Keras and Matplotlib.

```python
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
```

Step 2: Load and Preprocess Dataset

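```python
# Load the 28x28 grayscale images and their integer class labels.
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

# Add a channel dimension and scale pixel values to [0, 1].
X_train = X_train.reshape(-1, 28, 28, 1) / 255.0
X_test = X_test.reshape(-1, 28, 28, 1) / 255.0

# One-hot encode the 10 class labels.
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
```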
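The effect of the reshaping, scaling and one-hot encoding above can be seen on a tiny stand-in array. The 2x2 "image" and the labels below are made up for illustration, and `np.eye(10)[labels]` is used here as a NumPy equivalent of one-hot encoding (the result may differ from `to_categorical` in dtype).

```python
import numpy as np

# Stand-in for a batch of raw images: uint8 pixels in [0, 255].
raw_images = np.array([[[0, 128], [255, 64]]], dtype=np.uint8)  # shape (1, 2, 2)
labels = np.array([0, 2, 9])

# Add a trailing channel axis and rescale pixels to [0, 1].
scaled = raw_images.reshape(-1, 2, 2, 1) / 255.0

# One-hot encode integer labels; row i has a 1 in column labels[i].
one_hot = np.eye(10)[labels]

print(scaled.shape)
print(one_hot[1])
```

One-hot labels are what `categorical_crossentropy` expects; with integer labels left as they are, `sparse_categorical_crossentropy` would be the matching Keras loss.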

Step 3: Build CNN Model

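```python
model = Sequential([
    # Two convolution + pooling stages extract spatial features.
    Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    # Dense classifier head with dropout for regularization.
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])
```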

Step 4: Compile Model

```python
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

Step 5: Train Model with Validation Split

```python
# validation_split=0.2 holds out 20% of the training data for validation.
history = model.fit(X_train, y_train,
                    epochs=15,
                    batch_size=64,
                    validation_split=0.2)
```
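The Keras documentation states that `validation_split` takes the validation data from the last samples of the arrays passed to `fit`, before any shuffling. The sketch below mimics that behaviour on a toy array; the function `split_like_validation_split` is hypothetical, and the exact rounding of the split index is an assumption rather than a copy of the Keras source.

```python
import numpy as np

def split_like_validation_split(x, y, validation_split):
    """Hold out the final fraction of samples, with no shuffling first."""
    # Assumed rounding: truncate the index of the first held-out sample.
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

x = np.arange(10)
y = np.arange(10) * 2
(x_tr, y_tr), (x_val, y_val) = split_like_validation_split(x, y, 0.2)

print(x_tr)   # the first 8 samples are used for training
print(x_val)  # the last 2 samples are held out
```

Because the held-out samples come from the tail of the array, a dataset that is sorted by class should be shuffled before calling `fit`, or a separately prepared set can be passed through the `validation_data` argument instead.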

**Output:**

[Figure: Training output]

Step 6: Plot Training and Validation Loss

```python
plt.figure(figsize=(8,5))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Training vs Validation Loss (CNN)')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
```

**Output:**

[Figure: Training vs Validation]

The graph shows that both training and validation loss decrease steadily over the epochs, indicating effective learning. The gap between the curves remains small, which suggests the model is not overfitting and generalizes well to unseen data.
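The same reading of the curves can be done numerically. The helper below, `summarize_losses`, is a hypothetical sketch that reports the epoch with the lowest validation loss and the final gap between the two losses. The loss values are made up to show the opposite case, where validation loss starts rising while training loss keeps falling; they are not results from the model above.

```python
def summarize_losses(train_losses, val_losses):
    """Return the best epoch (1-based), its validation loss, and the final gap."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return {
        "best_epoch": best_index + 1,
        "best_val_loss": val_losses[best_index],
        "final_gap": val_losses[-1] - train_losses[-1],
    }

# Made-up loss values for illustration only.
train = [0.60, 0.40, 0.30, 0.22, 0.15]
val = [0.55, 0.42, 0.36, 0.38, 0.45]

summary = summarize_losses(train, val)
print(summary)
```

With a Keras `History` object, the two lists would be `history.history['loss']` and `history.history['val_loss']`. A best epoch well before the last one, together with a growing gap, is the usual sign of overfitting.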


Loss Functions for Model Training

  1. **Mean Squared Error:** MSE measures the average squared difference between predicted and actual values in regression tasks. Squaring penalizes large errors heavily.
  2. **Mean Absolute Error:** MAE calculates the average absolute difference between predicted outputs and true values, so every error contributes in proportion to its size.
  3. **Cross-Entropy Loss:** Cross-Entropy Loss measures how well the predicted probability distribution matches the true class labels in classification tasks.
  4. **Huber Loss:** Huber Loss is quadratic like MSE for small errors and linear like MAE for large ones, which makes it less sensitive to outliers than MSE.
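The four losses listed above can be written out in NumPy to show how they differ on the same predictions. This is a minimal sketch: the function names, the clipping constant `eps` and the default `delta=1.0` are illustrative choices, not the definitions used internally by any particular library.

```python
import numpy as np

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def categorical_cross_entropy(y_true_one_hot, y_prob, eps=1e-12):
    # Clip to avoid log(0); eps is an illustrative choice.
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(np.mean(-np.sum(y_true_one_hot * np.log(y_prob), axis=1)))

def huber(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    abs_error = np.abs(error)
    quadratic = 0.5 * error ** 2                # used where |error| <= delta
    linear = delta * (abs_error - 0.5 * delta)  # used where |error| > delta
    return float(np.mean(np.where(abs_error <= delta, quadratic, linear)))

# Regression example: the last prediction is an outlier (error of 3).
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.5, 6.0])
print(mse(y_true, y_pred), mae(y_true, y_pred), huber(y_true, y_pred))

# Classification example: two samples, two classes.
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
print(categorical_cross_entropy(labels, probs))
```

On the regression example, the single outlier dominates MSE, while MAE and Huber grow only linearly with it. That difference is the reason Huber loss is often preferred when the targets contain occasional extreme values.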

Patterns in Loss Curves

Factors That Affect Loss Values

Several factors influence how training and validation loss behave during model training.

Techniques to Reduce Training and Validation Loss