MNIST Dataset: Practical Applications Using Keras and PyTorch

Last Updated : 7 Feb, 2026

The MNIST dataset is a widely used benchmark in machine learning for handwritten digit recognition. It contains 70,000 size-normalized 28×28 grayscale images of the digits 0 to 9 (60,000 for training and 10,000 for testing), derived from the original NIST datasets, which makes it convenient for research and experimentation.

How the MNIST Dataset Was Created

MNIST in Machine Learning

The MNIST dataset holds significant value in the field of machine learning for multiple reasons:

Methods to load MNIST dataset in Python

MNIST can be loaded in Python in several ways, depending on the libraries you prefer. The sections that follow cover two of the most common options: TensorFlow/Keras and PyTorch.
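For context, the original MNIST files are distributed in the IDX binary format, which libraries normally parse for you. The following is a minimal, library-free sketch of that format for unsigned-byte data only. The helper name `parse_idx` and the synthetic byte string are illustrative assumptions, not part of any library.

```python
import struct

def parse_idx(data: bytes):
    """Parse an IDX byte string into (dims, flat list of unsigned bytes)."""
    # Header: two zero bytes, a data-type code, and the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("this sketch only handles unsigned-byte IDX data")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(f">{ndim}I", data[4:4 + 4 * ndim])
    payload = data[4 + 4 * ndim:]
    expected = 1
    for d in dims:
        expected *= d
    if len(payload) != expected:
        raise ValueError("payload size does not match the header dimensions")
    return dims, list(payload)

# Synthetic stand-in for a label file that holds three labels: 5, 0, 4.
fake_labels = struct.pack(">HBBI", 0, 0x08, 1, 3) + bytes([5, 0, 4])
dims, values = parse_idx(fake_labels)
print(dims, values)  # (3,) [5, 0, 4]
```

The real files are gzip-compressed, so in practice the bytes would first be read with something like `gzip.open(path, "rb").read()`, where `path` points to a locally downloaded file such as `train-labels-idx1-ubyte.gz`.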

Loading MNIST dataset using TensorFlow/Keras

This code loads the MNIST dataset using TensorFlow/Keras, normalizes the images, prints the dataset shapes, and displays the first four training images with their labels.

```python
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt

# Load the training and test splits as NumPy arrays
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1]
X_train = X_train / 255.0
X_test = X_test / 255.0

print("Training data shape:", X_train.shape)
print("Testing data shape:", X_test.shape)

# Show the first four training images with their labels
plt.figure(figsize=(10, 3))
for i in range(4):
    plt.subplot(1, 4, i + 1)
    plt.imshow(X_train[i], cmap="gray")
    plt.title(f"Label: {y_train[i]}")
    plt.axis("off")

plt.tight_layout()
plt.show()
```

**Output:**

Figure: Sample images from the MNIST training dataset
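Before the arrays are passed to a model, they are often reshaped and the labels one-hot encoded. The sketch below illustrates those steps with NumPy only. It uses a small synthetic array with MNIST's layout in place of the real data, so the names `X`, `y`, `X_scaled`, and `y_onehot` are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in with MNIST's layout: uint8 images of shape (N, 28, 28).
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)
y = np.array([5, 0, 4, 1, 9, 2, 1, 3])

# Scale pixels to [0, 1] and add a trailing channel axis, the layout
# that convolutional layers in Keras expect by default.
X_scaled = (X.astype("float32") / 255.0)[..., np.newaxis]

# One-hot encode the labels: row i has a 1 in column y[i].
y_onehot = np.eye(10, dtype="float32")[y]

print(X_scaled.shape, y_onehot.shape)  # (8, 28, 28, 1) (8, 10)
```

Casting to `float32` before dividing keeps memory use at half of what the default `float64` result of `X / 255.0` would need.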

Loading MNIST dataset Using PyTorch

This code shows how to load the MNIST handwritten digit dataset using PyTorch and visualize a few sample images. It helps in understanding how images and labels are accessed through a DataLoader before training a model.

```python
import matplotlib.pyplot as plt
from torchvision import datasets, transforms
from torch.utils.data import DataLoader


def load_mnist(batch_size=5):
    # ToTensor converts each image to a float tensor scaled to [0, 1]
    transform = transforms.ToTensor()
    dataset = datasets.MNIST(
        root="./data", train=True, download=True, transform=transform
    )
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)


def visualize_samples(dataloader, num_samples=5):
    # Take one batch of images and labels from the loader
    images, labels = next(iter(dataloader))

    plt.figure(figsize=(15, 3))
    for i in range(num_samples):
        plt.subplot(1, num_samples, i + 1)
        # squeeze() drops the channel axis: (1, 28, 28) -> (28, 28)
        plt.imshow(images[i].squeeze(), cmap="gray")
        plt.title(f"Label: {labels[i].item()}")
        plt.axis("off")

    plt.tight_layout()
    plt.show()


train_loader = load_mnist(batch_size=5)
visualize_samples(train_loader)
```

Figure: MNIST digit samples
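To make the batching behaviour more concrete, here is a rough NumPy sketch of what a shuffled loader yields: mini-batches of shape (batch, 1, 28, 28) with pixel values in [0, 1], the last batch possibly smaller. It is only an illustration and not how PyTorch implements `DataLoader`. The function `iterate_batches` and the synthetic arrays are hypothetical.

```python
import numpy as np

def iterate_batches(images, labels, batch_size, seed=0):
    """Yield shuffled (images, labels) mini-batches; the last may be smaller."""
    order = np.random.default_rng(seed).permutation(len(images))
    for start in range(0, len(images), batch_size):
        idx = order[start:start + batch_size]
        # Mimic ToTensor: uint8 (H, W) -> float32 (1, H, W) scaled to [0, 1].
        batch = images[idx].astype("float32")[:, np.newaxis, :, :] / 255.0
        yield batch, labels[idx]

# Synthetic stand-in for the dataset: 12 blank 28x28 images.
images = np.zeros((12, 28, 28), dtype=np.uint8)
labels = np.arange(12) % 10

for batch_images, batch_labels in iterate_batches(images, labels, batch_size=5):
    print(batch_images.shape, batch_labels.shape)
# (5, 1, 28, 28) (5,)
# (5, 1, 28, 28) (5,)
# (2, 1, 28, 28) (2,)
```

The extra axis of size 1 is the channel dimension, which is why the plotting code calls `squeeze()` before passing an image to `imshow`.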


Variants of the MNIST Dataset

  1. **Fashion-MNIST:** A dataset of 28×28 grayscale images in 10 clothing and accessory categories, designed as a more challenging drop-in replacement for the handwritten digits.
  2. **3D MNIST:** A 3D version of MNIST in which the digits are represented as point clouds, suitable for introducing 3D vision tasks.
  3. **EMNIST:** A dataset of handwritten letters (and digits) with the same structure as MNIST, used for character recognition.
  4. **Sign Language MNIST:** Images of hand gestures representing letters of the English alphabet, used for sign language and gesture recognition.
  5. **Colorectal Histology MNIST:** Medical images of colorectal cancer tissue classified into multiple tissue categories.
  6. **Skin Cancer MNIST:** A medical dataset of skin lesion images used for skin cancer classification and diagnosis.
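Because Fashion-MNIST keeps MNIST's format, it can usually be swapped in with very little code change. In Keras, for example, `tensorflow.keras.datasets.fashion_mnist.load_data()` returns the same tuple structure as `mnist.load_data()`. The labels are still the integers 0 to 9, so a lookup table is needed to read them. A small sketch, in which the helper `label_name` is a hypothetical name chosen for illustration:

```python
# Fashion-MNIST class names, indexed by integer label.
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_name(label: int) -> str:
    """Map an integer Fashion-MNIST label (0-9) to its class name."""
    if not 0 <= label < len(FASHION_MNIST_CLASSES):
        raise ValueError(f"label must be in 0..9, got {label}")
    return FASHION_MNIST_CLASSES[label]

print(label_name(0), "|", label_name(9))  # T-shirt/top | Ankle boot
```

The same pattern works in plot titles, for example `plt.title(label_name(int(y_train[i])))` in place of the numeric label.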

Applications

The MNIST dataset is widely used for education, benchmarking, and real-world digit recognition tasks.

Limitations