Applying Convolutional Neural Network on MNIST dataset

Last Updated : 12 Aug, 2024

A **Convolutional Neural Network (CNN)** is a type of neural network that has become very popular in recent years, particularly for image tasks. A CNN can be viewed as a specialised variation of the multilayer perceptron in which convolutional layers share weights across the image. CNNs need relatively little pre-processing compared to other image classification algorithms, because the network learns the filters that were hand-engineered in traditional algorithms. This makes CNNs a well-suited option for image processing tasks.

Applying a Convolutional Neural Network (CNN) on the MNIST dataset is a popular way to learn about and demonstrate the capabilities of CNNs for image classification tasks. The MNIST dataset consists of 28x28 grayscale images of hand-written digits (0-9), with a training set of 60,000 examples and a test set of 10,000 examples.

Here is a basic approach to applying a CNN on the MNIST dataset using the Python programming language and the Keras library:

  1. Load and preprocess the data: The MNIST dataset can be loaded using the Keras library, and the images can be normalized to have pixel values between 0 and 1.
  2. Define the model architecture: The CNN can be constructed using the Keras Sequential API, which allows for easy building of sequential models layer-by-layer. The architecture should typically include convolutional layers, pooling layers, and fully-connected layers.
  3. Compile the model: The model needs to be compiled with a loss function, an optimizer, and a metric for evaluation.
  4. Train the model: The model can be trained on the training set using the Keras fit() function. It is important to monitor the training accuracy and loss to ensure the model is converging properly.
  5. Evaluate the model: The trained model can be evaluated on the test set using the Keras evaluate() function. The evaluation metric typically used for classification tasks is accuracy.

Here are some tips and best practices to keep in mind when applying a CNN on the MNIST dataset:

  1. Start with a simple architecture and gradually increase complexity if necessary.
  2. Experiment with different activation functions, optimizers, learning rates, and batch sizes to find the optimal combination for your specific task.
  3. Use regularization techniques such as dropout or weight decay to prevent overfitting.
  4. Visualize the filters and feature maps learned by the model to gain insights into its inner workings.
  5. Compare the performance of the CNN to other machine learning algorithms such as Support Vector Machines or Random Forests to get a sense of its relative performance.
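To make the dropout tip concrete, here is a minimal NumPy sketch of "inverted" dropout, the scheme commonly used by deep learning libraries at training time. The function name `inverted_dropout` and the toy input are illustrative assumptions; this is not the Keras implementation.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and rescale the rest.

    Rescaling by 1 / (1 - rate) keeps the expected activation unchanged,
    so no extra scaling is needed at inference time.
    """
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True = keep this unit
    return activations * mask / keep_prob

rng = np.random.default_rng(42)           # fixed seed for reproducibility
x = np.ones((2, 8), dtype=np.float32)     # toy activations, all equal to 1
dropped = inverted_dropout(x, rate=0.5, rng=rng)
print(dropped)
```

With `rate=0.5`, every surviving unit is doubled and every dropped unit becomes zero, so the output contains only the values 0 and 2.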

References:

  1. MNIST dataset: http://yann.lecun.com/exdb/mnist/
  2. Keras documentation: https://keras.io/
  3. "Deep Learning with Python" by Francois Chollet (https://www.manning.com/books/deep-learning-with-python)

**MNIST dataset:**
MNIST is a dataset of images of handwritten digits.

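An accuracy of 99.06% is reported for a CNN (Convolutional Neural Network) built with the Keras functional API. The functional API is used because it makes connecting the layers straightforward. Note that the run logged in the Output sections below does not reach this figure: it stays near 10% accuracy, which is chance level for ten classes. A likely cause is the optimizer's learning rate: recent Keras releases default `Adadelta` to a learning rate of 0.001, so 12 epochs may be far too few. Passing a larger value, such as `learning_rate=1.0`, is worth trying.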

First, import the necessary libraries:

```python
import numpy as np
import keras
from keras.datasets import mnist
from keras.models import Model
from keras.layers import Dense, Input
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten
from keras import backend as k
```

Load the training and test data:

```python
(x_train, y_train), (x_test, y_test) = mnist.load_data()
```

Reshape the images to include a channel dimension and scale the pixel values to the range [0, 1]:

```python
img_rows, img_cols = 28, 28

# Place the single grayscale channel where the backend expects it.
if k.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    inpx = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    inpx = (img_rows, img_cols, 1)

# Convert to float32 and normalize from [0, 255] to [0, 1].
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
```
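The reshape and scaling step can be checked without downloading MNIST. The sketch below uses a small random array as a stand-in for the real images (the array `fake_images` is synthetic, not MNIST data) and assumes the `channels_last` layout.

```python
import numpy as np

# Synthetic stand-in for MNIST: 4 random 28x28 uint8 "images" (not real data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Add a trailing channel axis (channels_last layout).
batch = fake_images.reshape(fake_images.shape[0], 28, 28, 1)

# Convert to float32 and scale from [0, 255] to [0, 1].
batch = batch.astype("float32") / 255.0

print(batch.shape, batch.dtype, batch.min(), batch.max())
```

Checking the shape, dtype, and value range like this before training is a cheap way to catch layout mistakes early.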

One-hot encode the output classes, so that each label becomes a vector of length 10:

```python
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)
```
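To show what one-hot encoding produces, here is a small NumPy sketch. The helper `one_hot` is a hypothetical illustration, not the Keras implementation, and it takes the number of classes explicitly instead of inferring it from the labels.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set column `label` to 1 in each row.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([5, 0, 9])
print(example)
```

Each row has a single 1 at the index of its digit, which is the target format that categorical cross-entropy expects.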

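Define the layers with the functional API, passing each layer's output to the next:

```python
inpx = Input(shape=inpx)
# Two convolutional layers extract local features.
layer1 = Conv2D(32, kernel_size=(3, 3), activation='relu')(inpx)
layer2 = Conv2D(64, (3, 3), activation='relu')(layer1)
# Downsample, then regularize with dropout.
layer3 = MaxPooling2D(pool_size=(3, 3))(layer2)
layer4 = Dropout(0.5)(layer3)
# Flatten to a vector and classify with fully-connected layers.
layer5 = Flatten()(layer4)
layer6 = Dense(250, activation='sigmoid')(layer5)
layer7 = Dense(10, activation='softmax')(layer6)
```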
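It can help to trace how the tensor shape changes through these layers. The sketch below does the arithmetic in plain Python, assuming the Keras defaults of no padding ("valid") for `Conv2D` and a pooling stride equal to the pool size. The helper names are illustrative.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of a convolution with no padding."""
    return (size - kernel) // stride + 1

def pool_out(size, pool):
    """Output side length of pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

side = 28
side = conv_out(side, 3)                 # Conv2D(32, 3x3): 28 -> 26
conv1_params = 3 * 3 * 1 * 32 + 32       # weights + biases
side = conv_out(side, 3)                 # Conv2D(64, 3x3): 26 -> 24
conv2_params = 3 * 3 * 32 * 64 + 64
side = pool_out(side, 3)                 # MaxPooling2D(3x3): 24 -> 8
flat = side * side * 64                  # Flatten: 8 * 8 * 64
dense1_params = flat * 250 + 250         # Dense(250)
dense2_params = 250 * 10 + 10            # Dense(10)

total_params = conv1_params + conv2_params + dense1_params + dense2_params
print(side, flat, total_params)
```

Almost all of the parameters sit in the first dense layer, which is typical for small CNNs that flatten a feature map into a wide fully-connected layer.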

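Build, compile, and train the model:

```python
model = Model([inpx], layer7)
# Adadelta is used with its default settings; recent Keras releases
# default its learning_rate to 0.001.
model.compile(optimizer=keras.optimizers.Adadelta(),
              loss=keras.losses.categorical_crossentropy,
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=12, batch_size=500)
```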
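The batch size determines how many weight updates happen per epoch. A quick calculation, using the training set size of 60,000 stated earlier, gives the number of steps that the progress bar reports:

```python
import math

num_train = 60_000   # MNIST training examples
batch_size = 500
epochs = 12

steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```

With only 120 updates per epoch, a very small learning rate leaves the optimizer little opportunity to move the weights, which is one reason to experiment with the learning rate or batch size.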

**Output:**

```
Epoch 1/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 154s 1s/step - accuracy: 0.0968 - loss: 2.4955
Epoch 2/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 201s 1s/step - accuracy: 0.0957 - loss: 2.4752
Epoch 3/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 203s 1s/step - accuracy: 0.0995 - loss: 2.4479
Epoch 4/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 151s 1s/step - accuracy: 0.0984 - loss: 2.4262
Epoch 5/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 204s 1s/step - accuracy: 0.0980 - loss: 2.4085
Epoch 6/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 201s 1s/step - accuracy: 0.0970 - loss: 2.3864
Epoch 7/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 202s 1s/step - accuracy: 0.0982 - loss: 2.3699
Epoch 8/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 152s 1s/step - accuracy: 0.0972 - loss: 2.3520
Epoch 9/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 201s 1s/step - accuracy: 0.0975 - loss: 2.3324
Epoch 10/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 203s 1s/step - accuracy: 0.0966 - loss: 2.3151
Epoch 11/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 153s 1s/step - accuracy: 0.0960 - loss: 2.3017
Epoch 12/12
120/120 ━━━━━━━━━━━━━━━━━━━━ 201s 1s/step - accuracy: 0.0972 - loss: 2.2847
<keras.src.callbacks.history.History at 0x7b04c491e200>
```

Evaluate the trained model on the test set:

```python
score = model.evaluate(x_test, y_test, verbose=0)
print('loss=', score[0])
print('accuracy=', score[1])
```

**Output:**

```
loss= 2.269895553588867
accuracy= 0.09950000047683716
```
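To put these numbers in context, it is useful to compute what a classifier that guesses uniformly among the ten digits would score. The sketch below assumes roughly balanced classes and a model that assigns probability 1/10 to every class.

```python
import math

num_classes = 10

# Uniform guessing is right one time in ten.
chance_accuracy = 1 / num_classes

# Categorical cross-entropy when the true class gets probability 1/10.
chance_loss = -math.log(1 / num_classes)

print(chance_accuracy, round(chance_loss, 4))
```

The logged test loss of about 2.27 and accuracy of about 0.0995 are very close to these chance-level values, which suggests that this particular run has barely started to learn.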