Text classification using CNN

Last Updated : 01 Aug, 2025

Text classification involves assigning predefined categories or labels to unstructured text documents. This supervised learning task requires training models on labeled datasets where each document has a known category.

Before a model can learn from text, the text must be transformed from human-readable form into numerical representations that machine learning algorithms can process. The preprocessing steps used for this transformation can significantly impact model performance.
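As a minimal sketch of that transformation, the snippet below builds a word-to-index vocabulary from a toy corpus and encodes a sentence as a list of integers. The corpus, the `build_vocab` and `encode` helpers and the reserved ids 0 (padding) and 1 (unknown word) are assumptions made for illustration and are not part of any library.

```python
from collections import Counter

def build_vocab(texts, max_words=10000):
    """Map the most frequent words to integer ids (0 = padding, 1 = unknown)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Ids start at 2 because 0 and 1 are reserved in this sketch.
    return {word: idx + 2 for idx, (word, _) in enumerate(counts.most_common(max_words))}

def encode(text, vocab):
    """Replace each word with its id, falling back to 1 for unseen words."""
    return [vocab.get(word, 1) for word in text.lower().split()]

corpus = ["the movie was great", "the plot was weak"]
vocab = build_vocab(corpus)
encoded = encode("the movie was boring", vocab)
print(encoded)
```

Real pipelines usually add punctuation handling and a fixed vocabulary size, but the idea is the same: every word becomes an integer that a later layer can look up.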

Why use CNN-based text classification?

CNN Architecture for Text Processing

Convolutional Neural Networks adapt to text by treating documents as sequences of words rather than spatial images. This adaptation requires modifications to traditional CNN architectures while preserving the core convolution and pooling operations.

The embedding layer serves as the foundation, transforming discrete word tokens into vectors in a continuous space where semantic relationships can be captured. These embeddings can be randomly initialized and learned during training, or pre-trained using methods like Word2Vec or GloVe.
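Conceptually, an embedding layer is a lookup table indexed by token id. The NumPy sketch below illustrates this with a randomly initialized table; the toy sizes (`vocab_size = 10`, `embed_dim = 4`) and the token ids are assumptions chosen for readability, not values from the model built later.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 10   # toy vocabulary size (assumed for this sketch)
embed_dim = 4     # toy embedding dimension (assumed for this sketch)

# Randomly initialized embedding table: one row per token id.
embedding_table = rng.normal(scale=0.1, size=(vocab_size, embed_dim))

# A "document" of five token ids.
token_ids = np.array([2, 4, 3, 1, 0])

# The lookup is plain row indexing: (sequence_length,) -> (sequence_length, embed_dim).
embedded = embedding_table[token_ids]
print(embedded.shape)  # (5, 4)
```

During training, the rows of this table are updated like any other weights, which is how the vectors come to reflect relationships that are useful for the task.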

Convolutional layers then apply multiple filters of varying sizes (typically 3, 4 and 5 words) to capture different n-gram patterns. Each filter learns to detect specific linguistic patterns that are relevant for the classification task.
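The following NumPy sketch shows, under simplifying assumptions, how filters of widths 3, 4 and 5 slide over an embedded sentence and how global max pooling reduces each filter's responses to a single value. The helper `conv1d_relu`, the toy sizes and the random weights are hypothetical; a real model learns the filter weights during training.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, embed_dim = 12, 8   # toy sizes, assumed for this sketch
num_filters = 2              # filters per kernel size
embedded = rng.normal(size=(seq_len, embed_dim))

def conv1d_relu(x, kernels, bias):
    """Valid 1D convolution over the sequence axis followed by ReLU.

    x: (seq_len, embed_dim), kernels: (kernel_size, embed_dim, num_filters).
    Returns an array of shape (seq_len - kernel_size + 1, num_filters).
    """
    kernel_size = kernels.shape[0]
    steps = x.shape[0] - kernel_size + 1
    out = np.empty((steps, kernels.shape[2]))
    for t in range(steps):
        window = x[t:t + kernel_size]  # one n-gram of word vectors
        out[t] = np.tensordot(window, kernels, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)

pooled = []
for kernel_size in (3, 4, 5):
    kernels = rng.normal(scale=0.1, size=(kernel_size, embed_dim, num_filters))
    bias = np.zeros(num_filters)
    feature_map = conv1d_relu(embedded, kernels, bias)
    pooled.append(feature_map.max(axis=0))  # global max pooling over positions

features = np.concatenate(pooled)
print(features.shape)  # (6,)
```

Each kernel size contributes `num_filters` values, so the final feature vector has a fixed length regardless of how long the input sentence is.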

Filter size considerations:

Basic Implementation Example

1. Importing Libraries

We import the required libraries, TensorFlow (Keras) and NumPy, for building the CNN model, creating layers, handling numerical operations and padding text sequences.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense, Dropout
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
```

2. Loading Data

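We load the IMDB movie review dataset, keeping only the 10,000 most frequent words, and pad or truncate every review to 500 tokens so that all inputs have the same length.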

```python
vocab_size = 10000  # keep the 10,000 most frequent words
max_length = 500    # pad or truncate every review to 500 tokens

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)
x_train = sequence.pad_sequences(x_train, maxlen=max_length)
x_test = sequence.pad_sequences(x_test, maxlen=max_length)
```
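With its default arguments, `pad_sequences` pads and truncates at the start of each sequence (`padding='pre'`, `truncating='pre'`). The plain-Python helper below, `pad_pre`, is a hypothetical stand-in written only to illustrate that behavior on a single sequence.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the default 'pre' padding and truncation on one sequence."""
    seq = list(seq)[-maxlen:]                   # keep the last maxlen tokens
    return [value] * (maxlen - len(seq)) + seq  # left-pad shorter sequences

print(pad_pre([5, 8, 2], 5))           # [0, 0, 5, 8, 2]
print(pad_pre([1, 2, 3, 4, 5, 6], 5))  # [2, 3, 4, 5, 6]
```

Long reviews therefore lose their opening tokens rather than their ending, which is worth keeping in mind when choosing `max_length`.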

3. Building CNN model

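We build a CNN model that converts words into vectors with an embedding layer, detects local word patterns with a 1D convolution, keeps the strongest response of each filter using global max pooling and combines these features in fully connected layers. Dropout helps reduce overfitting, and the final sigmoid layer outputs a probability for binary classification.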

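```python
model = Sequential([
    tf.keras.Input(shape=(max_length,)),  # replaces deprecated input_length
    Embedding(vocab_size, 100),  # token ids -> 100-d vectors
    Conv1D(filters=128, kernel_size=5, activation='relu'),
    GlobalMaxPooling1D(),  # strongest response per filter
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')  # positive-class probability
])
```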
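To get a feel for the size of this model, the short calculation below works out the convolution output length and the number of trainable parameters per layer by hand. It assumes Keras defaults (`padding='valid'`, stride 1, bias terms enabled); the variable names are chosen for this sketch only.

```python
vocab_size, embed_dim, max_length = 10000, 100, 500
filters, kernel_size, dense_units = 128, 5, 64

# 'valid' padding: the filter fits in max_length - kernel_size + 1 positions.
conv_steps = max_length - kernel_size + 1                   # 496

embedding_params = vocab_size * embed_dim                   # 1,000,000
conv_params = kernel_size * embed_dim * filters + filters   # 64,128
dense_params = filters * dense_units + dense_units          # 8,256
output_params = dense_units * 1 + 1                         # 65

total = embedding_params + conv_params + dense_params + output_params
print(conv_steps, total)  # 496 1072449
```

Most of the parameters sit in the embedding table, which is one reason pre-trained embeddings are attractive when labeled data is limited.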

4. Compiling and Training the Model

We compile the model with the Adam optimizer and binary cross-entropy as the loss function, then train it on the IMDB training set for 5 epochs with a batch size of 32, holding out 20% of the training data for validation.

```python
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=5, validation_split=0.2)
```
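For intuition about the loss being minimized, here is a small pure-Python sketch of mean binary cross-entropy. The function name, the clipping constant `eps` and the example labels and probabilities are assumptions for illustration; Keras uses its own implementation internally.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from 0."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]
loss = binary_cross_entropy(labels, probs)
print(round(loss, 4))
```

Confident wrong predictions, such as the last example (probability 0.7 for a negative review), contribute the most to the loss.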

5. Evaluating the Model

We will evaluate the trained model on the test dataset.

```python
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_accuracy:.4f}")
```

**Output:**

Figure: Accuracy using CNN
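The accuracy reported for a sigmoid output is obtained by thresholding each predicted probability at 0.5 and comparing the result with the true label. The sketch below reproduces that calculation in plain Python; the helper name and the example values are hypothetical.

```python
def accuracy_from_probs(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    preds = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(p == y) for p, y in zip(preds, y_true))
    return correct / len(y_true)

labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]
print(accuracy_from_probs(labels, probs))  # 0.75
```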

Performance Analysis

Understanding CNN performance requires monitoring key metrics:

CNNs typically achieve 85-95% accuracy on well-defined text classification problems such as sentiment analysis, depending on dataset quality and model architecture complexity.
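Accuracy alone can be misleading when classes are imbalanced. As an assumption for illustration, the sketch below computes precision, recall and F1 for the positive class from hard predictions; the helper name and the example labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

labels = [1, 0, 1, 0, 1]
preds = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(labels, preds)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```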

Real-World Applications

CNN-based text classification has found success across numerous industries:

Challenges and Best Practices

Training a CNN model for text classification comes with several challenges, along with practices that help address them.

**Common Challenges:**

**Best Practices:**