SMS Spam Detection using TensorFlow in Python

Last Updated : 11 Dec, 2025

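SMS spam detection automatically classifies incoming text messages as either ham (legitimate) or spam using machine learning and deep learning models, so that harmful, fraudulent or unwanted messages can be filtered before they reach the user. In this project, we will load and clean an SMS dataset, train three TensorFlow models (a dense embedding model, a bidirectional LSTM and a transfer learning model based on the Universal Sentence Encoder) and compare their accuracy, precision, recall and F1-score.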

Implementation

Step 1: Import Libraries

We will import the required libraries: NumPy, pandas, Matplotlib, seaborn, scikit-learn, TensorFlow and TensorFlow Hub.

Python `

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

`

Step 2: Load the Dataset

We will load the dataset from a CSV file into a pandas DataFrame and preview the first rows.

The dataset used can be downloaded from here.

Python `

df = pd.read_csv("/content/spam.csv", encoding='latin-1')
df.head()

`

**Output:** screenshot of the first five rows of the raw DataFrame

Step 3: Clean the dataset and encode labels

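We will clean the dataset by dropping the three extra columns, renaming v1 and v2 to label and Text, and encoding the labels as integers (ham as 0, spam as 1).

Python `

# Drop the extra columns that are not used
df = df.drop(['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4'], axis=1)
# Give the remaining columns descriptive names
df = df.rename(columns={'v1': 'label', 'v2': 'Text'})
# Encode the labels: ham -> 0, spam -> 1
df['label_enc'] = df['label'].map({'ham': 0, 'spam': 1})
df.head()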

`

**Output:** screenshot of the first five rows of the cleaned DataFrame with the label, Text and label_enc columns
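Because pandas Series.map returns NaN for any label missing from the dictionary, it can help to confirm that every label was encoded and to look at the class balance before training. The following is a minimal sketch in plain Python; the labels list is hypothetical and stands in for df['label'], it is not taken from the dataset.

```python
from collections import Counter

# Hypothetical labels standing in for df['label']; not real dataset values.
labels = ["ham", "spam", "ham", "ham", "spam", "ham"]
mapping = {"ham": 0, "spam": 1}

# dict.get returns None for an unknown label, mirroring the NaN that
# pandas Series.map would produce for a label missing from the mapping.
encoded = [mapping.get(label) for label in labels]
unmapped = [label for label, code in zip(labels, encoded) if code is None]

# Count how many messages fall into each class.
counts = Counter(encoded)
spam_share = counts[1] / len(encoded)

print("unmapped labels:", unmapped)
print("class counts:", dict(counts))
print(f"spam share: {spam_share:.2f}")
```

On the real DataFrame, the analogous checks would be df['label_enc'].isna().sum() and df['label_enc'].value_counts().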

Step 4: Split Data and convert to NumPy arrays

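Here, we hold out 20% of the messages as a test set, using a fixed random_state so the split is reproducible, and convert the resulting pandas Series to NumPy arrays.

Python `

X_train, X_test, y_train, y_test = train_test_split(
    df['Text'], df['label_enc'], test_size=0.2, random_state=42
)

# Convert the pandas Series to NumPy arrays for Keras
X_train_np = X_train.to_numpy()
X_test_np = X_test.to_numpy()
y_train_np = y_train.to_numpy()
y_test_np = y_test.to_numpy()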

`
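To illustrate why a fixed seed makes the split reproducible, here is a minimal sketch of a seeded shuffle-and-cut split in plain Python. The helper split_indices is hypothetical and is not the scikit-learn implementation; it only shows the idea that the same seed yields the same partition.

```python
import math
import random

def split_indices(n_samples, test_size=0.2, seed=42):
    """Shuffle indices with a fixed seed, then cut them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_samples * test_size)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(10)
again_train, again_test = split_indices(10)

print("train:", train_idx)
print("test:", test_idx)
# Same seed, same split on every call.
print("reproducible:", (train_idx, test_idx) == (again_train, again_test))
```

For an imbalanced label such as spam, train_test_split also accepts a stratify argument that keeps the class proportions similar in both parts.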

Step 5: Compute text Statistics for Vectorization

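Python `

# Average number of words per message; used later as the padded sequence length
avg_words_len = round(sum([len(i.split()) for i in df['Text']]) / len(df['Text']))
# Number of unique whitespace-separated tokens; used later as the vocabulary size
total_words_length = len(set(" ".join(df['Text']).split()))

print(f"Data Loaded. Training samples: {len(X_train_np)}")
print(f"Average words per message: {avg_words_len}")
print(f"Approximate vocabulary size: {total_words_length}")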

`

**Output:** screenshot of the printed number of training samples, average words per message and approximate vocabulary size
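The two statistics are easier to follow on a tiny example. The sketch below applies the same expressions to three hypothetical messages (not taken from the dataset) and shows why the vocabulary size is only approximate: the split is case-sensitive, so "win" and "Win" count as different tokens.

```python
# Hypothetical messages standing in for df['Text'].
messages = [
    "Free entry win a prize now",
    "Are you coming home",
    "Win cash now",
]

# Words per message, then the rounded average.
word_counts = [len(m.split()) for m in messages]
avg_words_len = round(sum(word_counts) / len(messages))

# Unique tokens as split on whitespace (case-sensitive).
total_words_length = len(set(" ".join(messages).split()))

# Lower-casing first merges "win" and "Win" into one token.
lowercased_vocab_size = len(set(" ".join(messages).lower().split()))

print(word_counts)            # [6, 4, 3]
print(avg_words_len)          # 4
print(total_words_length)     # 12
print(lowercased_vocab_size)  # 11
```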

Step 6: Helper functions for training and evaluation

Python `

def compile_and_fit(model, epochs=5):
    # Binary classification: sigmoid output trained with binary cross-entropy
    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    # The held-out test set is used as validation data during training
    history = model.fit(
        X_train_np,
        y_train_np,
        epochs=epochs,
        validation_data=(X_test_np, y_test_np)
    )
    return history

def get_metrics(model, X, y):
    # Round the predicted probabilities to hard 0/1 labels
    y_preds = np.round(model.predict(X))
    return {
        'accuracy': accuracy_score(y, y_preds),
        'precision': precision_score(y, y_preds),
        'recall': recall_score(y, y_preds),
        'f1-score': f1_score(y, y_preds)
    }

`
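To make the four metrics concrete, the following sketch computes them by hand from confusion-matrix counts. The helper binary_metrics and the example values are hypothetical. It thresholds with p > 0.5, which is intended to match np.round on probabilities, since np.round maps exactly 0.5 to 0.

```python
def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute accuracy, precision, recall and F1 for the positive (spam) class."""
    y_pred = [1 if p > threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # ham flagged as spam
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # ham passed through
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1-score": f1,
    }

# Hypothetical labels and predicted probabilities, not model output.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.2, 0.1, 0.3, 0.7, 0.2, 0.9, 0.8, 0.4, 0.6]
metrics = binary_metrics(y_true, y_prob)
print(metrics)
```

Here one ham message is flagged as spam and one spam message is missed, so precision, recall and F1 are all 0.75 while accuracy is 0.8.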

Step 7: Create the TextVectorization layer

Python `

from tensorflow.keras.layers import TextVectorization

text_vec = TextVectorization(
    max_tokens=total_words_length,
    standardize='lower_and_strip_punctuation',
    output_mode='int',
    output_sequence_length=avg_words_len
)
# Build the vocabulary from the training messages only
text_vec.adapt(X_train_np)

`
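As a rough picture of what an integer-mode text vectorizer does, the sketch below lower-cases text, strips punctuation, maps tokens to integer ids and pads or truncates to a fixed length. It is a plain-Python approximation, not the Keras implementation: the helper names are hypothetical, the punctuation set is Python's string.punctuation, and the alphabetical tie-break is an assumption of this sketch. Index 0 is used for padding and index 1 for out-of-vocabulary tokens.

```python
import string
from collections import Counter

def standardize(text):
    """Lower-case and drop ASCII punctuation (an approximation)."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def build_vocab(texts, max_tokens):
    """Most frequent tokens first; ids 0 (padding) and 1 (unknown) are reserved."""
    counts = Counter(tok for t in texts for tok in standardize(t).split())
    # Ties are broken alphabetically; this tie-break is an assumption.
    ranked = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: i + 2 for i, tok in enumerate(ranked[: max_tokens - 2])}

def vectorize(text, vocab, seq_len):
    """Map tokens to ids, truncate to seq_len, then pad with zeros."""
    ids = [vocab.get(tok, 1) for tok in standardize(text).split()][:seq_len]
    return ids + [0] * (seq_len - len(ids))

# Hypothetical training messages.
train = ["Win a FREE prize now!", "Call now, win cash", "See you at home"]
vocab = build_vocab(train, max_tokens=10)

long_ids = vectorize("Win a prize, call mum now!!!", vocab, seq_len=5)
short_ids = vectorize("win cash", vocab, seq_len=5)
print(long_ids)   # [3, 4, 1, 6, 1]
print(short_ids)  # [3, 7, 0, 0, 0]
```

The first message is truncated to five ids and its unseen or cut-off-vocabulary words become 1; the second is padded with zeros.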

Step 8: Model 1 – Dense embedding model (build and train)

Python `

# Raw strings go in; the vectorizer turns each message into integer token ids
input_layer = layers.Input(shape=(1,), dtype=tf.string)
x = text_vec(input_layer)
x = layers.Embedding(input_dim=total_words_length, output_dim=128)(x)
# Average the token embeddings into one fixed-size vector per message
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(32, activation='relu')(x)
output_layer = layers.Dense(1, activation='sigmoid')(x)

model_1 = keras.Model(input_layer, output_layer, name="Dense_Model")
history_1 = compile_and_fit(model_1)

`

**Output:** screenshot of the training log for the dense embedding model
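The pooling step in this model collapses a variable number of token embeddings into one vector by averaging each embedding dimension over the sequence. A minimal plain-Python sketch of that averaging, using hypothetical 3-dimensional embeddings for a 4-token message:

```python
def global_average_pool(sequence):
    """Average a list of equal-length embedding vectors, dimension by dimension."""
    n_steps = len(sequence)
    dim = len(sequence[0])
    return [sum(vec[d] for vec in sequence) / n_steps for d in range(dim)]

# Hypothetical embeddings: 4 tokens, 3 dimensions each.
embedded = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
    [0.0, 4.0, 2.0],
    [0.0, 2.0, 4.0],
]
pooled = global_average_pool(embedded)
print(pooled)  # [1.0, 2.0, 2.0]
```

Note that in the model above the Embedding layer is created without mask_zero, so, as far as this sketch assumes, padded positions are averaged in along with real tokens.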

Step 9: Model 2 – Bidirectional LSTM model (build and train)

Python `

input_layer = layers.Input(shape=(1,), dtype=tf.string)
x = text_vec(input_layer)
x = layers.Embedding(input_dim=total_words_length, output_dim=128)(x)
# First Bi-LSTM returns the full sequence so a second one can be stacked on it
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64))(x)
# The output is already 2-D here, so Flatten leaves it unchanged
x = layers.Flatten()(x)
x = layers.Dropout(0.1)(x)
x = layers.Dense(32, activation='relu')(x)
output_layer = layers.Dense(1, activation='sigmoid')(x)

model_2 = keras.Model(input_layer, output_layer, name="BiLSTM_Model")
history_2 = compile_and_fit(model_2)

`

**Output:** screenshot of the training log for the Bi-LSTM model
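To get a feel for how much larger the recurrent model is, the sketch below estimates the weights in the two bidirectional LSTM layers. It assumes the standard LSTM formulation with four gates and one bias vector per gate, and that the two directions are concatenated; the helper names are hypothetical. The figures can be compared against model_2.summary().

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def bidirectional_lstm_params(input_dim, units):
    """A bidirectional wrapper holds one forward and one backward LSTM."""
    return 2 * lstm_params(input_dim, units)

# First layer reads the 128-dimensional embeddings.
first = bidirectional_lstm_params(input_dim=128, units=64)
# Second layer reads the concatenated forward/backward outputs (2 * 64).
second = bidirectional_lstm_params(input_dim=2 * 64, units=64)

print(first)           # 98816
print(second)          # 98816
print(first + second)  # 197632
```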

Step 10: Model 3 – Transfer learning with Universal Sentence Encoder (build and train)

Python `

# Pre-trained Universal Sentence Encoder, kept frozen (trainable=False)
use_layer = hub.KerasLayer(
    "https://tfhub.dev/google/universal-sentence-encoder/4",
    trainable=False,
    input_shape=[],
    dtype=tf.string,
    name='USE'
)

input_layer = layers.Input(shape=[], dtype=tf.string)
# Wrap the hub layer in a Lambda; each message becomes a 512-dimensional vector
embedding = layers.Lambda(lambda x: use_layer(x), output_shape=(512,))(input_layer)
x = layers.Dense(64, activation='relu')(embedding)
x = layers.Dropout(0.2)(x)
output_layer = layers.Dense(1, activation='sigmoid')(x)

model_3 = keras.Model(input_layer, output_layer, name="USE_Model")
history_3 = compile_and_fit(model_3)

`

**Output:** screenshot of the training log for the Universal Sentence Encoder model
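Since the encoder is frozen, only the small classification head on top of the 512-dimensional sentence embedding is trained. As a rough illustration, the sketch below counts the weights in that head; dense_params is a hypothetical helper that assumes a fully connected layer with a bias term.

```python
def dense_params(input_dim, units):
    """One weight per input-output pair plus one bias per output unit."""
    return input_dim * units + units

hidden = dense_params(input_dim=512, units=64)  # Dense(64) on the 512-d embedding
output = dense_params(input_dim=64, units=1)    # Dense(1) sigmoid output
trainable_head = hidden + output

print(hidden)          # 32832
print(output)          # 65
print(trainable_head)  # 32897
```

Dropout adds no weights, so it does not appear in the count.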

Step 11: Collect performance metrics for all models

Python `

# Evaluate every model on the same held-out test set
results = {
    'Dense Embedding': get_metrics(model_1, X_test_np, y_test_np),
    'Bi-LSTM': get_metrics(model_2, X_test_np, y_test_np),
    'Transfer Learning (USE)': get_metrics(model_3, X_test_np, y_test_np)
}

# One row per model, one column per metric
results_df = pd.DataFrame(results).transpose()
print("Performance Table:")
print(results_df)

`

**Output:** screenshot of the performance table with one row per model and one column per metric

The table compares the accuracy, precision, recall and F1-score of all three models on the same test set.

Step 12: Visualize

We will visualize the results with a bar chart and a line graph.

**a. Bar Chart**

Python `

results_df.plot(kind='bar', figsize=(10, 6))
plt.title("Model Performance Metrics (Bar Chart)")
plt.ylabel("Score")
plt.ylim(0.8, 1.0)
plt.xticks(rotation=0)
plt.legend(loc='lower right')
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

`

**Output:** bar chart "Model Performance Metrics (Bar Chart)" showing the score of each metric per model

**b. Line Graph**

Python `

plt.figure(figsize=(10, 6))

# One line per model, with the metrics along the x-axis
for model_name in results_df.index:
    plt.plot(
        results_df.columns,
        results_df.loc[model_name],
        marker='o',
        label=model_name,
        linewidth=2
    )

plt.title("Model Performance Trends (Line Graph)")
plt.ylabel("Score")
plt.xlabel("Metric")
plt.ylim(0.8, 1.0)
plt.grid(True, linestyle='--', alpha=0.6)
plt.legend()
plt.show()

`

**Output:** line graph "Model Performance Trends (Line Graph)" showing the score of each metric per model

You can download the source code from here.