Semi-Supervised Learning in ML

Last Updated : 30 Apr, 2026

Semi-supervised learning is a machine learning approach that combines a small amount of labeled data with a large amount of unlabeled data to improve model performance. As in supervised learning, the goal is to learn a function that accurately predicts outputs from inputs, but with far less labeled data.

Figure: Semi-Supervised Learning

Semi-supervised learning is particularly valuable when acquiring labeled data is expensive or time-consuming, yet unlabeled data is plentiful and easy to collect.

Working of Semi-Supervised Learning

The following example applies label propagation, a graph-based semi-supervised method, to the Iris dataset.

Step 1: Importing Libraries and Loading Data

We import numpy, matplotlib and scikit-learn, then load the Iris dataset. Only the first two features (sepal length and sepal width) are kept, which also lets us plot the data in two dimensions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.semi_supervised import LabelPropagation
from sklearn.metrics import accuracy_score

iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target
```

Step 2: Semi-Supervised Setup (Mask Labels)

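We keep the true labels for roughly 10% of the samples and hide the rest by setting them to -1, the value scikit-learn uses to mark a sample as unlabeled.

```python
labels = np.copy(y)
rng = np.random.RandomState(42)

# mask is True for the roughly 10% of samples that keep their label
mask = rng.rand(len(y)) < 0.1

# Hide the remaining labels; -1 marks a sample as unlabeled
labels[~mask] = -1

print(f"Labeled samples: {np.sum(mask)}, Unlabeled samples: {np.sum(~mask)}")
```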
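Random masking can, by chance, leave a class with no labeled examples at all. A minimal sketch of one way to avoid that, assuming a stratified split is acceptable for the experiment, is shown here. The names `labeled_mask` and `labels_stratified` are illustrative and do not appear elsewhere in this article.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
y = iris.target

# Pick about 10% of the sample indices to stay labeled, stratified by class
# so that every class keeps some labeled examples.
indices = np.arange(len(y))
labeled_idx, unlabeled_idx = train_test_split(
    indices, train_size=0.1, stratify=y, random_state=42
)

# Boolean mask (illustrative name): True where the true label is kept.
labeled_mask = np.zeros(len(y), dtype=bool)
labeled_mask[labeled_idx] = True

# -1 marks a sample as unlabeled for scikit-learn's semi-supervised estimators.
labels_stratified = np.full(len(y), -1)
labels_stratified[labeled_mask] = y[labeled_mask]

print(f"Labeled samples per class: {np.bincount(y[labeled_mask])}")
print(f"Unlabeled samples: {np.sum(labels_stratified == -1)}")
```

The resulting `labels_stratified` array could be passed to `fit` in place of a randomly masked one. Whether stratification is appropriate depends on the setting, since it uses the true labels to choose which samples stay labeled.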

Step 3: Train a Graph-Based Model (Label Propagation)

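We train a LabelPropagation model. It builds a similarity graph over all samples, labeled and unlabeled, and spreads the known labels along that graph to nearby unlabeled points.

```python
# Fit on every sample; entries equal to -1 in labels are treated as unlabeled
model = LabelPropagation()
model.fit(X, labels)
```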
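To make the idea of spreading labels over a graph more concrete, here is a simplified NumPy sketch of the classic label propagation iteration. It is an illustration under stated assumptions, not scikit-learn's actual implementation: the RBF width `gamma = 20.0`, the iteration cap and the tolerance are chosen to mirror scikit-learn's documented defaults, and names such as `is_labeled`, `T`, `F` and `y_sketch` are made up for this example.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target

# Keep about 10% of the labels (illustrative mask name).
rng = np.random.RandomState(42)
is_labeled = rng.rand(len(y)) < 0.1

n_classes = len(np.unique(y))
gamma = 20.0  # assumed RBF width

# Fully connected graph: edge weight decays with squared distance.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
W = np.exp(-gamma * sq_dists)

# Row-normalise so each row is a distribution over neighbours.
T = W / W.sum(axis=1, keepdims=True)

# One-hot rows for labeled points, all zeros for unlabeled points.
Y0 = np.zeros((len(y), n_classes))
Y0[is_labeled, y[is_labeled]] = 1.0

F = Y0.copy()
for _ in range(1000):
    F_new = T @ F                        # each point averages its neighbours
    F_new[is_labeled] = Y0[is_labeled]   # clamp the known labels
    converged = np.abs(F_new - F).sum() < 1e-3
    F = F_new
    if converged:
        break

# Final label for each point: the class with the largest score.
y_sketch = F.argmax(axis=1)
print(f"Points labeled by the sketch: {len(y_sketch)}")
```

Each iteration lets every point take a weighted average of its neighbours' class scores, then resets the labeled points to their known classes, so label information flows outward from the labeled points through dense regions of the data.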

Step 4: Get Transduced Labels and Evaluate

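After fitting, transduction_ holds the label assigned to every training sample, including those that were unlabeled. We compare these assignments against the true labels, both on the subset whose labels were given and on the full dataset.

```python
# Labels assigned to every sample seen during fit
y_pred = model.transduction_

# Accuracy on the samples whose labels were given to the model
acc_labeled = accuracy_score(y[mask], y_pred[mask])

# Accuracy over all samples, labeled and unlabeled
acc_overall = accuracy_score(y, y_pred)

print(f"Accuracy on labeled data: {acc_labeled:.2f}")
print(f"Overall accuracy after label propagation: {acc_overall:.2f}")
```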

Output:

Labeled samples: 18, Unlabeled samples: 132
Accuracy on labeled data: 1.00
Overall accuracy after label propagation: 0.71
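Accuracy on the labeled subset says little about how well labels were inferred, because those labels were given to the model. A more informative check is accuracy on the unlabeled points only, ideally next to a purely supervised baseline trained on the same few labels. The sketch below assumes a 1-nearest-neighbour classifier as that baseline, which is a choice made for this illustration and not part of the original example. No expected numbers are shown, since they depend on the split and the library version.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.semi_supervised import LabelPropagation

iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target

# Keep about 10% of the labels; -1 marks unlabeled samples.
rng = np.random.RandomState(42)
is_labeled = rng.rand(len(y)) < 0.1
labels = np.where(is_labeled, y, -1)

# Semi-supervised model: sees all points, but only a few labels.
model = LabelPropagation()
model.fit(X, labels)
y_pred = model.transduction_

# Score only the points whose labels were hidden from the model.
acc_unlabeled = accuracy_score(y[~is_labeled], y_pred[~is_labeled])

# Supervised baseline (assumed for comparison): sees only the labeled points.
baseline = KNeighborsClassifier(n_neighbors=1)
baseline.fit(X[is_labeled], y[is_labeled])
acc_baseline = accuracy_score(y[~is_labeled], baseline.predict(X[~is_labeled]))

print(f"Label propagation, unlabeled points only: {acc_unlabeled:.2f}")
print(f"1-NN baseline, unlabeled points only:     {acc_baseline:.2f}")
```

If the semi-supervised score is not clearly better than the baseline, the unlabeled data may not be adding much for this feature set.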

Step 5: Visualize

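We plot the data before and after propagation. In the left panel, unlabeled points are gray and the labeled points are colored by class. The right panel colors every point by its propagated label.

```python
fig, ax = plt.subplots(1, 2, figsize=(12, 4))

# Left: all points in gray, with the few labeled points colored by true class
ax[0].scatter(X[:, 0], X[:, 1], c='lightgray', s=30)
ax[0].scatter(X[mask, 0], X[mask, 1], c=y[mask], cmap='viridis', s=60)
ax[0].set_title("Before propagation: few labels")

# Right: every point colored by the label assigned during propagation
ax[1].scatter(X[:, 0], X[:, 1], c=y_pred, cmap='viridis', s=60)
ax[1].set_title("After propagation: all labeled")

plt.tight_layout()
plt.show()
```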

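Output:

Figure: two scatter plots of the Iris data, before propagation (few labels) and after propagation (all points labeled).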

Result

The result shows that, starting from only a small labeled subset, the model assigned a class label to every iris sample. The known labels were reproduced exactly, and the overall accuracy was about 0.71 using only two of the four features.
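The fitted model can also label points it has never seen. A minimal sketch, assuming two hypothetical new measurements (sepal length and sepal width in cm) that are invented for illustration:

```python
import numpy as np
from sklearn import datasets
from sklearn.semi_supervised import LabelPropagation

iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target

# Keep about 10% of the labels; -1 marks unlabeled samples.
rng = np.random.RandomState(42)
is_labeled = rng.rand(len(y)) < 0.1
labels = np.where(is_labeled, y, -1)

model = LabelPropagation().fit(X, labels)

# Hypothetical new flowers: (sepal length, sepal width) in cm.
new_points = np.array([[5.0, 3.4], [6.7, 3.0]])

# predict returns a class per point; predict_proba returns class probabilities.
pred = model.predict(new_points)
proba = model.predict_proba(new_points)

for point, cls, p in zip(new_points, pred, proba):
    print(point, iris.target_names[cls], np.round(p, 2))
```

Here `transduction_` covers only the samples passed to `fit`, while `predict` and `predict_proba` extend the learned labeling to new inputs.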

When to Use

Applications

Advantages

Limitations