Implementing the AdaBoost Algorithm From Scratch

Last Updated : 3 Sep, 2025

**AdaBoost**, short for Adaptive Boosting, is an ensemble learning technique that combines multiple weak classifiers into a strong classifier. It adds classifiers sequentially, with each new classifier correcting the errors of the previous ones by giving more weight to the misclassified data points. Let's implement the AdaBoost algorithm from scratch.
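To make the reweighting idea concrete, here is a small worked example of a single boosting round on five toy points. It is only a sketch: it assumes the common formulation with labels in {-1, +1}, and the arrays `y_true` and `y_pred` are made-up values for illustration.

```python
import numpy as np

# Toy round: true labels and one weak classifier's predictions, both in {-1, +1}.
y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, 1, 1, -1, 1])    # the third point is misclassified

w = np.full(5, 0.2)                    # uniform starting weights
err = np.sum(w[y_pred != y_true])      # weighted error: 0.2
alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this classifier: 0.5 * ln(4)

# Misclassified points are scaled up by e^{alpha}, correct ones down by e^{-alpha}.
new_w = w * np.exp(-alpha * y_true * y_pred)
new_w = new_w / new_w.sum()            # renormalize so the weights sum to 1

print(new_w)  # approximately [0.125 0.125 0.5 0.125 0.125]
```

After one round the single misclassified point carries half of the total weight, so the next weak classifier is pushed to get it right.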

**1. Import Libraries**

Let's begin by importing the libraries we need, NumPy and scikit-learn, which are required for the classification task.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score, f1_score, roc_auc_score
```

**2. Defining the AdaBoost Class**

In this step we define a custom class called AdaBoost that implements the algorithm from scratch and handles both training and prediction. It consists of a constructor, a fit() method for training, and a predict() method. We start with the constructor:

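```python
class AdaBoost:
    def __init__(self, n_estimators=50):
        self.n_estimators = n_estimators  # number of weak classifiers to train
        self.alphas = []                  # vote weight of each weak classifier
        self.models = []                  # fitted weak classifiers
```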

The constructor (`__init__`) stores the number of weak models **(n_estimators)** and initializes an empty list for the alphas **(self.alphas)** and an empty list for the weak classifiers **(self.models)**.

**3. Training the AdaBoost Model**

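In the fit() method we train the weak classifiers one after another, giving more weight to the data points that the previous classifiers misclassified and storing each classifier together with its alpha value.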
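The training loop can be sketched as follows. This is a minimal sketch rather than a reference implementation, and it rests on a few assumptions that the text does not spell out: the labels are binary 0/1 and are mapped to -1/+1 internally, the weak classifier is a depth-1 decision tree (a stump), the weighted error is clipped with a small constant so the logarithm stays finite, and fit() returns `self`. The constructor is repeated so that the block runs on its own.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


class AdaBoost:
    def __init__(self, n_estimators=50):
        self.n_estimators = n_estimators
        self.alphas = []
        self.models = []

    def fit(self, X, y):
        # Assumption: y holds binary labels 0/1; map them to -1/+1 for the update rule.
        y_signed = np.where(y == 1, 1, -1)
        n_samples = X.shape[0]
        w = np.full(n_samples, 1.0 / n_samples)  # start with uniform sample weights

        for _ in range(self.n_estimators):
            # Assumption: a depth-1 decision tree (stump) serves as the weak classifier.
            model = DecisionTreeClassifier(max_depth=1)
            model.fit(X, y, sample_weight=w)
            pred_signed = np.where(model.predict(X) == 1, 1, -1)

            # Weighted error (w sums to 1), clipped so the log below stays finite.
            err = np.clip(np.sum(w[pred_signed != y_signed]), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - err) / err)

            # Raise the weights of misclassified points, lower the rest, renormalize.
            w = w * np.exp(-alpha * y_signed * pred_signed)
            w = w / np.sum(w)

            self.models.append(model)
            self.alphas.append(alpha)
        return self
```

A classifier with a weighted error below 0.5 receives a positive alpha, so more accurate weak classifiers get a larger say in the final vote.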


**4. Defining Predict Method**

In the predict() method we combine the predictions of all weak classifiers using their respective alpha values to make the final prediction.
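The weighted vote can be sketched as below. It assumes that each weak classifier outputs 0/1 labels, which are mapped to -1/+1 before being weighted by alpha, and that a tie goes to class 1. `ThresholdStump`, `ensemble`, and `X_demo` are hypothetical stand-ins used only to show the function in isolation; in the real class, `predict` would be a method that reads `self.models` and `self.alphas`.

```python
import numpy as np
from types import SimpleNamespace


def predict(self, X):
    # Sum the alpha-weighted votes, with each weak prediction mapped from 0/1 to -1/+1.
    scores = np.zeros(X.shape[0])
    for model, alpha in zip(self.models, self.alphas):
        votes = np.where(model.predict(X) == 1, 1, -1)
        scores += alpha * votes
    # A non-negative total means class 1, otherwise class 0.
    return np.where(scores >= 0, 1, 0)


class ThresholdStump:
    """Hypothetical stand-in for a fitted weak classifier, used only for the demo."""

    def __init__(self, feature, threshold):
        self.feature = feature
        self.threshold = threshold

    def predict(self, X):
        return (X[:, self.feature] > self.threshold).astype(int)


# Demo: when the two stumps disagree, the one with the larger alpha wins.
ensemble = SimpleNamespace(
    models=[ThresholdStump(0, 0.0), ThresholdStump(1, 0.0)],
    alphas=[1.0, 0.4],
)
X_demo = np.array([[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]])
print(predict(ensemble, X_demo))  # [1 0 1]
```

To use it with the AdaBoost class, indent the function inside the class body or attach it with `AdaBoost.predict = predict`.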


**5. Example Usage**

```python
if __name__ == "__main__":
    # Generate a synthetic binary classification dataset and split it
    X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Train the ensemble
    adaboost = AdaBoost(n_estimators=50)
    adaboost.fit(X_train, y_train)

    predictions = adaboost.predict(X_test)

    # Evaluate on the held-out test set
    accuracy = accuracy_score(y_test, predictions)
    precision = precision_score(y_test, predictions)
    recall = recall_score(y_test, predictions)
    f1 = f1_score(y_test, predictions)
    try:
        # Computed from predicted labels here; ROC-AUC is usually computed from scores
        roc_auc = roc_auc_score(y_test, predictions)
    except ValueError:
        # Raised when y_test contains only one class
        roc_auc = 'Undefined (only one class present in y_test)'

    print(f"Accuracy: {accuracy * 100}%")
    print(f"Precision: {precision}")
    print(f"Recall: {recall}")
    print(f"F1 Score: {f1}")
    print(f"ROC-AUC: {roc_auc}")
```

**Output:**

[Figure: Model performance (evaluation metrics output)]

The model performs well across accuracy, precision, recall, F1 score, and ROC-AUC. Overall, these metrics indicate good performance.