Ensemble Methods in Python

Last Updated : 23 Mar, 2026

Ensemble methods in Python are machine learning techniques that combine multiple models to improve overall performance and accuracy. By aggregating predictions from different algorithms, ensemble methods help reduce errors, handle variance and produce more robust models.

Architecture of Ensemble Models

The architecture of ensemble learning defines how multiple models are organized, trained and combined to generate a final prediction. Instead of relying on a single algorithm, an ensemble architecture introduces multiple learning layers that work together to improve predictive performance, stability and generalization.

1. Base Learners

Base learners form the first layer of the ensemble system. These are individual machine learning models trained on the original dataset.

Diversity among base learners is important because combining similar models may not significantly improve performance.
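One rough way to check diversity is to measure how often two base learners disagree on the same inputs. The sketch below uses made-up prediction arrays (not output from any model in this article), and `disagreement_rate` is a hypothetical helper introduced only for illustration.

```python
import numpy as np

def disagreement_rate(pred_a, pred_b):
    """Fraction of samples on which two models predict different labels."""
    pred_a = np.asarray(pred_a)
    pred_b = np.asarray(pred_b)
    return float(np.mean(pred_a != pred_b))

# Hypothetical predictions from three base learners on the same 8 samples
preds = {
    "model_a": [0, 1, 1, 0, 1, 0, 0, 1],
    "model_b": [0, 1, 1, 0, 1, 0, 0, 1],  # identical to model_a
    "model_c": [0, 1, 0, 0, 1, 1, 0, 1],  # differs from model_a on 2 of 8 samples
}

rate_ab = disagreement_rate(preds["model_a"], preds["model_b"])
rate_ac = disagreement_rate(preds["model_a"], preds["model_c"])
print("a vs b:", rate_ab)  # 0.0
print("a vs c:", rate_ac)  # 0.25
```

A rate of 0.0 means the second model adds no new information to the ensemble, while a non-zero rate suggests the two models make at least some different mistakes.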

2. Meta Learner

The meta learner operates at the second level of the architecture and is responsible for combining predictions from base learners.

This two-level structure is commonly used in stacking, while other ensemble methods such as bagging and boosting instead change how the base learners are trained and how their outputs are aggregated.
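As a minimal sketch of this two-level layout, scikit-learn's `StackingClassifier` takes the base learners through `estimators` and the meta learner through `final_estimator`. The synthetic dataset and the particular models chosen here are assumptions made only so the example runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only so the sketch runs without external files
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Level 1: heterogeneous base learners
base_learners = [
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=42)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
]

# Level 2: meta learner trained on cross-validated base-learner predictions
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
stack_accuracy = stack.score(X_test, y_test)
print("Stacked accuracy:", stack_accuracy)
```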

Types of Ensemble Methods

Ensemble methods combine multiple models in different ways to improve predictive performance. Understanding the main types helps choose the right strategy for your specific problem and dataset.

1. Max Voting

Max voting, also known as majority voting, is an ensemble technique used primarily for classification problems. Multiple models make independent predictions, and the class that receives the most votes is selected as the final output (hard voting). In the soft voting variant, the models' predicted class probabilities are averaged and the class with the highest average probability is selected. Combining classifiers in this way tends to make predictions more stable than those of any single model.
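The difference between counting votes and averaging probabilities can be seen on a tiny hand-made example. The probabilities below are invented for illustration and do not come from any trained model; the binary 0.5 threshold is likewise an assumption of this sketch.

```python
import numpy as np

# Hypothetical class-1 probabilities from three classifiers for four samples.
# Rows are classifiers, columns are samples.
proba_class1 = np.array([
    [0.40, 0.90, 0.55, 0.45],
    [0.60, 0.80, 0.30, 0.10],
    [0.45, 0.70, 0.60, 0.99],
])

# Hard voting: each classifier casts one vote, the most frequent label wins
hard_votes = (proba_class1 >= 0.5).astype(int)
hard_result = np.array(
    [np.bincount(col, minlength=2).argmax() for col in hard_votes.T]
)
print("Hard voting:", hard_result)  # [0 1 1 0]

# Soft voting: average the probabilities first, then pick the likelier class
soft_result = (proba_class1.mean(axis=0) >= 0.5).astype(int)
print("Soft voting:", soft_result)  # [0 1 0 1]
```

The two schemes disagree on the last two samples: soft voting lets one very confident classifier outweigh two mildly confident ones, while hard voting ignores confidence entirely.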

**Step-by-Step Implementation**

Here we implement both hard voting and soft voting.

**Step 1: Load and Preprocess Data**

Load the dataset and split it into features (X) and target (y). Then we perform a train-test split and scale the features for better model performance.


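```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Replace the placeholder with the path to your CSV file
df = pd.read_csv("dataset Path")

# Separate features and target
X = df.drop('target', axis=1)
y = df['target']

# Stratified 80/20 split keeps the class proportions similar in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training set only, then apply it to the test set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```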

**Step 2: Initialize Base Classifiers**

Here we define the individual models that will form the ensemble: Logistic Regression, Decision Tree, Random Forest and XGBoost.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# Initialize base classifiers
log_reg = LogisticRegression(max_iter=300, random_state=42)
dt_clf = DecisionTreeClassifier(random_state=42)
rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
# use_label_encoder is omitted: it is deprecated and no longer needed
# in recent XGBoost releases
xgb_clf = XGBClassifier(eval_metric='logloss', random_state=42)
```

**Step 3: Train Voting Classifier**

We create a hard voting classifier and a soft voting classifier, train both on the training data and make predictions on the test set.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

# Hard Voting: each model casts one vote for a class label
hard_voting = VotingClassifier(
    estimators=[('lr', log_reg), ('dt', dt_clf), ('rf', rf_clf), ('xgb', xgb_clf)],
    voting='hard'
)
hard_voting.fit(X_train, y_train)
y_pred_hard = hard_voting.predict(X_test)
print("Hard Voting Accuracy:", accuracy_score(y_test, y_pred_hard))

# Soft Voting: predicted class probabilities are averaged across models
soft_voting = VotingClassifier(
    estimators=[('lr', log_reg), ('dt', dt_clf), ('rf', rf_clf), ('xgb', xgb_clf)],
    voting='soft'
)
soft_voting.fit(X_train, y_train)
y_pred_soft = soft_voting.predict(X_test)
print("Soft Voting Accuracy:", accuracy_score(y_test, y_pred_soft))
```

**Output:**

Hard Voting Accuracy: 1.0
Soft Voting Accuracy: 1.0

2. Averaging Method

The averaging method is an ensemble technique mainly used for regression problems. Multiple models are trained independently and their predictions are averaged to produce the final output. Averaging reduces variance because the individual models' errors partly cancel out, so the ensemble is often more accurate than its individual members.
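The variance reduction can be illustrated with a small simulation. This is a sketch under an idealized assumption: each "model" is just the true value plus independent Gaussian noise, whereas real models trained on the same data are usually correlated and gain less from averaging.

```python
import numpy as np

rng = np.random.default_rng(42)

true_value = 10.0
n_models = 5
n_trials = 10_000

# Each simulated model predicts the true value plus independent noise (std = 2.0)
predictions = true_value + rng.normal(0.0, 2.0, size=(n_trials, n_models))

single_var = predictions[:, 0].var()            # variance of one model alone
averaged_var = predictions.mean(axis=1).var()   # variance of the 5-model average

print("Variance of one model:      ", round(single_var, 3))
print("Variance of 5-model average:", round(averaged_var, 3))
```

With independent errors, the variance of the average is roughly the single-model variance divided by the number of models.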

**Implementation**

Here we build an averaging ensemble regression model using the Boston Housing dataset to improve prediction accuracy.

```python
import pandas as pd

from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Download the Boston Housing data from OpenML
boston = fetch_openml(name="boston", version=1, as_frame=True)

# Make sure every feature column and the target are numeric
X = boston.data.apply(pd.to_numeric)
y = pd.to_numeric(boston.target)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model1 = LinearRegression()
model2 = DecisionTreeRegressor(random_state=42)
model3 = RandomForestRegressor(n_estimators=100, random_state=42)

model1.fit(X_train, y_train)
model2.fit(X_train, y_train)
model3.fit(X_train, y_train)

pred1 = model1.predict(X_test)
pred2 = model2.predict(X_test)
pred3 = model3.predict(X_test)

# Final prediction: unweighted mean of the three models' predictions
y_pred = (pred1 + pred2 + pred3) / 3

r2 = r2_score(y_test, y_pred)

print("R2 Score :", r2)
```

**Output:**

R2 Score : 0.8872852109557785
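scikit-learn also provides `VotingRegressor`, which performs the same unweighted averaging inside a single estimator. The sketch below runs on synthetic data from `make_regression` (an assumption made so that no download is needed) and checks that the ensemble output matches a manual mean of its fitted members.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data so the sketch runs without a download
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

averaging_model = VotingRegressor(
    estimators=[
        ("lr", LinearRegression()),
        ("dt", DecisionTreeRegressor(random_state=42)),
        ("rf", RandomForestRegressor(n_estimators=100, random_state=42)),
    ]
)
averaging_model.fit(X_train, y_train)
ensemble_pred = averaging_model.predict(X_test)

# With default weights, the output is the plain mean of the members' predictions
manual_pred = np.mean(
    [est.predict(X_test) for est in averaging_model.estimators_], axis=0
)
print("Matches manual average:", np.allclose(ensemble_pred, manual_pred))
print("R2 Score:", averaging_model.score(X_test, y_test))
```

Wrapping the average in one estimator makes it easier to use with tools such as cross-validation or pipelines.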

3. Bagging (Bootstrap Aggregation)

Bagging improves model stability and accuracy by training multiple models on different bootstrap samples of the dataset (random subsets drawn with replacement) and aggregating their predictions. Unlike Random Forest, which also randomly selects a subset of features at each split, plain bagging by default makes all features available to each base model. Bagging is especially effective at reducing variance, which helps limit overfitting.
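The bootstrap-and-aggregate idea can be sketched by hand with NumPy and a plain decision tree. This is an illustrative simplification, not how scikit-learn implements it: `BaggingClassifier` averages predicted probabilities when the base estimator provides them, whereas this sketch uses a simple majority vote.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

rng = np.random.default_rng(42)
n_train = len(X_train)
trees = []
unique_fractions = []

for _ in range(10):
    # Bootstrap sample: draw n_train row indices with replacement
    idx = rng.integers(0, n_train, size=n_train)
    unique_fractions.append(len(np.unique(idx)) / n_train)
    tree = DecisionTreeClassifier(random_state=42)
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Aggregate: majority vote across the 10 trees for each test sample
all_preds = np.array([tree.predict(X_test) for tree in trees])
bagged_pred = np.array(
    [np.bincount(col, minlength=3).argmax() for col in all_preds.T]
)
bagged_accuracy = np.mean(bagged_pred == y_test)

print("Mean share of distinct rows per bootstrap sample:", np.mean(unique_fractions))
print("Accuracy of hand-rolled bagging:", bagged_accuracy)
```

Each bootstrap sample contains only about 63% of the distinct training rows on average, which is why the trees differ from one another even though they share the same algorithm and settings.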

**Implementation**

Here we implement the bagging ensemble technique using Decision Trees on the Iris dataset for classification.

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# `estimator` replaces the older `base_estimator` argument,
# which was renamed in scikit-learn 1.2 and later removed
bagging_model = BaggingClassifier(
    estimator=DecisionTreeClassifier(random_state=42),
    n_estimators=10,
    random_state=42
)

bagging_model.fit(X_train, y_train)

pred_final = bagging_model.predict(X_test)

accuracy = accuracy_score(y_test, pred_final)
print("Accuracy (Bagging on Iris):", accuracy)
```

**Output:**

Accuracy (Bagging on Iris): 1.0

4. Boosting

Boosting is a sequential ensemble method designed to convert a set of weak learners into a strong learner. Each new model is trained to correct the errors made by its predecessor and the final prediction is formed by a weighted combination of all models. Boosting is highly effective in reducing bias and improving predictive accuracy.

Unlike bagging, boosting trains models sequentially which allows each successive model to focus more on the difficult cases that previous models mispredicted. This makes it particularly powerful for datasets where simple models underperform.
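For squared-error regression, "correcting the errors of the predecessor" can be sketched as repeatedly fitting a small tree to the current residuals. The code below is a hand-rolled illustration on synthetic data; the noisy sine target, tree depth, learning rate and stage count are all assumptions, and library implementations such as `GradientBoostingRegressor` add many refinements on top of this idea.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = np.sort(rng.uniform(0, 6, size=200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, size=200)

learning_rate = 0.1
n_stages = 50

# Stage 0: start from a constant prediction (the mean of the targets)
current_pred = np.full_like(y, y.mean())
mse_history = [np.mean((y - current_pred) ** 2)]

for _ in range(n_stages):
    residuals = y - current_pred  # errors left by the ensemble so far
    weak_tree = DecisionTreeRegressor(max_depth=2, random_state=42)
    weak_tree.fit(X, residuals)   # the new weak learner targets those errors
    # Add a damped version of the correction to the running prediction
    current_pred += learning_rate * weak_tree.predict(X)
    mse_history.append(np.mean((y - current_pred) ** 2))

print("Training MSE before boosting:", mse_history[0])
print("Training MSE after 50 stages:", mse_history[-1])
```

The training error shrinks stage by stage because each shallow tree only has to model what the previous trees left unexplained. A low training error alone does not guarantee good test performance, so the number of stages is normally tuned on held-out data.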

**Implementation**

Here we implement the Gradient Boosting ensemble method for regression using a heart disease dataset.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import GradientBoostingRegressor

# Replace the placeholder with the path to your CSV file
df = pd.read_csv("Your dataset")

X = df.drop("target", axis=1)
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 100 shallow trees, each added with a shrinkage factor of 0.1
boosting_model = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42
)

boosting_model.fit(X_train, y_train)
pred_final = boosting_model.predict(X_test)
mse = mean_squared_error(y_test, pred_final)
print("Mean Squared Error (Boosting):", mse)
```

**Output:**

Mean Squared Error (Boosting): 0.07407866489977881

5. Stacking Ensemble Method

Stacking uses the predictions of multiple base models as input features for a meta-learner, which produces the final predictions. Unlike bagging and boosting, which usually use homogeneous base learners, stacking often uses heterogeneous models to capture diverse patterns in the data. It can be used for both classification and regression problems.
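One common way to build the meta-learner's training features is to use out-of-fold predictions, so that each training row is predicted by a model that never saw it. The sketch below shows this with scikit-learn's `cross_val_predict` on synthetic data; the dataset, the choice of base models and the refit-on-all-training-rows step for the test features are assumptions of this example rather than a description of any particular stacking library.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

X, y = make_regression(n_samples=400, n_features=10, noise=15.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

base_models = [
    LinearRegression(),
    RandomForestRegressor(n_estimators=50, random_state=42),
]
folds = KFold(n_splits=4, shuffle=True, random_state=42)

# Meta-features for training: out-of-fold predictions, one column per base model
meta_train = np.column_stack(
    [cross_val_predict(model, X_train, y_train, cv=folds) for model in base_models]
)

# Meta-features for testing: refit each base model on all training rows
meta_test = np.column_stack(
    [model.fit(X_train, y_train).predict(X_test) for model in base_models]
)

# The meta-learner sees only the base models' predictions, not the raw features
meta_model = LinearRegression()
meta_model.fit(meta_train, y_train)
stacked_mse = mean_squared_error(y_test, meta_model.predict(meta_test))
print("Mean Squared Error (manual stacking):", stacked_mse)
```

Using out-of-fold predictions matters because a base model's predictions on its own training rows are overly optimistic, which would mislead the meta-learner about how much to trust that model.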

**Implementation**

Here we build a stacking ensemble regression model using multiple base learners and a meta-learner to improve prediction accuracy.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
import xgboost as xgb
from vecstack import stacking

# Replace the placeholder with the path to your CSV file
df = pd.read_csv("Your dataset")
X = df.drop("target", axis=1)
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Level 1: heterogeneous base models
model_1 = LinearRegression()
model_2 = xgb.XGBRegressor(eval_metric='rmse', random_state=42)
model_3 = RandomForestRegressor(n_estimators=100, random_state=42)

all_models = [model_1, model_2, model_3]

# Build meta-features for the train and test sets with 4-fold cross-validation
s_train, s_test = stacking(
    all_models, X_train, y_train, X_test,
    regression=True, n_folds=4, shuffle=True, random_state=42
)

# Level 2: meta-model trained on the base models' predictions
meta_model = LinearRegression()
meta_model.fit(s_train, y_train)

pred_final = meta_model.predict(s_test)

mse = mean_squared_error(y_test, pred_final)
print("Mean Squared Error (Stacking):", mse)
```

**Output:**

Mean Squared Error (Stacking): 0.020857985206334067

6. Blending Ensemble Method

Blending is similar to stacking, but instead of using the whole training dataset for the base models, a separate validation set is held out. Base models are trained on the remaining training set, and their predictions on the validation set are used as meta-features to train a second-level model (the meta-learner). Because the meta-learner is trained only on predictions for rows the base models have not seen, this separation helps reduce overfitting and improves generalization.

**Implementation**

Here we implement a blending ensemble regression technique to improve prediction accuracy using a meta-model.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
import xgboost as xgb

# Replace the placeholder with the path to your CSV file
df = pd.read_csv("Your dataset")
X = df.drop("target", axis=1)
y = df["target"]

# Hold out 10% for testing, then split the rest into training and validation sets
X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.2222, random_state=42
)

model_1 = LinearRegression()
model_2 = xgb.XGBRegressor(eval_metric='rmse', random_state=42)
model_3 = RandomForestRegressor(n_estimators=100, random_state=42)
base_models = [model_1, model_2, model_3]

val_preds = []
test_preds = []

# Train each base model, then collect its validation and test predictions
for model in base_models:
    model.fit(X_train, y_train)
    val_preds.append(pd.DataFrame(model.predict(X_val)))
    test_preds.append(pd.DataFrame(model.predict(X_test)))

# One column of meta-features per base model
meta_X_val = pd.concat(val_preds, axis=1)
meta_X_test = pd.concat(test_preds, axis=1)

# The meta-model learns from validation-set predictions only
meta_model = LinearRegression()
meta_model.fit(meta_X_val, y_val)
final_pred = meta_model.predict(meta_X_test)

mse = mean_squared_error(y_test, final_pred)
print("Mean Squared Error (Blending):", mse)
```

**Output:**

Mean Squared Error (Blending): 0.027088923263424304
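The two `test_size` values used in the blending code (0.10, then 0.2222 of the remainder) work out to roughly a 70/20/10 split between training, validation and test data. A quick check on 1,000 placeholder rows (an arbitrary size chosen for illustration) makes the arithmetic concrete:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 1,000 placeholder rows; only the resulting split sizes matter here
X = np.arange(1000).reshape(-1, 1)
y = np.arange(1000)

X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.2222, random_state=42
)

print(len(X_train), len(X_val), len(X_test))  # 700 200 100
```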
