Data Preprocessing with Sklearn using StandardScaler and MinMaxScaler

Last Updated : 3 Oct, 2025

Data preprocessing is one of the most important steps in any machine learning pipeline. Raw data often comes with different scales, units and distributions, which can degrade model performance. Models trained with gradient descent (including Linear Regression and Logistic Regression fitted that way) and distance-based algorithms such as K-Nearest Neighbors (KNN) are particularly sensitive to the scale of input features. Feature scaling addresses this. We will explore two of the most widely used scaling techniques provided by scikit-learn: StandardScaler and MinMaxScaler.
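To see why scale matters for a distance-based method, here is a minimal sketch. The three points and their column meanings (age in years, income in dollars) are hypothetical values chosen for illustration, not data from this article.

```python
import numpy as np

# Hypothetical toy points: column 0 is age (years), column 1 is income (dollars).
a = np.array([25.0, 50_000.0])
b = np.array([60.0, 51_000.0])  # very different age, similar income
c = np.array([26.0, 58_000.0])  # similar age, different income


def euclidean(u, v):
    """Plain Euclidean distance between two 1-D arrays."""
    return float(np.sqrt(np.sum((u - v) ** 2)))


d_ab = euclidean(a, b)
d_ac = euclidean(a, c)

# Share of the squared distance a-b that comes from the income column alone.
income_share_ab = (a[1] - b[1]) ** 2 / d_ab ** 2

print(d_ab, d_ac, income_share_ab)
```

On unscaled data the income column contributes almost all of the distance, so a 35-year age gap barely registers. Bringing both columns to a comparable scale lets each feature contribute.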

1. StandardScaler

The StandardScaler transforms data such that each feature has a mean of 0 and a standard deviation of 1.

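This process is called standardization (or Z-score normalization). It is a linear transformation: it shifts and rescales each feature without changing the shape of its distribution, so that values are expressed as their distance from the mean in standard deviations. This is particularly useful when features are measured in different units or when the model is sensitive to feature scale, as with the gradient-based and distance-based algorithms mentioned above.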

**Formula:**

z=\frac{x-\mu}{\sigma}

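**Where:** x is the original feature value, μ is the mean of the feature and σ is the standard deviation of the feature.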

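**Example:**

```python
from sklearn.preprocessing import StandardScaler

# Four samples with two features each
data = [[11, 2], [3, 7], [0, 10], [11, 8]]

# Learn each column's mean and standard deviation, then transform the data
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)

print(scaled_data)
```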

**Output:**

[Screenshot: StandardScaler output]
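The z-score formula can be checked by hand against the scaler. The sketch below reuses the article's data and assumes the population standard deviation (`ddof=0`), which is what `StandardScaler` uses. The variable names `mu`, `sigma`, `manual` and `sk_result` are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([[11, 2], [3, 7], [0, 10], [11, 8]], dtype=float)

# Per-column mean and population standard deviation (ddof=0)
mu = data.mean(axis=0)
sigma = data.std(axis=0)

# Apply z = (x - mu) / sigma column by column via broadcasting
manual = (data - mu) / sigma

# Same transformation through scikit-learn
sk_result = StandardScaler().fit_transform(data)

print(np.allclose(manual, sk_result))
print(manual.mean(axis=0), manual.std(axis=0))
```

After the transformation each column should have a mean of about 0 and a standard deviation of about 1, up to floating-point error.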

Advantages

Disadvantages

2. MinMaxScaler

The MinMaxScaler rescales each feature to a fixed range, [0, 1] by default. It does not change the shape of the data's distribution; it only shifts and scales values so that the minimum feature value maps to the lower bound and the maximum maps to the upper bound.

**This is useful when** a model or downstream step expects inputs within a bounded range.

**Formula:**

First normalize to zero-one scale:

x_{std} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}

Then scale to the desired feature range (min, max):

x_{scaled} = x_{std} \times (max - min) + min

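**Where:** x is the original feature value, x_{\min} and x_{\max} are the minimum and maximum of that feature in the data the scaler was fitted on, and min and max are the lower and upper bounds of the desired feature range.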

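**Example:**

```python
from sklearn.preprocessing import MinMaxScaler

# Four samples with two features each
data = [[11, 2], [3, 7], [0, 10], [11, 8]]

# Learn each column's minimum and maximum, then rescale to [0, 1]
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(data)

print(scaled_data)
```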

**Output:**

[Screenshot: MinMaxScaler output]
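The two-step formula can be reproduced by hand and compared with the scaler. This sketch reuses the article's data; the target range of (-1, 1) is an arbitrary choice to show the second step, and the names `lo`, `hi`, `manual` and `sk_result` are illustrative.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[11, 2], [3, 7], [0, 10], [11, 8]], dtype=float)

# Desired feature range (chosen here only for illustration)
lo, hi = -1.0, 1.0

# Step 1: normalize each column to the zero-one scale
x_min = data.min(axis=0)
x_max = data.max(axis=0)
x_std = (data - x_min) / (x_max - x_min)

# Step 2: stretch and shift to the desired range
manual = x_std * (hi - lo) + lo

# Same transformation through scikit-learn's feature_range parameter
sk_result = MinMaxScaler(feature_range=(lo, hi)).fit_transform(data)

print(np.allclose(manual, sk_result))
print(manual.min(axis=0), manual.max(axis=0))
```

Each column's minimum should land on the lower bound and its maximum on the upper bound. With the default `feature_range=(0, 1)`, step 2 leaves `x_std` unchanged.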

Applications

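Feature scaling is used with algorithms that are sensitive to the scale of input features, such as gradient descent based methods, K-Nearest Neighbors (KNN), Linear Regression and Logistic Regression.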

Advantages

Disadvantages