Population Stability Index (PSI)

Last Updated : 13 Aug, 2025

The Population Stability Index (PSI) is a statistical tool used to measure how much the distribution of data changes between two datasets, usually between a training dataset and a new dataset. The PSI is widely used in credit scoring, risk management and financial modeling to ensure that models remain valid and stable. A high PSI score indicates a significant change in population distribution, while a low PSI score indicates stability.

Key Terms in PSI

Importance of PSI

Formula for PSI

The formula for calculating the Population Stability Index (PSI) is as follows:

\text{PSI} = \sum \left( (\text{Observed} - \text{Expected}) \times \ln\left(\frac{\text{Observed}}{\text{Expected}}\right) \right)

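Where Observed is the proportion of records falling in a bin in the new dataset, Expected is the proportion falling in the same bin in the baseline (for example, training) dataset, and the sum runs over all bins.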

A higher PSI indicates a greater shift between distributions, signaling potential issues like data drift.
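The formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `psi_from_proportions` and the `eps` floor for empty bins are assumptions of this sketch and do not come from the formula itself.

```python
import math


def psi_from_proportions(expected, observed, eps=1e-6):
    """Sum (observed - expected) * ln(observed / expected) over all bins.

    `eps` is an assumption of this sketch: it replaces zero proportions
    so that the logarithm stays defined when a bin is empty.
    """
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same number of bins")
    total = 0.0
    for e, o in zip(expected, observed):
        e = max(e, eps)
        o = max(o, eps)
        total += (o - e) * math.log(o / e)
    return total


# Identical distributions give a PSI of zero; a shifted one gives a positive value.
same = psi_from_proportions([0.5, 0.5], [0.5, 0.5])
shifted = psi_from_proportions([0.5, 0.5], [0.2, 0.8])
print(same, shifted)
```

Each term is a product of two factors with the same sign, so PSI is never negative, and swapping the two datasets leaves the value unchanged.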

Steps to calculate PSI

**1. Define the Bins:** First divide the data into bins based on the variable we are analyzing. For example, if we are working with age data, we could define bins for ranges such as 18-25, 26-35 and so on.

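**2. Calculate the Observed and Expected Proportions:** For each bin, the expected proportion is the share of the baseline data that falls in the bin, and the observed proportion is the share of the new data that falls in the same bin.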

**3. Apply the PSI Formula:** Use the formula to calculate the PSI contribution of each bin: multiply the difference between the observed and expected proportions by the natural logarithm of the ratio of observed to expected proportions. Sum the results for all bins.

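**4. Interpret the PSI:** The PSI score helps determine whether there has been a shift in the population. As a common rule of thumb, a PSI below 0.1 indicates no significant shift, a PSI between 0.1 and 0.25 indicates a moderate shift, and a PSI above 0.25 indicates a significant shift.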
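The first three steps can be sketched end to end from raw values. The sketch below assumes fixed bin edges chosen by hand, in the spirit of the age ranges mentioned in step 1. The helper names `bin_proportions` and `psi`, the `eps` floor for empty bins, and the sample ages are all made up for illustration.

```python
import math
from bisect import bisect_right


def bin_proportions(values, edges):
    """Share of values in each bin; `edges` are the sorted interior cut points."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) + 1)
    for v in values:
        # bisect_right puts a value equal to an edge into the bin to the right of it.
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


def psi(baseline, current, edges, eps=1e-6):
    """PSI between two samples, binned with the same edges."""
    expected = bin_proportions(baseline, edges)
    observed = bin_proportions(current, edges)
    return sum(
        (max(o, eps) - max(e, eps)) * math.log(max(o, eps) / max(e, eps))
        for e, o in zip(expected, observed)
    )


# Made-up ages, for illustration only.
train_ages = [19, 22, 24, 27, 30, 33, 34, 38, 41, 45]
new_ages = [21, 23, 28, 29, 31, 36, 39, 42, 44, 47]
edges = [26, 36]  # bins: under 26, 26-35, 36 and over

print(bin_proportions(train_ages, edges))
print(bin_proportions(new_ages, edges))
print(psi(train_ages, new_ages, edges))
```

The important design choice is that both datasets are binned with the same edges, defined once on the baseline. Re-deriving the bins from the new data would hide the very shift PSI is meant to detect.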

Let's understand the calculation of PSI with an example where we analyze the distribution of income levels in two different years, 2020 and 2021.

**Step 1: Define Bins:** Let's assume we have 3 income bins: Low Income, Medium Income and High Income.

**Step 2: Calculate Observed and Expected Proportions**

| Income Range | Expected Distribution (2020) | Observed Distribution (2021) |
|---|---|---|
| **Low Income** | 0.4 | 0.35 |
| **Medium Income** | 0.3 | 0.25 |
| **High Income** | 0.3 | 0.4 |

**Step 3: Apply the PSI formula** to each bin (Low, Medium and High Income, in that order)

\left( 0.35 - 0.4 \right) \times \ln\left(\frac{0.35}{0.4}\right) = -0.05 \times \ln(0.875) = -0.05 \times (-0.1335) = 0.006675

\left( 0.25 - 0.3 \right) \times \ln\left(\frac{0.25}{0.3}\right) = -0.05 \times \ln(0.8333) = -0.05 \times (-0.1823) = 0.009115

\left( 0.4 - 0.3 \right) \times \ln\left(\frac{0.4}{0.3}\right) = 0.1 \times \ln(1.3333) = 0.1 \times 0.2877 = 0.02877

**Step 4: Sum the Results**

\text{PSI} = 0.006675 + 0.009115 + 0.02877 = 0.04456
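The hand calculation can be checked with a short script. This is only a sketch that re-applies the PSI formula to the proportions from the income example; the variable names are chosen here for illustration.

```python
import math

# Proportions copied from the income example (2020 = expected, 2021 = observed).
expected_2020 = {"Low Income": 0.40, "Medium Income": 0.30, "High Income": 0.30}
observed_2021 = {"Low Income": 0.35, "Medium Income": 0.25, "High Income": 0.40}

# Per-bin contribution: (observed - expected) * ln(observed / expected)
contributions = {
    name: (observed_2021[name] - expected_2020[name])
    * math.log(observed_2021[name] / expected_2020[name])
    for name in expected_2020
}
psi_total = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name}: {value:.6f}")
print(f"PSI: {psi_total:.5f}")  # PSI: 0.04456
```

The per-bin values printed by the script differ from the hand-computed ones in the last digit or two, because the script does not round the logarithms to four decimal places. The total still agrees to five decimal places.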

Figure: Graphical Representation of PSI

**Step 5: Interpret the PSI:** Since the PSI value of 0.04456 is less than 0.1, we can conclude that there is no significant shift in the population between 2020 and 2021. The model is likely to remain stable and valid.
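The interpretation step can also be written as a small helper. The 0.1 cut-off is the one used in this example. The 0.25 cut-off and the label wording are assumptions of this sketch, based on a widely quoted rule of thumb rather than a fixed standard, so both thresholds are left as parameters.

```python
def interpret_psi(psi, moderate=0.1, significant=0.25):
    """Map a PSI value to a coarse label.

    The thresholds are rules of thumb (assumed defaults), not fixed standards;
    adjust them to the monitoring policy in use.
    """
    if psi < 0:
        raise ValueError("PSI cannot be negative")
    if psi < moderate:
        return "no significant shift"
    if psi < significant:
        return "moderate shift"
    return "significant shift"


print(interpret_psi(0.04456))  # no significant shift
```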

Applications of PSI