Understanding Skewness and Kurtosis in Python

Last Updated : 2 May, 2026

Understanding data distribution is essential in data analysis. Skewness helps identify whether data is symmetric or skewed, while kurtosis shows how heavy or light the tails are. In Python, these measures can be computed quickly using libraries such as SciPy and pandas.

Skewness

Skewness is a statistical measure used to describe the shape of a data distribution. It helps identify whether the distribution is symmetric or asymmetric, focusing on how data is spread around the central value rather than relying only on frequency distribution.
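As a minimal sketch of what the measure computes, the moment-based (Fisher-Pearson) coefficient of skewness is the third central moment divided by the second central moment raised to the power 1.5. The helper name `moment_skewness` and the small sample array are illustrative assumptions, not part of any library.

```python
import numpy as np
from scipy.stats import skew

def moment_skewness(values):
    """Biased Fisher-Pearson skewness: m3 / m2**1.5 (hypothetical helper)."""
    x = np.asarray(values, dtype=float)
    deviations = x - x.mean()
    m2 = np.mean(deviations ** 2)  # second central moment (population variance)
    m3 = np.mean(deviations ** 3)  # third central moment
    return m3 / m2 ** 1.5

# Made-up data with one large value on the right
sample = np.array([1, 2, 2, 3, 3, 3, 4, 10], dtype=float)

manual = moment_skewness(sample)
library = skew(sample)  # SciPy's default (bias=True) uses the same formula
print(manual, library)
```

The two numbers should agree, and the positive sign reflects the single large value pulling the right tail out.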

Types of Skewness
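The three usual cases (positive, approximately zero and negative skewness) can be illustrated with synthetic samples. The sketch below is an assumption-laden illustration: the distributions, sample size and seed are arbitrary choices, not data from this article.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

right_skewed = rng.exponential(scale=1.0, size=10_000)   # long right tail
symmetric = rng.normal(loc=0.0, scale=1.0, size=10_000)  # roughly symmetric
left_skewed = -right_skewed                              # mirror image: long left tail

for name, data in [("right-skewed", right_skewed),
                   ("symmetric", symmetric),
                   ("left-skewed", left_skewed)]:
    print(f"{name:>12}: skewness = {skew(data):.3f}")
```

Mirroring a sample flips the sign of its skewness, which is why the left-skewed case is built by negating the right-skewed one.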

Distribution Based on Skewness Value
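One common way to read a skewness value is with rule-of-thumb cutoffs. The sketch below assumes cutoffs of 0.5 and 1.0, which are a widely used convention rather than something defined in this article, and `describe_skewness` is a hypothetical helper name.

```python
def describe_skewness(value, symmetric_cutoff=0.5, high_cutoff=1.0):
    """Label a skewness value using assumed rule-of-thumb cutoffs."""
    direction = "right" if value > 0 else "left"
    magnitude = abs(value)
    if magnitude < symmetric_cutoff:
        return "approximately symmetric"
    if magnitude < high_cutoff:
        return f"moderately {direction}-skewed"
    return f"highly {direction}-skewed"

print(describe_skewness(1.618))  # roughly the SciPy value reported for diamond prices later in this article
print(describe_skewness(-0.7))
print(describe_skewness(0.1))
```

The cutoffs are exposed as parameters because different references draw these lines in different places.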

Kurtosis

Kurtosis is a statistical measure that describes how strongly a distribution is affected by extreme values (outliers). It mainly reflects the heaviness of the tails compared to a normal distribution. Although kurtosis is often associated with peak shape, it does not directly measure whether the center of the distribution is sharp or flat.
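A minimal sketch of the underlying calculation: excess kurtosis is the fourth central moment divided by the squared second central moment, minus 3 so that a normal distribution scores 0. The helper name `moment_excess_kurtosis` and the sample values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import kurtosis

def moment_excess_kurtosis(values):
    """Biased excess kurtosis: m4 / m2**2 - 3 (hypothetical helper)."""
    x = np.asarray(values, dtype=float)
    deviations = x - x.mean()
    m2 = np.mean(deviations ** 2)  # second central moment
    m4 = np.mean(deviations ** 4)  # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Made-up data: mostly small values plus two extreme ones
sample = np.array([-8, -1, 0, 0, 0, 1, 1, 9], dtype=float)

manual = moment_excess_kurtosis(sample)
library = kurtosis(sample)  # SciPy defaults: fisher=True, bias=True
print(manual, library)
```

Because deviations are raised to the fourth power, the two extreme values dominate the result, which is why kurtosis is read as a tail measure.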

Types of Kurtosis
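The three usual categories are leptokurtic (heavier tails than normal), mesokurtic (similar to normal) and platykurtic (lighter tails than normal). As a hedged illustration, the sketch below draws synthetic samples from distributions whose theoretical excess kurtosis is known; the choice of distributions, sample size and seed is an assumption made for the example.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable
n = 100_000

samples = {
    "leptokurtic (Laplace)": rng.laplace(size=n),         # theoretical excess kurtosis: 3
    "mesokurtic (normal)": rng.normal(size=n),            # theoretical excess kurtosis: 0
    "platykurtic (uniform)": rng.uniform(-1, 1, size=n),  # theoretical excess kurtosis: -1.2
}

excess = {name: kurtosis(data) for name, data in samples.items()}
for name, value in excess.items():
    print(f"{name:>24}: excess kurtosis = {value:.2f}")
```

The sample estimates will not match the theoretical values exactly, but their signs and rough sizes should.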

High Kurtosis Indicates

Low Kurtosis Indicates

How to Implement Skewness in Python

Step 1: Import Required Libraries

Here we import the necessary libraries for numerical computation, statistical analysis and visualization.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import skew
```

Step 2: Load and Prepare the Dataset

We use the built-in diamonds dataset from Seaborn and extract the price feature for analysis.

```python
# Load the diamonds dataset bundled with Seaborn and keep the price column
diamonds = sns.load_dataset("diamonds")
diamond_prices = diamonds["price"]
```

Step 3: Calculate Skewness Using SciPy

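SciPy provides a built-in skew() function to compute skewness directly.

**Syntax:**

scipy.stats.skew(array, axis=0, bias=True)

**Parameters:**

- array: input array or object containing the data.
- axis: axis along which the skewness is computed (default 0).
- bias: if False, the calculation is corrected for statistical bias (default True).

**Return Type:** The skewness of the data along the given axis.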

```python
# Skewness of diamond prices with SciPy's default settings
skewness_scipy = skew(diamond_prices)
print("Skewness using SciPy:", skewness_scipy)
```

**Output:**

Skewness using SciPy: 1.6183502776053016

A positive skewness value indicates that the distribution is right-skewed, meaning a longer tail on the right side.
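The `bias` parameter changes the result slightly. As a hedged sketch on synthetic data (a log-normal sample standing in for prices, not the diamonds dataset), setting `bias=False` applies a sample-size correction, which is also what pandas' `Series.skew()` does by default.

```python
import numpy as np
import pandas as pd
from scipy.stats import skew

rng = np.random.default_rng(seed=1)
# Synthetic, right-skewed stand-in for a price column (assumed parameters)
prices = pd.Series(rng.lognormal(mean=8.0, sigma=0.8, size=500))

biased = skew(prices.to_numpy())                  # SciPy default: bias=True
corrected = skew(prices.to_numpy(), bias=False)   # sample-size correction applied
pandas_value = prices.skew()                      # pandas applies the correction by default

print(biased, corrected, pandas_value)
```

For large samples the two versions are nearly identical, so the choice matters mainly for small datasets or when comparing numbers produced by different libraries.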

Step 4: Pearson’s Second Coefficient of Skewness

Pearson’s Second Coefficient of Skewness measures skewness using the relationship between the mean, median and standard deviation. If the mean is greater than the median, the skewness value becomes positive, indicating a right-skewed distribution, while a negative value indicates left skewness.
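Written as a formula, with $\bar{x}$ for the mean, $\tilde{x}$ for the median and $s$ for the standard deviation (symbols chosen here for illustration), the coefficient is:

```latex
\[
Sk_2 = \frac{3\,(\bar{x} - \tilde{x})}{s}
\]
```

The factor of 3 and the division by $s$ make the value unit-free, so it can be compared across features measured on different scales.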

```python
# Summary statistics needed for Pearson's second coefficient
mean_price = diamond_prices.mean()
median_price = diamond_prices.median()
std_price = diamond_prices.std()

# 3 * (mean - median) / standard deviation
pearson_skewness = (3 * (mean_price - median_price)) / std_price
print("Pearson's Second Skewness:", pearson_skewness)
```

**Output:**

Pearson's Second Skewness: 1.1518908587086387

Step 5: Visualizing Skewness

This step visualizes the distribution of diamond prices using a KDE plot and highlights the mean, median and mode with vertical lines to understand the skewness visually.

```python
plt.figure(figsize=(8, 5))
sns.kdeplot(diamond_prices)

# Mark the mean, median and mode with vertical lines
plt.axvline(mean_price, label="Mean")
plt.axvline(median_price, color="black", label="Median")
# mode() returns a Series; squeeze() reduces a single mode to a scalar
plt.axvline(diamond_prices.mode().squeeze(), color="green", label="Mode")

plt.title("Distribution of Diamond Prices (Skewness)")
plt.xlabel("Price")
plt.legend()
plt.show()
```

**Output:**

[Figure: KDE plot of diamond prices with the mean, median and mode marked by vertical lines]

Since the mean lies to the right of the median and mode, the distribution is positively skewed, confirming the numerical skewness results.
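The same mean-versus-median comparison can be checked numerically. The sketch below uses a synthetic log-normal sample as an assumed stand-in for a right-skewed feature; it does not use the diamonds data.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
# Synthetic right-skewed sample (assumed parameters)
sample = rng.lognormal(mean=8.0, sigma=0.8, size=10_000)

mean_value = sample.mean()
median_value = np.median(sample)

# In a right-skewed sample, large values pull the mean above the median
print("Mean:", mean_value)
print("Median:", median_value)
print("Mean above median:", mean_value > median_value)
```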

How to Implement Kurtosis in Python

Step 1: Import Required Library

We import the kurtosis function from scipy.stats, which calculates kurtosis directly.

```python
from scipy.stats import kurtosis
```

Step 2: Calculate Kurtosis Using SciPy

This step computes kurtosis using Fisher’s definition, where a normal distribution has a kurtosis value of 0.

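**Syntax:**

scipy.stats.kurtosis(array, axis=0, fisher=True, bias=True)

**Parameters:**

- array: input array or object containing the data.
- axis: axis along which the kurtosis is computed (default 0).
- fisher: if True, Fisher's definition is used (normal distribution gives 0); if False, Pearson's definition is used (normal distribution gives 3).
- bias: if False, the calculation is corrected for statistical bias (default True).

**Return Type:** The kurtosis of the data along the given axis.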

```python
# Excess (Fisher) kurtosis of diamond prices with SciPy's default settings
kurtosis_value = kurtosis(diamond_prices)
print("Kurtosis using SciPy:", kurtosis_value)
```

**Output:**

Kurtosis using SciPy: 2.177382669056634

Under Fisher's definition, a positive (excess) kurtosis value indicates a leptokurtic distribution, meaning heavier tails than a normal distribution and a higher presence of extreme values.
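The `fisher` flag only shifts the scale by a constant. As a small sketch on synthetic data (a Student's t sample chosen here as an assumed heavy-tailed example), the Pearson value is the Fisher value plus 3:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(seed=2)
data = rng.standard_t(df=5, size=5_000)  # synthetic heavy-tailed sample

excess = kurtosis(data, fisher=True)    # Fisher: normal distribution -> 0
pearson = kurtosis(data, fisher=False)  # Pearson: normal distribution -> 3

print(excess, pearson, pearson - excess)
```

When comparing kurtosis values from different tools, it helps to confirm which of the two definitions each one reports.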

Step 3: Kurtosis for Multiple Numerical Features

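This step calculates kurtosis for all numeric columns to compare tail behavior across features. Note that pandas' kurtosis() uses Fisher's definition with a bias correction, so its values can differ slightly from SciPy's default (bias=True) result.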

```python
# Kurtosis of every numeric column in the dataset
diamonds.select_dtypes(include="number").kurtosis()
```

**Output:**

[Image: kurtosis value for each numeric column of the diamonds dataset]

Features with higher kurtosis values have heavier tails, indicating more outliers compared to features with lower kurtosis.
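To see how column-wise pandas results relate to SciPy, the following sketch builds a small synthetic DataFrame (the column names and distributions are assumptions for illustration) and compares `DataFrame.kurtosis()` with `scipy.stats.kurtosis(..., bias=False)`.

```python
import numpy as np
import pandas as pd
from scipy.stats import kurtosis

rng = np.random.default_rng(seed=3)
frame = pd.DataFrame({
    "normal_col": rng.normal(size=1_000),
    "laplace_col": rng.laplace(size=1_000),
    "uniform_col": rng.uniform(size=1_000),
    "label": ["a"] * 1_000,  # non-numeric column, dropped by select_dtypes
})

numeric = frame.select_dtypes(include="number")
pandas_kurt = numeric.kurtosis()  # Fisher's definition, bias-corrected
scipy_kurt = pd.Series(
    kurtosis(numeric.to_numpy(), axis=0, bias=False),  # match pandas' correction
    index=numeric.columns,
)

print(pd.DataFrame({"pandas": pandas_kurt, "scipy (bias=False)": scipy_kurt}))
```

With the bias correction switched on in SciPy, the two columns of the printed table should agree.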

Step 4: Visualizing Kurtosis

A KDE plot is used to observe the peak and tail behavior of the distribution.

```python
import seaborn as sns
import matplotlib.pyplot as plt

# KDE plot of diamond prices to inspect tail behavior
plt.figure(figsize=(8, 5))
sns.kdeplot(diamond_prices, color="red")
plt.title("KDE Plot Showing Kurtosis of Diamond Prices")
plt.xlabel("Price")
plt.ylabel("Density")
plt.show()
```

**Output:**

[Figure: KDE plot of diamond prices]

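The long, heavy right tail in the plot is consistent with the positive kurtosis value computed above. As noted earlier, kurtosis reflects tail weight, so the sharpness of the peak alone should not be read as evidence of high kurtosis.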
