Kurtosis in R Programming

Last Updated : 11 Jul, 2025

Kurtosis measures a distribution's "tailedness," not its peakedness, compared to a normal distribution. A common misconception is that kurtosis indicates how tall or flat the peak of a distribution is. However, kurtosis measures the weight of the tails of the distribution and provides information about the presence of extreme outliers.

Mathematical Formula for Kurtosis

Three kurtosis measures are commonly used; each is discussed below.

1. Population Kurtosis

The formula for population kurtosis is defined as:

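K = \frac{\mu_4}{\sigma^4}

**where:**

- μ₄ is the fourth central moment of the distribution
- σ is the standard deviation of the distribution

This formula can also be expressed as:

K = \frac{E[(X - \mu)^4]}{(E[(X - \mu)^2])^2}

**where:**

- E denotes the expected value
- X is the random variable
- μ is the mean of X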
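
The definition above can be checked directly in base R. The following is a minimal sketch, not a library function: the name `population_kurtosis` is an assumption made for illustration, and both moments divide by n because the data are treated as the whole population.

```r
# Population (Pearson) kurtosis: fourth central moment over squared variance.
population_kurtosis <- function(x) {
  mu <- mean(x)
  m2 <- mean((x - mu)^2)  # population variance (divides by n, not n - 1)
  m4 <- mean((x - mu)^4)  # fourth central moment
  m4 / m2^2
}

x <- c(2, 3, 4, 4, 4, 5, 6, 7, 8, 9, 10)
k <- population_kurtosis(x)
print(k)
```

A value below 3 points to lighter tails than a normal distribution, and a value above 3 to heavier tails.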

2. Sample Kurtosis

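For a sample, the kurtosis formula adjusts for sample size. A commonly used bias-adjusted form, which estimates the excess kurtosis (kurtosis minus 3) directly, is:

G_2 = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \cdot \frac{\sum(x_i - \bar{x})^4}{s^4} - \frac{3(n-1)^2}{(n-2)(n-3)}

**where:**

- n is the sample size
- x_i is the i-th observation
- \bar{x} is the sample mean
- s is the sample standard deviation (computed with n - 1 in the denominator)

This formula reduces the bias in estimating the population kurtosis from a sample.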
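
As an illustration, one widely used bias-adjusted estimator can be written in a few lines of base R. It returns excess kurtosis (kurtosis minus 3) and corresponds to what e1071 calls `type = 2`. The function name `sample_excess_kurtosis` is an assumption made for this sketch.

```r
# Bias-adjusted sample excess kurtosis (often written G2).
sample_excess_kurtosis <- function(x) {
  n <- length(x)
  stopifnot(n > 3)                  # the denominators require at least 4 values
  s <- sd(x)                        # sample standard deviation (n - 1 denominator)
  z4 <- sum((x - mean(x))^4) / s^4  # sum of fourth powers of standardized deviations
  n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4 -
    3 * (n - 1)^2 / ((n - 2) * (n - 3))
}

x <- c(2, 3, 4, 4, 4, 5, 6, 7, 8, 9, 10)
g2_adj <- sample_excess_kurtosis(x)
print(g2_adj)
```

The correction matters most for small samples; as n grows, the result approaches the unadjusted moment ratio minus 3.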

3. Excess Kurtosis

Excess kurtosis is often used to compare the kurtosis of a distribution to that of a normal distribution (which has a kurtosis of 3). It is calculated as:

\text{Excess Kurtosis } = K -3

This adjustment allows for easier interpretation: a value of 0 indicates tails as heavy as those of a normal distribution, a positive value indicates heavier tails, and a negative value indicates lighter tails.
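
A short simulation can make the zero reference point concrete. This is a sketch under stated assumptions: the seed, the sample size, and the helper name `excess_kurtosis` are arbitrary choices. The theoretical excess kurtosis is 0 for the normal distribution, -1.2 for the uniform distribution, and 3 for the Laplace distribution, which is generated here as the difference of two exponential draws.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility

# Moment-based excess kurtosis: m4 / m2^2 - 3.
excess_kurtosis <- function(x) {
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)
  m4 / m2^2 - 3
}

n <- 1e5
results <- c(
  normal  = excess_kurtosis(rnorm(n)),           # theory: 0
  uniform = excess_kurtosis(runif(n)),           # theory: -1.2 (light tails)
  laplace = excess_kurtosis(rexp(n) - rexp(n))   # theory: 3 (heavy tails)
)
print(round(results, 2))
```

The estimates will not match the theoretical values exactly, but with a sample this large they should land close to them.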

Why is Kurtosis Important?

Kurtosis is essential in many statistical applications because:

  1. **Identifying outliers:** It helps identify whether our data contains extreme values or outliers, which might need special handling.
  2. **Financial analysis:** In finance, kurtosis is used to model the risk of extreme price movements, making it a valuable metric for portfolio management.
  3. **Assumption checking:** In many statistical models, normality assumptions are crucial. Kurtosis helps verify the extent of deviation from normality.
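
The sensitivity to outliers mentioned in the first point can be seen with a small experiment. In this sketch the value 40 is a hypothetical extreme observation added for illustration, and `excess_kurtosis` is a helper defined here rather than a package function.

```r
# Moment-based excess kurtosis: m4 / m2^2 - 3.
excess_kurtosis <- function(x) {
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)
  m4 / m2^2 - 3
}

clean        <- c(2, 3, 4, 4, 4, 5, 6, 7, 8, 9, 10)
with_outlier <- c(clean, 40)  # 40 is a hypothetical extreme value

k_clean   <- excess_kurtosis(clean)
k_outlier <- excess_kurtosis(with_outlier)
print(c(clean = k_clean, with_outlier = k_outlier))
```

Because deviations are raised to the fourth power, a single extreme point is enough to move the statistic from negative to clearly positive.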

Calculating Kurtosis in R

In R programming, we can calculate kurtosis using several packages.

1. Using e1071 Package

We calculate kurtosis using the kurtosis() function from the e1071 package and visualize the distribution using ggplot2. This helps us understand the shape and outliers in our dataset compared to a normal distribution.

install.packages("e1071")    # run once if the package is not installed
install.packages("ggplot2")  # run once if the package is not installed

library(e1071)
library(ggplot2)

data <- c(2, 3, 4, 4, 4, 5, 6, 7, 8, 9, 10)

kurt_val <- kurtosis(data)  # e1071 returns excess kurtosis (normal = 0)
cat("The kurtosis of the dataset is:", kurt_val, "\n")

df <- data.frame(value = data)

ggplot(df, aes(x = value)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 fill = "lightblue", color = "black") +  # histogram on the density scale
  geom_density(color = "red", linewidth = 1) +           # kernel density estimate
  stat_function(fun = dnorm,
                args = list(mean = mean(data), sd = sd(data)),
                color = "blue", linetype = "dashed")     # fitted normal curve

**Output:**

The kurtosis of the dataset is: -1.561636

The reported kurtosis of -1.56 indicates that the distribution is platykurtic. Because kurtosis() in e1071 returns excess kurtosis, the reference value for a normal distribution here is 0 rather than 3. A negative value means the tails are lighter than those of a normal distribution, so the dataset is less prone to producing extreme outliers.
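
The kurtosis() function in e1071 accepts a `type` argument (1, 2, or 3, with 3 as the default) that selects among three estimators, all on the excess scale. The sketch below reproduces them in base R so the differences are visible. The helper name `kurtosis_types` is an assumption, and the comparison against the package only runs if e1071 is installed.

```r
# Three common excess-kurtosis estimators, following the e1071 type numbering.
kurtosis_types <- function(x) {
  n  <- length(x)
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)
  g2 <- m4 / m2^2 - 3                                  # plain moment estimator
  c(type1 = g2,
    type2 = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3)),
    type3 = (g2 + 3) * (1 - 1 / n)^2 - 3)              # uses the n - 1 variance
}

data <- c(2, 3, 4, 4, 4, 5, 6, 7, 8, 9, 10)
base_r <- kurtosis_types(data)
print(base_r)

# Optional cross-check against the package, skipped when e1071 is unavailable.
if (requireNamespace("e1071", quietly = TRUE)) {
  pkg <- sapply(1:3, function(t) e1071::kurtosis(data, type = t))
  print(all.equal(unname(base_r), pkg))
}
```

For small samples the three types can differ noticeably, so it helps to state which one was used when reporting a value.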

2. Using moments Package

We use the kurtosis() function from the moments package to calculate kurtosis and visualize it using ggplot2 for interpretation.

install.packages("moments")  # run once if the package is not installed
install.packages("ggplot2")  # run once if the package is not installed

library(moments)
library(ggplot2)

data <- c(4, 5, 5, 6, 6, 6, 7, 8, 10, 12, 13)

kurt_val <- kurtosis(data)  # moments returns Pearson kurtosis (normal = 3)
cat("The kurtosis of the dataset is:", kurt_val, "\n")

df <- data.frame(value = data)

ggplot(df, aes(x = value)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 fill = "lightblue", color = "black") +  # histogram on the density scale
  geom_density(color = "red", linewidth = 1) +           # kernel density estimate
  stat_function(fun = dnorm,
                args = list(mean = mean(data), sd = sd(data)),
                color = "blue", linetype = "dashed")     # fitted normal curve

**Output:**

The kurtosis of the dataset is: 1.775758

Histogram and Density Plot

A kurtosis of 1.775758 indicates that the distribution is platykurtic. Unlike e1071, kurtosis() in the moments package returns Pearson's kurtosis without subtracting 3, so the reference value for a normal distribution is 3. A value below 3 means the dataset has lighter tails than a normal distribution, which in practical terms implies fewer extreme values (outliers).
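
Since the two packages report kurtosis on different scales, a small helper can keep the interpretation consistent. This is a hypothetical sketch: `classify_kurtosis` and its tolerance `tol = 0.5` are assumptions chosen for illustration, not a standard cut-off. The labels are the conventional ones: platykurtic (lighter tails than normal), mesokurtic (similar tails), and leptokurtic (heavier tails).

```r
# Label a kurtosis value; set excess = FALSE for values on the Pearson scale.
classify_kurtosis <- function(k, excess = TRUE, tol = 0.5) {
  ex <- if (excess) k else k - 3  # convert to the excess scale
  if (ex > tol) {
    "leptokurtic"
  } else if (ex < -tol) {
    "platykurtic"
  } else {
    "mesokurtic"
  }
}

# The two outputs reported in this article, each on its package's own scale.
e1071_label   <- classify_kurtosis(-1.561636, excess = TRUE)
moments_label <- classify_kurtosis(1.775758, excess = FALSE)
print(c(e1071 = e1071_label, moments = moments_label))
```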

Applications of Kurtosis

Here are the main applications of kurtosis:

  1. **Risk Management:** In finance, high kurtosis might suggest a higher risk of extreme financial losses due to the presence of outliers.
  2. **Quality Control:** Manufacturing processes often use kurtosis to monitor the consistency of products. High kurtosis in a measured characteristic indicates more frequent extreme deviations, and therefore a higher likelihood of defective products.
  3. **Machine Learning:** In the preprocessing phase of machine learning, kurtosis can help detect anomalies in the dataset.
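
As a rough illustration of the preprocessing use case, the sketch below screens the numeric columns of a data frame and flags those with high excess kurtosis. Everything in it is an assumption made for the example: the simulated columns, the injected anomalies in `spiky`, the seed, and the cut-off of 3, which would need tuning for real data.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility

# Moment-based excess kurtosis: m4 / m2^2 - 3.
excess_kurtosis <- function(x) {
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)
  m4 / m2^2 - 3
}

# Simulated feature table; 'spiky' contains five hypothetical injected anomalies.
features <- data.frame(
  gaussian = rnorm(500),
  uniform  = runif(500),
  spiky    = c(rnorm(495), 15, -18, 20, -22, 25)
)

scores    <- sapply(features, excess_kurtosis)
threshold <- 3  # hypothetical cut-off
flagged   <- names(scores)[scores > threshold]

print(round(scores, 2))
print(flagged)
```

A flagged column is only a prompt for closer inspection; whether the extreme values are errors or genuine observations has to be decided from the data's context.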