Sample Variance

Last Updated : 23 Jul, 2025

In statistics, sample variance measures how spread out the data points in a sample are around the sample mean. It is computed by summing the squared differences between each data point and the sample mean, then dividing by n - 1, where n is the sample size. This is useful when you only have a small subset (a sample) drawn from a larger group (the population): the sample variance shows how varied the sample is and serves as an estimate of the population variance.

Mathematical Definition of Sample Variance

The formula for the sample variance is given by:

s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}

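**Where:**

- s^2 is the sample variance
- x_i is the i-th data point in the sample
- \bar{x} is the sample mean
- n is the number of data points in the sample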
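To make the formula concrete, here is a minimal sketch that applies it step by step using only Python's standard library. The data values are an illustrative sample, and the variable names (`x_bar`, `s_squared`) are chosen here to mirror the symbols in the formula.

```python
from statistics import mean, variance

data = [4, 8, 6, 5, 10]  # illustrative sample

n = len(data)                                          # number of data points
x_bar = mean(data)                                     # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in data]  # (x_i - x_bar)^2 for each point
s_squared = sum(squared_deviations) / (n - 1)          # divide by n - 1, not n

print(f"Sample mean: {x_bar}")
print(f"Sample variance: {s_squared:.2f}")  # 5.80

# Cross-check against the standard library's sample variance
print(f"statistics.variance: {variance(data):.2f}")
```

The hand-computed value should agree with `statistics.variance`, which also uses the n - 1 denominator.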

Bias in Estimating Variance

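When calculating variance for a sample, dividing by n - 1 instead of n compensates for the bias that arises from using sample data to estimate the population variance. The bias appears because deviations are measured from the sample mean rather than from the true population mean. This adjustment, known as Bessel's correction, makes the sample variance an unbiased estimator of the population variance.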
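The size of the bias can be written down explicitly. As a sketch, assume the observations are independent draws from a population with finite variance \sigma^2 (a symbol introduced here for illustration). Then the estimator that divides by n satisfies:

E\left[\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = \frac{n-1}{n}\,\sigma^2

Multiplying by \frac{n}{n-1} removes the factor, which is the same as dividing the sum of squared differences by n - 1:

E\left[\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}\right] = \sigma^2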

Why Use n - 1 in Sample Variance?

We divide by n - 1 rather than n because the deviations are measured from the sample mean, which is computed from the same data and therefore sits closer to the data points, on average, than the true population mean does. Dividing by n would tend to underestimate the population variance. Dividing by n - 1 (Bessel's correction) offsets this and gives an unbiased estimate of the population variance.
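The effect of the denominator can be checked empirically. The following is a rough simulation sketch, not a proof: it assumes a normal population with variance 4, a sample size of 5, and 20,000 repeated samples, all of which are arbitrary choices made for illustration.

```python
import random
from statistics import mean, pvariance, variance

random.seed(0)        # fixed seed so the run is repeatable (arbitrary choice)
TRUE_VARIANCE = 4.0   # assumed population: normal with standard deviation 2
SAMPLE_SIZE = 5
TRIALS = 20_000

divide_by_n = []
divide_by_n_minus_1 = []
for _ in range(TRIALS):
    sample = [random.gauss(0.0, 2.0) for _ in range(SAMPLE_SIZE)]
    divide_by_n.append(pvariance(sample))         # denominator n
    divide_by_n_minus_1.append(variance(sample))  # denominator n - 1

avg_n = mean(divide_by_n)
avg_n_minus_1 = mean(divide_by_n_minus_1)

print(f"True variance:               {TRUE_VARIANCE:.2f}")
print(f"Average estimate with n:     {avg_n:.2f}")
print(f"Average estimate with n - 1: {avg_n_minus_1:.2f}")
```

Under these assumptions, the average of the n-denominator estimates should land near (n - 1)/n × 4 = 3.2, while the average of the n - 1 estimates should land near 4.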

Properties of Sample Variance

Difference Between Sample Variance and Population Variance

| Feature | Sample Variance | Population Variance |
|---|---|---|
| Denominator | n - 1 | N |
| Purpose | Estimate population variance from a sample | Exact variance for the entire population |
| Bias | Unbiased due to Bessel’s correction | No correction needed |
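The difference in denominators is easy to see on a small dataset. This sketch uses the standard library's `statistics.variance` (n - 1 denominator) and `statistics.pvariance` (N denominator). The data values are illustrative, and treating the same five numbers first as a sample and then as a whole population is only for comparison.

```python
from statistics import pvariance, variance

data = [4, 8, 6, 5, 10]  # illustrative values

sample_var = variance(data)       # sum of squared differences / (n - 1)
population_var = pvariance(data)  # sum of squared differences / N

print(f"Sample variance:     {sample_var:.2f}")      # 5.80
print(f"Population variance: {population_var:.2f}")  # 4.64
```

For the same data, the sample variance is always the larger of the two, by a factor of n / (n - 1).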

Sample Variance in Python

```python
import numpy as np

data = [4, 8, 6, 5, 10]

# Calculating sample variance.
# ddof=1 divides by n - 1; NumPy's default (ddof=0) divides by n instead.
sample_variance = np.var(data, ddof=1)
print(f"Sample Variance: {sample_variance:.2f}")
```

**Output:**

Sample Variance: 5.80

Limitations of Sample Variance