Karl Pearson's Coefficient of Correlation

Last Updated : 29 May, 2026

Pearson’s Correlation Coefficient is one of the most widely used statistical measures for determining the strength and direction of the relationship between two variables. Also known as the Product Moment Correlation, it measures how closely two variables move together. It is represented by r, a dimensionless value ranging from −1 to +1, where +1 indicates a perfect positive correlation, −1 indicates a perfect negative correlation and 0 represents no linear relationship between the variables.
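As a quick illustration of the three reference values, the sketch below applies NumPy's `np.corrcoef` to small made-up arrays. The data and variable names are assumptions chosen for illustration and do not come from this article. The third case shows that r = 0 only rules out a linear relationship: there y is completely determined by x, yet r is zero.

```python
import numpy as np

# Made-up illustrative data (an assumption, not from the article)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Perfect positive linear relationship: y rises by a fixed amount per unit of x
r_pos = np.corrcoef(x, 2 * x + 1)[0, 1]

# Perfect negative linear relationship: y falls by a fixed amount per unit of x
r_neg = np.corrcoef(x, -3 * x + 10)[0, 1]

# Symmetric quadratic relationship: y depends on x, but not linearly
centered = x - 3
r_zero = np.corrcoef(centered, centered ** 2)[0, 1]

print("positive:", round(r_pos, 6))
print("negative:", round(r_neg, 6))
print("quadratic:", round(abs(r_zero), 6))
```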

Formula

Karl~Pearson's~Coefficient~of~Correlation =\frac{Sum~of~Products~of~Deviations~from~their~respective~means}{Number~of~Pairs\times{Standard~Deviations~of~both~Series}}

Or

r=\frac{\sum{xy}}{N\times{\sigma_x}\times{\sigma_y}}

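Where x = X − X̄ and y = Y − Ȳ are the deviations of each value from the mean of its own series, N is the number of pairs of observations, and σx and σy are the standard deviations of the X and Y series.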

Example of Using Pearson’s Correlation

X 12 16 20 24 28 32 36
Y 6 9 12 15 18 21 24

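The means are X̄ = 168/7 = 24 and Ȳ = 105/7 = 15. Taking deviations x = X − 24 and y = Y − 15 gives Σxy = 336, Σx² = 448 and Σy² = 252, so: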

\begin{aligned}
\sigma_x &= \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{448}{7}} = \sqrt{64} = 8 \\
\sigma_y &= \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{252}{7}} = \sqrt{36} = 6
\end{aligned}

r = \frac{\sum{xy}}{N\times{\sigma_x}\times{\sigma_y}} = \frac{336}{7 \times 8 \times 6} = \frac{336}{336} = 1

The value r = 1 indicates a perfect positive correlation: every pair of values lies exactly on an upward-sloping straight line (here Y = 0.75X − 3).

Methods of Calculating Karl Pearson's Coefficient of Correlation

  1. Actual Mean Method
  2. Direct Method
  3. Short-Cut Method/Assumed Mean Method/Indirect Method
  4. Step-Deviation Method

1. Actual Mean Method

This method calculates correlation using deviations from the actual means of both series.

Formula:

r=\frac{\sum{xy}}{\sqrt{\sum{x^2}\times{\sum{y^2}}}}
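This form follows from the definition: since σx = √(Σx²/N) and σy = √(Σy²/N), the product N·σx·σy equals √(Σx²·Σy²). A minimal sketch of the Actual Mean Method, applied to the X and Y series from the example above, might look like this (variable names are illustrative):

```python
import math

# Data from the article's worked example
X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]

# Actual means of both series
mean_x = sum(X) / len(X)  # 24.0
mean_y = sum(Y) / len(Y)  # 15.0

# Deviations from the actual means
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

# Sums needed by the formula
sum_xy = sum(a * b for a, b in zip(x, y))  # 336.0
sum_x2 = sum(a * a for a in x)             # 448.0
sum_y2 = sum(b * b for b in y)             # 252.0

r_actual_mean = sum_xy / math.sqrt(sum_x2 * sum_y2)
print("r (actual mean method):", r_actual_mean)
```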

2. Direct Method

The Direct Method calculates correlation using the original values of the series without finding deviations separately.

Formula:

r=\frac{N\sum{XY}-\sum{X}\cdot\sum{Y}}{\sqrt{N\sum{X^2}-(\sum{X})^2}\,\sqrt{N\sum{Y^2}-(\sum{Y})^2}}
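The Direct Method can be sketched as follows, again using the X and Y series from the example above. Only raw sums are needed, so no means or deviations are computed. Variable names are illustrative.

```python
import math

# Data from the article's worked example
X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]
N = len(X)

# Raw sums of the original values
sum_x = sum(X)                               # 168
sum_y = sum(Y)                               # 105
sum_xy = sum(a * b for a, b in zip(X, Y))    # 2856
sum_x2 = sum(a * a for a in X)               # 4480
sum_y2 = sum(b * b for b in Y)               # 1827

numerator = N * sum_xy - sum_x * sum_y       # 2352
denominator = (math.sqrt(N * sum_x2 - sum_x ** 2)
               * math.sqrt(N * sum_y2 - sum_y ** 2))

r_direct = numerator / denominator
print("r (direct method):", r_direct)
```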

3. Short-Cut Method/Assumed Mean Method

This method simplifies calculations by taking deviations from assumed means instead of actual means.

Formula:

r=\frac{N\sum{dx\,dy}-\sum{dx}\cdot\sum{dy}}{\sqrt{N\sum{dx^2}-(\sum{dx})^2}\,\sqrt{N\sum{dy^2}-(\sum{dy})^2}}
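A minimal sketch of the Assumed Mean Method on the example data is shown below. The assumed means 20 and 12 are arbitrary choices made for illustration. Any values would give the same r, because the correction terms Σdx and Σdy in the formula account for the difference from the actual means.

```python
import math

# Data from the article's worked example
X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]
N = len(X)

# Assumed means (arbitrary illustrative choices, not the actual means 24 and 15)
assumed_x = 20
assumed_y = 12

# Deviations from the assumed means
dx = [xi - assumed_x for xi in X]
dy = [yi - assumed_y for yi in Y]

sum_dx = sum(dx)                                # 28
sum_dy = sum(dy)                                # 21
sum_dxdy = sum(a * b for a, b in zip(dx, dy))   # 420
sum_dx2 = sum(a * a for a in dx)                # 560
sum_dy2 = sum(b * b for b in dy)                # 315

numerator = N * sum_dxdy - sum_dx * sum_dy
denominator = (math.sqrt(N * sum_dx2 - sum_dx ** 2)
               * math.sqrt(N * sum_dy2 - sum_dy ** 2))

r_assumed_mean = numerator / denominator
print("r (assumed mean method):", r_assumed_mean)
```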

4. Step Deviation Method

The Step Deviation Method further simplifies calculations by taking deviations from an assumed mean and dividing them by a common factor C.

Formula:

r=\frac{N\sum{dx^\prime dy^\prime}-\sum{dx^\prime}\cdot\sum{dy^\prime}}{\sqrt{N\sum{dx^{\prime 2}}-(\sum{dx^\prime})^2}\,\sqrt{N\sum{dy^{\prime 2}}-(\sum{dy^\prime})^2}}
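The sketch below applies the Step Deviation Method to the example data. It assumes a separate common factor for each series (4 for X, 3 for Y) and assumed means of 24 and 15. These are illustrative choices, picked because they reduce both series to small whole numbers. The result is unaffected because r does not change when a series is shifted by a constant or divided by a positive constant.

```python
import math

# Data from the article's worked example
X = [12, 16, 20, 24, 28, 32, 36]
Y = [6, 9, 12, 15, 18, 21, 24]
N = len(X)

# Assumed means and common factors (illustrative assumptions)
assumed_x, factor_x = 24, 4
assumed_y, factor_y = 15, 3

# Step deviations: (value - assumed mean) / common factor
dx_step = [(xi - assumed_x) / factor_x for xi in X]  # -3.0 ... 3.0
dy_step = [(yi - assumed_y) / factor_y for yi in Y]  # -3.0 ... 3.0

sum_dx = sum(dx_step)
sum_dy = sum(dy_step)
sum_dxdy = sum(a * b for a, b in zip(dx_step, dy_step))
sum_dx2 = sum(a * a for a in dx_step)
sum_dy2 = sum(b * b for b in dy_step)

numerator = N * sum_dxdy - sum_dx * sum_dy
denominator = (math.sqrt(N * sum_dx2 - sum_dx ** 2)
               * math.sqrt(N * sum_dy2 - sum_dy ** 2))

r_step = numerator / denominator
print("r (step deviation method):", r_step)
```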

Python Implementation

```python
import numpy as np

# Sample data
X = np.array([12, 16, 20, 24, 28, 32, 36])
Y = np.array([6, 9, 12, 15, 18, 21, 24])

# Mean of X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)

# Deviations from mean
x = X - mean_x
y = Y - mean_y

# Standard deviations (population form: divide by N)
sigma_x = np.sqrt(np.sum(x**2) / len(X))
sigma_y = np.sqrt(np.sum(y**2) / len(Y))

# Pearson correlation coefficient
r = np.sum(x * y) / (len(X) * sigma_x * sigma_y)

print("Pearson Correlation Coefficient:", r)
```

Output:

Pearson Correlation Coefficient: 1.0
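As an optional cross-check, the same result can be obtained from NumPy's built-in routines. The sketch below uses `np.std`, whose default `ddof=0` divides by N and so matches the σ used in the formula, and `np.corrcoef`, which returns a 2×2 correlation matrix whose off-diagonal entry is r. Variable names are illustrative.

```python
import numpy as np

# Data from the article's worked example
X = np.array([12, 16, 20, 24, 28, 32, 36])
Y = np.array([6, 9, 12, 15, 18, 21, 24])

# Population standard deviations (np.std divides by N by default)
sigma_x = np.std(X)
sigma_y = np.std(Y)

# Off-diagonal entry of the correlation matrix is Pearson's r
r_numpy = np.corrcoef(X, Y)[0, 1]

print("sigma_x:", sigma_x)
print("sigma_y:", sigma_y)
print("r from np.corrcoef:", round(r_numpy, 6))
```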