Covariance and Correlation

Last Updated : 8 Apr, 2026

Covariance and correlation are two key concepts in statistics that help us analyze the relationship between two variables. Covariance measures how two variables change together, indicating whether they move in the same or opposite directions. Correlation standardizes that measure so that it also conveys how strong the relationship is.

[Figure: Relationship between independent and dependent variables]

To understand this relationship better, consider sunlight, water and soil nutrients (as shown in the image). These are independent variables that influence plant growth, which is the dependent variable.

What is Covariance

Covariance measures how two random variables change together. It is calculated by averaging the product of their deviations from their means. A positive value means they move in the same direction, while a negative value means they move in opposite directions.

  1. It can take any value from -infinity to +infinity. A negative value represents a negative relationship, whereas a positive value represents a positive relationship.
  2. Its sign indicates the direction of the linear relationship between the variables.
  3. It does not measure the strength of the relationship in a standardized way, because its magnitude depends on the units of the variables.

Covariance Formula

**1. Sample Covariance**

\text{Cov}_S(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})

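**Where:** X_i and Y_i are the i-th observed values of X and Y, \overline{X} and \overline{Y} are the sample means, and n is the number of observations in the sample.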

**2. Population Covariance**

\text{Cov}_P(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu_X)(Y_i - \mu_Y)

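**Where:** X_i and Y_i are the i-th values of X and Y in the population, \mu_X and \mu_Y are the population means, and n is the number of values in the population.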

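Types of Covariance

  1. Positive covariance: the two variables tend to increase or decrease together.
  2. Negative covariance: one variable tends to increase as the other decreases.
  3. Zero covariance: there is no linear relationship between the two variables.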

Example

[Figure: Covariance]
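As a rough illustration of the sample and population formulas, the following Python sketch computes both by averaging the products of deviations from the means. The function name `covariance`, its `sample` flag, and the data values are assumptions made up for this example; they are not taken from the original article.

```python
def covariance(x, y, sample=True):
    """Covariance of two equal-length sequences of numbers.

    sample=True divides by n - 1 (sample covariance);
    sample=False divides by n (population covariance).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of products of deviations from the means
    total = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return total / (n - 1 if sample else n)


# Made-up illustrative data (not from the article)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(covariance(x, y))                # 1.5  (sample: 6 / 4)
print(covariance(x, y, sample=False))  # 1.2  (population: 6 / 5)

# Flipping the sign of y reverses the direction of the relationship
y_neg = [-v for v in y]
print(covariance(x, y_neg))            # -1.5
```

The positive result says x and y tend to move in the same direction. The value 1.5 on its own does not say how strong that tendency is, since it would change if either variable were measured in different units.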

What is Correlation

Correlation is a standardized measure of the strength and direction of the linear relationship between two variables. It is derived from covariance and ranges between -1 and 1. Unlike covariance, which only indicates the direction of the relationship, correlation also expresses its strength on a fixed, unit-free scale.

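The correlation coefficient \rho for variables X and Y is defined as:

\rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}

where \sigma_X and \sigma_Y are the standard deviations of X and Y.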

  1. Correlation takes values between -1 and +1. Values close to +1 represent a strong positive correlation, and values close to -1 represent a strong negative correlation.
  2. A negative value means the variables are negatively related (i.e., they move in opposite directions), while a positive value means they move in the same direction.
  3. It gives both the direction and the strength of the relationship between the variables.

Correlation Formula

\text{Corr}(x, y) =\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}

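Here, x_i and y_i are the i-th observed values of x and y, \bar{x} and \bar{y} are their sample means, and n is the number of paired observations.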

Example

[Figure: Correlation]
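The correlation formula can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name `pearson_correlation` and the data are assumptions chosen for the example, not part of the original article.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of products of deviations
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator terms: sums of squared deviations
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up illustrative data (not from the article)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_correlation(x, y)
print(round(r, 4))  # 0.7746

# A perfectly linear relationship gives a correlation of 1
y_linear = [2 * v + 1 for v in x]
print(pearson_correlation(x, y_linear))  # 1.0
```

A value of about 0.77 indicates a fairly strong positive linear relationship. The result always falls between -1 and +1, which is what makes correlations comparable across data sets.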

Covariance vs. Correlation

| Covariance | Correlation |
|---|---|
| Measures how much two random variables vary together | Measures how strongly two variables are linearly related |
| Involves the relationship between two variables or data sets | Pearson correlation likewise measures the relationship between two variables |
| Lies between -infinity and +infinity | Lies between -1 and +1 |
| Unscaled measure of how the variables vary together | Scaled version of covariance |
| Provides the direction of the relationship | Provides the direction and strength of the relationship |
| Depends on the scale of the variables | Independent of the scale of the variables |
| Has dimensions (units) | Dimensionless |

The key difference is that covariance shows only the direction of the relationship between variables, while correlation shows both the direction and the strength in a standardized form.
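The scale dependence noted in the comparison can be checked with a small experiment. The sketch below converts one variable from metres to centimetres and compares the results. The helper names `sample_cov` and `pearson_r` and the height and weight values are hypothetical, invented for this illustration.

```python
import math


def sample_cov(x, y):
    """Sample covariance (divides by n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)


def pearson_r(x, y):
    """Correlation as covariance divided by the product of standard deviations."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))


# Hypothetical measurements (not from the article)
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [50, 58, 65, 77, 85]

# Same heights expressed in centimetres
heights_cm = [h * 100 for h in heights_m]

cov_m = sample_cov(heights_m, weights_kg)
cov_cm = sample_cov(heights_cm, weights_kg)
r_m = pearson_r(heights_m, weights_kg)
r_cm = pearson_r(heights_cm, weights_kg)

# Covariance grows by the unit conversion factor; correlation does not change
print(cov_m, cov_cm)
print(r_m, r_cm)
```

The covariance in centimetres is about 100 times the covariance in metres, while the two correlation values agree up to floating-point rounding.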

Applications of Covariance

Applications of Correlation