Program to Find Correlation Coefficient

Last Updated : 21 Mar, 2025

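The **correlation coefficient** is a statistical measure of the strength and direction of the linear relationship between two variables. It quantifies how changes in one variable correspond to changes in another. This coefficient, sometimes referred to as the **cross-correlation coefficient**, always lies between **-1 and +1**. Values near +1 indicate a strong positive relationship, values near -1 indicate a strong negative relationship, and values near 0 indicate little or no linear relationship.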
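To make the two extremes concrete, here is a minimal sketch that evaluates the coefficient on small made-up datasets using the sum-based formula given in the next section. The helper name `pearson_r` and the sample lists are assumptions introduced only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Hypothetical helper: correlation coefficient from running sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # y increases exactly linearly with x
falling = [10 - 3 * x for x in xs]  # y decreases exactly linearly with x

print(pearson_r(xs, rising))   # 1.0
print(pearson_r(xs, falling))  # -1.0
```

Data that only roughly follows a line would give a value strictly between these two extremes.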

**Formula for Correlation Coefficient**

The correlation coefficient (**r**) is calculated using the formula:

$$
r=\frac{n\left(\sum x y\right)-\left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n \sum x^{2}-\left(\sum x\right)^{2}\right]\left[n \sum y^{2}-\left(\sum y\right)^{2}\right]}}
$$

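Where n is the number of data pairs, Σx and Σy are the sums of the x and y values, Σxy is the sum of the products of the paired values, and Σx² and Σy² are the sums of the squared x and y values.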
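For intuition, the same quantity can also be written in terms of deviations from the means. The symbols $\bar{x}$ and $\bar{y}$ (the sample means) are introduced here as an assumption of this sketch and do not appear in the formula above. Multiplying the numerator and denominator of the deviation form by n recovers the sum-based formula:

$$
\begin{aligned}
n\sum (x_i-\bar{x})(y_i-\bar{y}) &= n\sum x_i y_i-\left(\sum x_i\right)\left(\sum y_i\right) \\
n\sum (x_i-\bar{x})^{2} &= n\sum x_i^{2}-\left(\sum x_i\right)^{2} \\
r &= \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum (x_i-\bar{x})^{2}\,\sum (y_i-\bar{y})^{2}}}
\end{aligned}
$$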

**Example Calculation**

Let's calculate the correlation coefficient for the given dataset:

| X | Y |
|---|---|
| 15 | 25 |
| 18 | 25 |
| 21 | 27 |
| 24 | 31 |
| 27 | 32 |
| ΣX = 105 | ΣY = 140 |

Additional calculations:

| X × Y | X² | Y² |
|---|---|---|
| 375 | 225 | 625 |
| 450 | 324 | 625 |
| 567 | 441 | 729 |
| 744 | 576 | 961 |
| 864 | 729 | 1024 |
| Σ(X × Y) = 3000 | ΣX² = 2295 | ΣY² = 3964 |

Now, applying the formula:
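Substituting n = 5 and the sums from the tables above into the formula, the arithmetic works out step by step as follows:

$$
\begin{aligned}
r &= \frac{5(3000)-(105)(140)}{\sqrt{\left[5(2295)-105^{2}\right]\left[5(3964)-140^{2}\right]}} \\
  &= \frac{15000-14700}{\sqrt{(11475-11025)(19820-19600)}} \\
  &= \frac{300}{\sqrt{450 \times 220}} = \frac{300}{\sqrt{99000}} \approx 0.953463
\end{aligned}
$$

A value this close to +1 indicates a strong positive relationship between X and Y in this dataset.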

**Example Inputs & Outputs**

**Example 1**

**Input:**
X = {43, 21, 25, 42, 57, 59}
Y = {99, 65, 79, 75, 87, 81}

**Output:**
r = **0.529809**

**Example 2**

**Input:**
X = {15, 18, 21, 24, 27}
Y = {25, 25, 27, 31, 32}

**Output:**
r = **0.953463**
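As an independent check on the two outputs above, the sketch below computes r from deviations about the means, which is algebraically the same quantity as the sum-based formula. The function name `correlation_from_deviations` is a hypothetical helper introduced here, not part of the original program.

```python
import math


def correlation_from_deviations(xs, ys):
    """Hypothetical helper: r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations (numerator).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (terms under the square root).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


example_1 = correlation_from_deviations([43, 21, 25, 42, 57, 59],
                                        [99, 65, 79, 75, 87, 81])
example_2 = correlation_from_deviations([15, 18, 21, 24, 27],
                                        [25, 25, 27, 31, 32])

print(f"{example_1:.6f}")  # 0.529809
print(f"{example_2:.6f}")  # 0.953463
```

Working with deviations keeps the intermediate values small, which can matter when the inputs are large; the sum-based form needs only one pass over the data.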

**Program to Compute the Correlation Coefficient**

C++

```cpp
// Program to find correlation coefficient
#include <cmath>
#include <iostream>
using namespace std;

// Function that returns the correlation coefficient.
float correlationCoefficient(int X[], int Y[], int n)
{
    int sum_X = 0, sum_Y = 0, sum_XY = 0;
    int squareSum_X = 0, squareSum_Y = 0;

    for (int i = 0; i < n; i++)
    {
        // Sum of elements of array X.
        sum_X = sum_X + X[i];

        // Sum of elements of array Y.
        sum_Y = sum_Y + Y[i];

        // Sum of X[i] * Y[i].
        sum_XY = sum_XY + X[i] * Y[i];

        // Sum of squares of array elements.
        squareSum_X = squareSum_X + X[i] * X[i];
        squareSum_Y = squareSum_Y + Y[i] * Y[i];
    }

    // Use the formula for calculating the correlation coefficient.
    float corr = (float)(n * sum_XY - sum_X * sum_Y)
                 / sqrt((n * squareSum_X - sum_X * sum_X)
                        * (n * squareSum_Y - sum_Y * sum_Y));
    return corr;
}

// Driver function
int main()
{
    int X[] = {15, 18, 21, 24, 27};
    int Y[] = {25, 25, 27, 31, 32};

    // Find the size of the array.
    int n = sizeof(X) / sizeof(X[0]);

    // Function call to correlationCoefficient.
    cout << correlationCoefficient(X, Y, n);
    return 0;
}
```

Java

```java
// Java Program to find correlation coefficient
class GFG {

    // Function that returns the correlation coefficient.
    static float correlationCoefficient(int X[], int Y[], int n)
    {
        int sum_X = 0, sum_Y = 0, sum_XY = 0;
        int squareSum_X = 0, squareSum_Y = 0;

        for (int i = 0; i < n; i++)
        {
            // Sum of elements of array X.
            sum_X = sum_X + X[i];

            // Sum of elements of array Y.
            sum_Y = sum_Y + Y[i];

            // Sum of X[i] * Y[i].
            sum_XY = sum_XY + X[i] * Y[i];

            // Sum of squares of array elements.
            squareSum_X = squareSum_X + X[i] * X[i];
            squareSum_Y = squareSum_Y + Y[i] * Y[i];
        }

        // Use the formula for calculating the correlation coefficient.
        float corr = (float)(n * sum_XY - sum_X * sum_Y) /
                     (float)(Math.sqrt((n * squareSum_X - sum_X * sum_X) *
                                       (n * squareSum_Y - sum_Y * sum_Y)));
        return corr;
    }

    // Driver function
    public static void main(String args[])
    {
        int X[] = {15, 18, 21, 24, 27};
        int Y[] = {25, 25, 27, 31, 32};

        // Find the size of the array.
        int n = X.length;

        // Function call to correlationCoefficient.
        System.out.printf("%6f", correlationCoefficient(X, Y, n));
    }
}
```

Python

```python
# Python program to find the correlation coefficient.
import math


# Function that returns the correlation coefficient.
def correlationCoefficient(X, Y, n):
    sum_X = 0
    sum_Y = 0
    sum_XY = 0
    squareSum_X = 0
    squareSum_Y = 0

    for i in range(n):
        # Sum of elements of array X.
        sum_X = sum_X + X[i]

        # Sum of elements of array Y.
        sum_Y = sum_Y + Y[i]

        # Sum of X[i] * Y[i].
        sum_XY = sum_XY + X[i] * Y[i]

        # Sum of squares of array elements.
        squareSum_X = squareSum_X + X[i] * X[i]
        squareSum_Y = squareSum_Y + Y[i] * Y[i]

    # Use the formula for calculating the correlation coefficient.
    corr = (n * sum_XY - sum_X * sum_Y) / math.sqrt(
        (n * squareSum_X - sum_X * sum_X) * (n * squareSum_Y - sum_Y * sum_Y)
    )
    return corr


# Driver code
X = [15, 18, 21, 24, 27]
Y = [25, 25, 27, 31, 32]

# Find the size of the array.
n = len(X)

# Function call to correlationCoefficient.
print('{0:.6f}'.format(correlationCoefficient(X, Y, n)))
```

C#

```csharp
// C# Program to find correlation coefficient
using System;

class GFG {

    // Function that returns the correlation coefficient.
    static float correlationCoefficient(int[] X, int[] Y, int n)
    {
        int sum_X = 0, sum_Y = 0, sum_XY = 0;
        int squareSum_X = 0, squareSum_Y = 0;

        for (int i = 0; i < n; i++)
        {
            // Sum of elements of array X.
            sum_X = sum_X + X[i];

            // Sum of elements of array Y.
            sum_Y = sum_Y + Y[i];

            // Sum of X[i] * Y[i].
            sum_XY = sum_XY + X[i] * Y[i];

            // Sum of squares of array elements.
            squareSum_X = squareSum_X + X[i] * X[i];
            squareSum_Y = squareSum_Y + Y[i] * Y[i];
        }

        // Use the formula for calculating the correlation coefficient.
        float corr = (float)(n * sum_XY - sum_X * sum_Y) /
                     (float)(Math.Sqrt((n * squareSum_X - sum_X * sum_X) *
                                       (n * squareSum_Y - sum_Y * sum_Y)));

        return corr;
    }

    // Driver function
    public static void Main()
    {
        int[] X = {15, 18, 21, 24, 27};
        int[] Y = {25, 25, 27, 31, 32};

        // Find the size of the array.
        int n = X.Length;

        // Function call to correlationCoefficient.
        Console.Write(Math.Round(correlationCoefficient(X, Y, n) *
                                 1000000.0) / 1000000.0);
    }
}
```

PHP

```php
<?php
// PHP Program to find correlation coefficient

// Function that returns the correlation coefficient.
function correlationCoefficient($X, $Y, $n)
{
    $sum_X = 0;
    $sum_Y = 0;
    $sum_XY = 0;
    $squareSum_X = 0;
    $squareSum_Y = 0;

    for ($i = 0; $i < $n; $i++)
    {
        // Sum of elements of array X.
        $sum_X = $sum_X + $X[$i];

        // Sum of elements of array Y.
        $sum_Y = $sum_Y + $Y[$i];

        // Sum of X[i] * Y[i].
        $sum_XY = $sum_XY + $X[$i] * $Y[$i];

        // Sum of squares of array elements.
        $squareSum_X = $squareSum_X + $X[$i] * $X[$i];
        $squareSum_Y = $squareSum_Y + $Y[$i] * $Y[$i];
    }

    // Use the formula for calculating the correlation coefficient.
    $corr = (float)($n * $sum_XY - $sum_X * $sum_Y) /
            sqrt(($n * $squareSum_X - $sum_X * $sum_X) *
                 ($n * $squareSum_Y - $sum_Y * $sum_Y));

    return $corr;
}

// Driver Code
$X = array(15, 18, 21, 24, 27);
$Y = array(25, 25, 27, 31, 32);

// Find the size of the array.
$n = sizeof($X);

// Function call to correlationCoefficient.
echo correlationCoefficient($X, $Y, $n);

// This code is contributed by aj_36
?>
```

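**Complexity Analysis**

**Time Complexity:** O(n), since each array is traversed once.

**Auxiliary Space:** O(1), since only a fixed number of running sums are stored.

This single-pass approach computes the correlation coefficient quickly, making it practical for analyzing relationships between datasets. Whether in statistics, finance, or other domains, understanding correlation is essential for data-driven decision-making.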