Pearson Correlation Testing in R Programming

Last Updated : 6 Aug, 2025

Pearson correlation is a parametric statistical method used to measure the linear relationship between two continuous variables. It indicates both the strength and direction of the relationship and returns a value between -1 and +1. In R Programming Language it is used to analyze the association between two normally distributed variables.
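Because the method is meant for roughly normally distributed variables, it can help to check that assumption before testing. The sketch below is one possible check and is not part of the original walkthrough: it applies base R's shapiro.test() to two columns of the built-in mtcars dataset. The choice of mpg and wt, and the name `normality`, are assumptions made for illustration.

```r
# Shapiro-Wilk normality check on two columns of the built-in mtcars dataset.
# The columns mpg and wt are chosen for illustration only.
data("mtcars")

# One p-value per column
normality <- sapply(mtcars[, c("mpg", "wt")],
                    function(v) shapiro.test(v)$p.value)
print(normality)

# A p-value above the chosen significance level (commonly 0.05) gives no
# evidence against normality. A small p-value suggests considering a
# rank-based method instead.
```

A normality test on a small sample has limited power, so a visual check such as a Q-Q plot is a common companion to it.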

There are mainly two types of correlation:

  1. **Parametric Correlation:** It measures the linear dependence between two variables (x and y). It is known as a parametric correlation test because it depends on the distribution of the data.
  2. **Non-Parametric Correlation:** Rank-based correlation coefficients, such as Spearman's rho and Kendall's tau, are known as non-parametric correlations because they do not depend on the distribution of the data.
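The two families can be compared through the method argument of cor(), which accepts "pearson", "spearman" and "kendall". The sketch below reuses the small x and y vectors from the cor() example in this article; the names `methods` and `coefs` are introduced here for illustration.

```r
# Same data, three correlation methods supported by cor()
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(1, 3, 6, 2, 7, 4, 5)

methods <- c("pearson", "spearman", "kendall")

# Named vector with one coefficient per method
coefs <- sapply(methods, function(m) cor(x, y, method = m))
print(coefs)
```

Since x and y here are already the integers 1 to 7 with no ties, each value equals its own rank, so the Pearson and Spearman coefficients coincide for this particular data. On data with outliers or a curved relationship the two usually differ.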

**Pearson Correlation Formula:**

$$\displaystyle r = \frac{\sum (x - m_x)(y - m_y)}{\sqrt{\sum (x - m_x)^2 \sum (y - m_y)^2}}$$

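**Parameters:**

- r: the Pearson correlation coefficient
- x and y: the two numeric vectors being compared
- m_x and m_y: the means of x and y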
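The formula can be evaluated term by term in base R and compared with the built-in cor() function. This is a minimal sketch using the same x and y vectors as the cor() example in this article; the variable names (`numerator`, `denominator`, `r_manual`) are chosen here for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(1, 3, 6, 2, 7, 4, 5)

# Means of the two vectors (m_x and m_y in the formula)
m_x <- mean(x)
m_y <- mean(y)

# Numerator: sum of the products of the deviations from the means
numerator <- sum((x - m_x) * (y - m_y))

# Denominator: square root of the product of the summed squared deviations
denominator <- sqrt(sum((x - m_x)^2) * sum((y - m_y)^2))

r_manual <- numerator / denominator
print(r_manual)

# Should agree with the built-in implementation
print(all.equal(r_manual, cor(x, y, method = "pearson")))
```

For these vectors the numerator is 15 and the denominator is 28, so r = 15/28, which is about 0.536.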

Implementation of Pearson Correlation Testing

We implement Pearson correlation testing in R using two primary functions:

1. Calculating the Correlation Coefficient Using cor()

We calculate the Pearson correlation coefficient between two numeric vectors using the cor() function.

```r
# Two numeric vectors of equal length
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(1, 3, 6, 2, 7, 4, 5)

# Pearson correlation coefficient
result <- cor(x, y, method = "pearson")
cat("Pearson correlation coefficient is:", result)
```

**Output:**

```
Pearson correlation coefficient is: 0.5357143
```

2. Performing Correlation Test Using cor.test()

We perform the Pearson correlation test which returns the coefficient, p-value and confidence interval.

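```r
# Two numeric vectors of equal length
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(1, 3, 6, 2, 7, 4, 5)

# Pearson correlation test: coefficient, p-value and confidence interval
result <- cor.test(x, y, method = "pearson")
print(result)
```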

**Output:**

[Image: printed output of cor.test() for x and y]

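In the output above:

- t is the value of the t-test statistic
- df is the degrees of freedom (n - 2)
- p-value is the significance level of the test
- the 95 percent confidence interval gives a range for the true correlation
- the sample estimate (cor) is the Pearson correlation coefficient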
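The object returned by cor.test() is a list, so its parts can be read directly instead of being copied from the printed output. The sketch below pulls out the main components and recomputes the t statistic and the two-sided p-value from r and n as a consistency check. The names `t_manual` and `p_manual` are introduced here for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(1, 3, 6, 2, 7, 4, 5)
result <- cor.test(x, y, method = "pearson")

r       <- unname(result$estimate)   # sample correlation coefficient
t_stat  <- unname(result$statistic)  # t statistic
df      <- unname(result$parameter)  # degrees of freedom, n - 2
p_value <- result$p.value            # two-sided p-value by default
ci      <- result$conf.int           # 95 percent confidence interval by default

# Recompute the t statistic and p-value from r and the sample size
n <- length(x)
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_manual <- 2 * pt(-abs(t_manual), df = n - 2)

cat("r =", r, " t =", t_stat, " df =", df, " p =", p_value, "\n")
cat("CI:", ci[1], "to", ci[2], "\n")
cat("t matches:", isTRUE(all.equal(t_stat, t_manual)),
    " p matches:", isTRUE(all.equal(p_value, p_manual)), "\n")
```

Reading the components this way makes it straightforward to use the p-value or the interval in later code, for example to compare against a chosen significance level.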

Implementation for Statistical Significance

We test the statistical significance of correlations using the rcorr() function from the Hmisc package and visualize relationships using ggplot2.

1. Installing and Loading Required Packages

We first install and then load the required packages. We use the built-in mtcars dataset.

```r
# Install the packages (only needed once)
install.packages("ggplot2")
install.packages("Hmisc")
install.packages("corrplot")

# Load the packages and the built-in dataset
library(ggplot2)
library(Hmisc)
library(corrplot)
data("mtcars")
```

2. Pearson Correlation Testing

We use the rcorr function to calculate Pearson correlation and p-values. It requires data in matrix form.

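```r
# rcorr() expects a matrix, so convert the selected columns first
cor_test <- rcorr(as.matrix(mtcars[, c("mpg", "wt", "hp", "disp")]),
                  type = "pearson")

cor_test$r  # matrix of correlation coefficients
cor_test$P  # matrix of p-values
```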

**Output:**

[Image: correlation matrix (cor_test$r) and p-value matrix (cor_test$P) for mpg, wt, hp and disp]
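The same kind of coefficient and p-value matrices can be built with base R alone, which may be useful as a cross-check or when Hmisc is not available. This is a sketch, not a replacement for rcorr(): it runs cor.test() on every pair of columns, and the names `vars`, `d`, `r_mat` and `p_mat` are introduced here for illustration. The results should be close to those from rcorr(), though small numerical differences are possible.

```r
data("mtcars")
vars <- c("mpg", "wt", "hp", "disp")
d <- mtcars[, vars]

# Matrix of Pearson correlation coefficients
r_mat <- cor(d, method = "pearson")

# Matrix of p-values, one cor.test() call per pair of columns.
# The diagonal is left as NA because testing a variable against itself
# is not meaningful.
p_mat <- matrix(NA_real_, nrow = length(vars), ncol = length(vars),
                dimnames = list(vars, vars))
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    if (i != j) {
      p_mat[i, j] <- cor.test(d[[i]], d[[j]], method = "pearson")$p.value
    }
  }
}

print(round(r_mat, 3))
print(signif(p_mat, 3))
```

These p-values are not adjusted for multiple comparisons. With many variables, a correction such as p.adjust() is worth considering.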

3. Scatter Plot with Regression Line

We use ggplot2 to show the correlation between two variables with a regression line.

```r
# Scatter plot of mpg against wt with a fitted linear regression line
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "blue", size = 2) +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  labs(title = "Scatter Plot with Pearson Correlation",
       x = "Weight (wt)",
       y = "Miles Per Gallon (mpg)") +
  theme_minimal()
```

**Output:**

[Image: scatter plot of Miles Per Gallon (mpg) against Weight (wt) with a red regression line]

The scatter plot shows a strong negative correlation between weight and mileage (r is about -0.87): heavier cars tend to have lower miles per gallon, as indicated by the downward-sloping red regression line.
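The link between the regression line and the correlation coefficient can be made explicit. In simple linear regression the fitted slope equals r multiplied by sd(y) / sd(x), and the R-squared of the fit equals r squared. The sketch below checks both relations on mtcars using base R only; the names `fit`, `slope_from_r` and `r_squared` are introduced here for illustration.

```r
data("mtcars")

# Pearson correlation between weight and mileage
r <- cor(mtcars$wt, mtcars$mpg, method = "pearson")

# Simple linear regression of mpg on wt, the line drawn by geom_smooth(method = "lm")
fit <- lm(mpg ~ wt, data = mtcars)
slope <- unname(coef(fit)["wt"])

# Slope reconstructed from r and the two standard deviations
slope_from_r <- r * sd(mtcars$mpg) / sd(mtcars$wt)

# R-squared of the fit, to compare with r^2
r_squared <- summary(fit)$r.squared

cat("r =", r, "\n")
cat("slope from lm() =", slope, " slope from r =", slope_from_r, "\n")
cat("R-squared =", r_squared, " r^2 =", r^2, "\n")
```

The sign of the slope always matches the sign of r, which is why a downward-sloping line goes with a negative coefficient.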