Z test in R

Last Updated : 23 Jul, 2025

The Z-test is a popular parametric test used for hypothesis testing. It assumes that the data follow a normal distribution (or that the sample is large enough for the sample mean to be approximately normal) and that the population standard deviation is known. The Z-test helps determine whether there is a significant difference between a sample mean and a hypothesized population mean, or between the means of two independent samples. It is particularly useful when the sample size is large (typically greater than 30) and the population standard deviation is known.

The computed Z-value is compared against a critical value from the standard normal distribution (or converted to a p-value) to decide whether to reject the null hypothesis. There are two main types of Z-tests:

**One-Sample Z-test:** Compares the sample mean to the population mean.
**Two-Sample Z-test:** Compares the means of two independent samples.

Under the null hypothesis, the Z-test statistic follows a standard normal distribution. This is why the test is commonly used when the sample size is large and the data meet parametric assumptions.
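As a minimal sketch of that decision rule in base R, the snippet below compares an observed Z-value with the two-sided critical value. The significance level of 0.05 and the observed value `z_obs` are assumptions chosen purely for illustration.

```r
# Significance level (an assumed, conventional choice)
alpha <- 0.05

# Two-sided critical value from the standard normal distribution (about 1.96)
z_crit <- qnorm(1 - alpha / 2)

# Hypothetical observed Z statistic, for illustration only
z_obs <- 2.3

# Two-sided p-value and the resulting decision
p_value <- 2 * pnorm(-abs(z_obs))
reject_null <- abs(z_obs) > z_crit

print(c(z_crit = z_crit, p_value = p_value))
print(reject_null)
```

Comparing `abs(z_obs)` with `z_crit` and comparing `p_value` with `alpha` always lead to the same decision.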

1. One Sample Z-test

The one-sample Z-test applies to a single sample drawn from a population. The formula is as follows:

Z = \frac{{\bar{X} - \mu}}{{\frac{\sigma}{\sqrt{n}}}}

Here,

Z denotes the Z value
\bar{X} is the sample mean
\mu denotes the population mean under the null hypothesis
\sigma denotes population standard deviation
n denotes sample size.
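The formula can be evaluated directly in base R. The following is a minimal sketch, not a replacement for a tested package: the helper name `one_sample_z` and the data values are hypothetical, and the p-value shown is for a two-sided alternative.

```r
# Hypothetical helper: one-sample Z statistic and two-sided p-value
one_sample_z <- function(x, mu, sigma) {
  n  <- length(x)
  se <- sigma / sqrt(n)          # standard error of the mean
  z  <- (mean(x) - mu) / se      # Z = (x_bar - mu) / (sigma / sqrt(n))
  list(z = z, p_value = 2 * pnorm(-abs(z)))
}

# Made-up sample, hypothesized mean 50, assumed known sigma 4
x <- c(52, 48, 55, 51, 49, 53, 50, 54)
res <- one_sample_z(x, mu = 50, sigma = 4)
print(res)
```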

2. Two-Sample Z-test

The two-sample Z-test applies to two independent samples and tests whether their means differ. The formula is as follows:

Z = \frac{{\bar{X}_1 - \bar{X}_2}}{{\sqrt{\frac{{s_1^2}}{{n_1}} + \frac{{s_2^2}}{{n_2}}}}}

Here,

\bar{X}_1 and \bar{X}_2 are the sample means.
s_1 and s_2 are the standard deviations of the two samples. When the population standard deviations \sigma_1 and \sigma_2 are known, they are used in place of s_1 and s_2.
n_1 and n_2 are the sizes of the two samples.
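A minimal base R sketch of the two-sample statistic follows. The helper name `two_sample_z`, its `delta0` argument (a hypothesized difference, which reduces to the formula above when left at 0) and the data are all assumptions made for illustration.

```r
# Hypothetical helper: two-sample Z statistic and two-sided p-value
two_sample_z <- function(x1, x2, sd1, sd2, delta0 = 0) {
  # Standard error of the difference in means
  se <- sqrt(sd1^2 / length(x1) + sd2^2 / length(x2))
  z  <- (mean(x1) - mean(x2) - delta0) / se
  list(z = z, p_value = 2 * pnorm(-abs(z)))
}

# Made-up samples with assumed known standard deviations of 2
x1 <- c(10, 12, 14, 16)
x2 <- c(9, 11, 13, 15)
res2 <- two_sample_z(x1, x2, sd1 = 2, sd2 = 2)
print(res2)
```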

**When to apply the Z-test**

The Z-test is commonly applied in the following situations:

**1. When the Population Standard Deviation is Known**

The Z-test is used when we know the population's standard deviation and are comparing a sample mean to the population mean or comparing the means of two independent samples. For example, if the mean and standard deviation of height in a population are known, the Z-test can be used to determine whether a sample of individuals has a significantly different average height.

**2. When the Sample Size is Large**

The Z-test is most reliable when working with large sample sizes, typically when n > 30. As the sample size increases, the sampling distribution of the sample mean approximates a normal distribution (Central Limit Theorem), making the Z-test a suitable choice for larger samples.
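The Central Limit Theorem claim can be checked with a small simulation. This sketch assumes an exponential population with rate 1 (mean 1, standard deviation 1), which is clearly non-normal. The sample size, number of replications and seed are arbitrary choices.

```r
set.seed(42)   # arbitrary seed, for reproducibility only

n    <- 50     # size of each sample
reps <- 5000   # number of simulated samples

# Mean of each simulated sample drawn from a skewed population
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

# The means should centre near 1 with spread near sigma / sqrt(n)
print(c(mean      = mean(sample_means),
        sd        = sd(sample_means),
        theory_sd = 1 / sqrt(n)))
```

Plotting `hist(sample_means)` should show a roughly bell-shaped histogram even though the underlying population is skewed.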

Implementation of Z test in R

We can implement the Z-test in R using the BSDA library. We can install and load this library using the install.packages() and library() functions.

```r
# Install the package once, then load it in each session
install.packages("BSDA")
library(BSDA)
```

The **syntax** of z.test() in R is:

```r
z.test(x, y = NULL, alternative = "two.sided", mu = 0, sigma.x = NULL, sigma.y = NULL, conf.level = 0.95)
```

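**Where:**

**x** is the sample data and **y** is an optional second sample for a two-sample test.
**alternative** specifies the alternative hypothesis: "two.sided", "less" or "greater".
**mu** is the population mean under the null hypothesis (for a two-sample test, the hypothesized difference in means).
**sigma.x** is the known population standard deviation for x, and **sigma.y** is the known population standard deviation for y.
**conf.level** is the confidence level of the reported interval.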
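To illustrate how the choice of alternative changes the p-value for the same Z statistic, here is a base R sketch. The helper `z_p_value` is hypothetical and is not part of BSDA. It only mirrors the standard normal tail areas associated with each alternative.

```r
# Hypothetical helper: p-value of a Z statistic under each alternative
z_p_value <- function(z, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),            # both tails
    less      = pnorm(z),                      # lower tail
    greater   = pnorm(z, lower.tail = FALSE)   # upper tail
  )
}

z <- 1.8  # made-up Z statistic
print(sapply(c("two.sided", "less", "greater"),
             function(a) z_p_value(z, a)))
```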

1. One-Sample Z-test in R

We load the BSDA package, which provides the z.test() function for performing Z-tests. We then create a sample dataset (sample_data) containing 10 data points, representing a sample from a population. Next, we conduct a one-sample Z-test with z.test(), comparing the sample mean against a hypothesized population mean of 24 and using a known population standard deviation of 10. Finally, we print the results, which include the test statistic, p-value, confidence interval and sample mean, to assess whether the sample mean differs significantly from the hypothesized population mean.

```r
library(BSDA)

# Sample of 10 observations
sample_data <- c(26, 25, 10, 34, 30, 23, 28, 29, 25, 27)

# One-sample Z-test: H0 mean = 24, known population sd = 10
z_test <- z.test(sample_data, mu = 24, sigma.x = 10)

print(z_test)
```

**Output:**

[Image: One Sample Z-test output]

We can see that:

**z = 0.53759:** The Z-test statistic, indicating how far the sample mean is from the hypothesized population mean in units of the standard error.
**p-value = 0.5909:** The p-value is much higher than 0.05, meaning the difference is **not statistically significant**.
**Alternative hypothesis: true mean is not equal to 24:** This is the two-sided alternative. Because the p-value is high, we fail to reject the null hypothesis that the true mean equals 24.
**95% confidence interval: [19.5, 31.9]:** We are 95% confident the true mean lies between 19.5 and 31.9. The hypothesized value of 24 falls inside this interval.
**Sample mean = 25.7:** The average of the sample data is 25.7.

Therefore, there is no strong evidence that the true mean differs from the hypothesized population mean of 24.
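The reported numbers can be cross-checked by hand in base R, without BSDA. This sketch simply applies the one-sample formula to the same data, hypothesized mean and standard deviation. The variable names are illustrative.

```r
sample_data <- c(26, 25, 10, 34, 30, 23, 28, 29, 25, 27)
mu0   <- 24   # hypothesized population mean
sigma <- 10   # known population standard deviation

n  <- length(sample_data)
se <- sigma / sqrt(n)                        # standard error of the mean

z       <- (mean(sample_data) - mu0) / se    # about 0.5376
p_value <- 2 * pnorm(-abs(z))                # about 0.5909

# 95% confidence interval for the true mean, roughly [19.5, 31.9]
ci <- mean(sample_data) + c(-1, 1) * qnorm(0.975) * se

print(round(c(z = z, p_value = p_value, lower = ci[1], upper = ci[2]), 4))
```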

2. Two-Sample Z-test in R

We are performing a two-sample Z-test to compare the means of two independent samples (data1 and data2). We use the z.test() function from the BSDA package, specifying the known population standard deviations (sigma.x = 10 and sigma.y = 15). Here mu = 26 is the hypothesized difference between the two population means under the null hypothesis. The output provides the Z-test statistic, p-value, 95% confidence interval for the true difference in means and the two sample means. Based on the p-value, we determine whether the observed difference differs significantly from the hypothesized difference.

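```r
library(BSDA)

# Two independent samples
data1 <- c(27, 24, 18, 29, 30, 27)
data2 <- c(23, 28, 20, 19, 35, 23)

# Two-sample Z-test: H0 difference in means = 26, known population sds 10 and 15
z_test_result <- z.test(data1, data2, mu = 26, sigma.x = 10, sigma.y = 15)

print(z_test_result)
```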

**Output:**

[Image: Two-Sample Z-test output]

We can see that:

**z = -3.3742:** The Z-test statistic, showing that the observed difference between the sample means (about 1.17) is about **3.37 standard errors** below the hypothesized difference of 26.
**p-value = 0.0007403:** The p-value is **very small** (< 0.05), indicating that the observed difference deviates from the hypothesized difference by a **statistically significant** amount.
**Alternative hypothesis: true difference in means is not equal to 26:** Since the p-value is low, we reject the null hypothesis and conclude the difference is not 26.
**95% confidence interval: [-13.26, 15.59]:** We are 95% confident that the true difference in means lies between **-13.26** and **15.59**.
**Sample means:** The mean of data1 is **25.83** and the mean of data2 is **24.67**.

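Therefore, the true difference between the two means is significantly different from the hypothesized value of 26. Note that the confidence interval contains 0, so this result does not show that the two means differ from each other.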
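The same numbers can be reproduced by hand in base R, which also makes the role of mu explicit. This sketch uses the data and standard deviations from the example. The second test, against a hypothesized difference of 0, is an addition for comparison and is not part of the original example.

```r
data1 <- c(27, 24, 18, 29, 30, 27)
data2 <- c(23, 28, 20, 19, 35, 23)
sigma_x <- 10   # known population sd for data1
sigma_y <- 15   # known population sd for data2

diff_means <- mean(data1) - mean(data2)   # about 1.17
se <- sqrt(sigma_x^2 / length(data1) + sigma_y^2 / length(data2))

# Test against the hypothesized difference of 26 (as in the example)
z_26 <- (diff_means - 26) / se            # about -3.374
p_26 <- 2 * pnorm(-abs(z_26))             # about 0.00074

# Test against a difference of 0 (added for comparison)
z_0 <- diff_means / se
p_0 <- 2 * pnorm(-abs(z_0))

print(round(c(z_26 = z_26, p_26 = p_26, z_0 = z_0, p_0 = p_0), 5))
```

With a hypothesized difference of 0 the p-value is large, which is consistent with the confidence interval containing 0.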

In this article, we explored the Z-test in R, including how to perform one-sample and two-sample Z-tests, interpret the results and determine statistical significance.