
Chi-Square Distribution in NumPy

Last Updated : 22 Apr, 2025

The **Chi-Square Distribution** with k degrees of freedom is the distribution of the sum of the squares of k independent random variables that each follow a standard normal distribution. It is used in hypothesis testing, for example to check whether observed data fits a particular distribution. In Python, you can use the numpy.random.chisquare() function to generate random numbers that follow a Chi-Square distribution.
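The sum-of-squares definition can be checked numerically. The sketch below is illustrative only: the seed, k = 3, and the sample count are arbitrary choices, and it uses NumPy's newer Generator API (np.random.default_rng), which is not part of the original article. It compares manually summed squared normals against direct Chi-Square draws, using the known facts that the mean is k and the variance is 2k.

```python
import numpy as np

# Seeded generator so the sketch is repeatable; the seed value is arbitrary.
rng = np.random.default_rng(0)

k = 3        # degrees of freedom (illustrative choice)
n = 100_000  # number of samples (illustrative choice)

# Sum the squares of k independent standard normal draws, n times over.
z = rng.standard_normal(size=(n, k))
sum_of_squares = (z ** 2).sum(axis=1)

# Draw directly from the Chi-Square distribution for comparison.
direct = rng.chisquare(df=k, size=n)

# Both sample means should be close to k, and both variances close to 2k.
print("means:    ", sum_of_squares.mean(), direct.mean())
print("variances:", sum_of_squares.var(), direct.var())
```

If the two rows of numbers agree closely, the generated samples behave as the definition predicts.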

**Syntax:** numpy.random.chisquare(df, size=None)

- **df**: Degrees of freedom (denoted by k), which controls the shape of the distribution. It must be greater than 0.
- **size**: The shape of the returned array, given as an int or a tuple of ints. With the default, None, a single value is returned when df is a scalar.
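As a rough illustration of how the two parameters interact, the sketch below calls the function with a few different size values and with an array-like df. The variable names are made up for the example, and the drawn values will differ on every run.

```python
import numpy as np

scalar = np.random.chisquare(df=2)               # size=None: a single float
vector = np.random.chisquare(df=2, size=5)       # int: 1-D array of length 5
matrix = np.random.chisquare(df=2, size=(3, 4))  # tuple: array with that shape
per_df = np.random.chisquare(df=[1, 5, 50])      # array-like df: one draw per entry

print(type(scalar).__name__)
print(vector.shape, matrix.shape, per_df.shape)
```

Passing a tuple for size is handy when you want a whole grid of samples, for example one row per simulated experiment.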

Example 1: Generate a Single Random Number

To generate a single random number from a Chi-Square Distribution with df = 2 (degrees of freedom):

```python
import numpy as np

# Draw one value from a Chi-Square distribution with 2 degrees of freedom
random_number = np.random.chisquare(df=2)
print(random_number)
```

**Output** (your value will differ, because each run draws a new random number):

4.416454073420925

Example 2: Generate an Array of Random Numbers

To generate multiple random numbers, set size to the number of values you want:

```python
import numpy as np

# Draw five values at once; the result is a 1-D NumPy array
random_numbers = np.random.chisquare(df=2, size=5)
print(random_numbers)
```

**Output:**

[0.66656494 3.55985755 1.78678662 1.53405371 4.61716372]
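The outputs above change on every run. If you need repeatable numbers, for example in a test or a tutorial, one option is a seeded generator. This is a minimal sketch that assumes NumPy's Generator API (np.random.default_rng), which the original examples do not use; the seed value 42 is arbitrary.

```python
import numpy as np

seed = 42  # arbitrary illustrative seed

# Two generators created with the same seed produce the same stream of draws.
first = np.random.default_rng(seed).chisquare(df=2, size=5)
second = np.random.default_rng(seed).chisquare(df=2, size=5)

print(first)
print(np.array_equal(first, second))  # same seed, so the arrays match
```

The exact values depend on the NumPy version and the seed, so none are shown here.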

Visualizing the Chi-Square Distribution

Visualizing the generated numbers helps to understand the behavior of the Chi-Square distribution. You can plot a histogram or a density plot using libraries like Matplotlib and Seaborn.

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = 1       # degrees of freedom
size = 1000  # number of samples to draw

data = np.random.chisquare(df=df, size=size)

# Axes-level KDE plot, so the plt.* calls below act on the same axes
sns.kdeplot(data, color='purple', label=f'Chi-Square (df={df})')

plt.title(f"Chi-Square Distribution (df={df})")
plt.xlabel("Value")
plt.ylabel("Density")
plt.legend()
plt.grid(True)

plt.show()
```

**Output:**

[Figure: Chi-Square Distribution]

The chart above shows the **shape** of the Chi-Square distribution for df = 1:

- The **x-axis** represents the generated values.
- The **y-axis** shows the **density**, that is, how concentrated the samples are around each value.
- With df = 1 the curve is **skewed to the right**: values near zero occur most often, and larger values become progressively rarer.
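The right skew can also be put into numbers. The sketch below is an illustration, not part of the original article: sample_skewness is a hypothetical helper written for this example, the seed and sample size are arbitrary, and the df values 5 and 50 are added only for comparison. It relies on the known result that the skewness of a Chi-Square distribution with k degrees of freedom is sqrt(8/k), so the skew should shrink as df grows.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for repeatability
n = 200_000                     # samples per df value (illustrative)

def sample_skewness(x):
    """Hypothetical helper: third standardized moment of the sample."""
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

skews = {}
for df in (1, 5, 50):
    data = rng.chisquare(df=df, size=n)
    skews[df] = sample_skewness(data)
    # Compare the estimate with the theoretical skewness sqrt(8 / df).
    print(f"df={df}: sample={skews[df]:.2f}, theory={np.sqrt(8 / df):.2f}")
```

A positive skewness means a long right tail; as df increases the value moves toward 0 and the curve looks more symmetric.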