Geometric Distribution in R

Last Updated : 15 Jul, 2025

The **geometric distribution** is one of the fundamental discrete probability distributions in statistics. It models the number of trials required to get the first success in a sequence of independent Bernoulli trials (i.e., trials with two possible outcomes: success and failure). In this article, we will explore the theory behind the geometric distribution, its probability mass function (PMF), its cumulative distribution function (CDF), and its applications. We will also cover how to work with the geometric distribution in R with practical examples.

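The **geometric distribution** arises when we perform a sequence of independent and identically distributed Bernoulli trials, where each trial has:

* exactly two possible outcomes, success or failure
* the same probability of success, p

Mathematically, the **probability mass function (PMF)** of a geometric distribution is given by:

$$P(X = k) = (1 - p)^{k - 1} \cdot p, \quad k = 1, 2, 3, \ldots$$

**Where:**

* X is the number of trials needed to get the first success
* k is the trial on which the first success occurs
* p is the probability of success on each trial
* (1 - p) is the probability of failure on each trial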
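As a quick sanity check of the PMF formula, the sketch below evaluates it by hand for the illustrative values k = 3 and p = 0.3 and compares the result with R's built-in dgeom(). The variable names manual and builtin are assumptions chosen for this example only.

```r
# Illustrative values (assumptions for this example)
p <- 0.3   # probability of success on each trial
k <- 3     # trial on which the first success occurs

# PMF by hand: two failures followed by one success
manual <- (1 - p)^(k - 1) * p

# dgeom() takes the number of failures before the first success, i.e. k - 1
builtin <- dgeom(k - 1, prob = p)

print(manual)
print(builtin)
print(isTRUE(all.equal(manual, builtin)))
```

Both values should agree, since 0.7 * 0.7 * 0.3 = 0.147.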

Applications of the Geometric Distribution in R

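The geometric distribution is useful in real-world scenarios that involve waiting for a first success, such as counting the items inspected until the first defective one is found or the attempts made until the first one succeeds.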

Working with the Geometric Distribution in R

R provides built-in functions for working with the geometric distribution, including calculating probabilities, computing quantiles, and generating random samples. Their output can then be plotted to visualize the distribution.

Functions in R for the Geometric Distribution

  1. dgeom(x, prob): Computes the probability mass function (PMF) at x, where x is the number of failures before the first success.
  2. pgeom(q, prob): Computes the cumulative distribution function (CDF) up to q.
  3. qgeom(p, prob): Computes the quantile function (inverse of the CDF) for a given cumulative probability p.
  4. rgeom(n, prob): Generates n random variates from the geometric distribution with success probability prob.
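One detail worth keeping in mind is that R parameterizes these functions by the number of failures before the first success, not by the number of trials, so a first success on trial k corresponds to the value k - 1. The following minimal sketch shows all four functions with that offset applied. The value p = 0.3, the seed, and the variable names are illustrative assumptions.

```r
p <- 0.3  # illustrative success probability

# P(first success on trial 1) = P(0 failures first) = p
first_trial <- dgeom(0, prob = p)

# P(first success within 3 trials) = P(at most 2 failures) = 1 - (1 - p)^3
within_three <- pgeom(2, prob = p)

# Median number of failures; add 1 to express it as a number of trials
median_trials <- qgeom(0.5, prob = p) + 1

# Five simulated trial counts (rgeom() returns failures, hence the + 1)
set.seed(1)  # arbitrary seed, used only for reproducibility
trial_counts <- rgeom(5, prob = p) + 1

print(first_trial)
print(within_three)
print(median_trials)
print(trial_counts)
```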

We will now go through each of these functions in detail.

1: Calculate Probability Mass Function (PMF) using dgeom()

The dgeom() function computes the probability mass function (PMF). Because dgeom() counts the failures before the first success, the probability of getting the first success on the k-th trial is dgeom(k - 1, prob). Let’s compute and visualize the PMF for k=1,2,…,10 with p=0.3 (probability of success).

```r
# Load necessary libraries
library(ggplot2)

# Parameters
p <- 0.3
k_values <- 1:10

# Calculate PMF (dgeom() counts failures before the first success, hence k - 1)
pmf_values <- dgeom(k_values - 1, prob = p)

# Create a data frame for plotting
pmf_data <- data.frame(k = k_values, pmf = pmf_values)

# Plot the PMF
ggplot(pmf_data, aes(x = k, y = pmf)) +
  geom_bar(stat = "identity", fill = "blue") +
  labs(title = "Geometric Distribution PMF (p = 0.3)",
       x = "Number of Trials (k)",
       y = "Probability") +
  theme_minimal()
```

**Output:**

[Figure: Probability Mass Function (PMF) calculated using dgeom()]

This plot shows the probabilities of getting the first success on each trial.
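To put numbers on the shape of this plot, the sketch below checks two properties of the PMF: each bar is (1 - p) times the height of the previous one, and the first 10 trials cover all of the probability except a tail of (1 - p)^10. The names ratios, covered, and tail_mass are assumptions introduced for this example.

```r
p <- 0.3
k_values <- 1:10
pmf_values <- dgeom(k_values - 1, prob = p)

# Ratio of each probability to the previous one; all should equal 1 - p
ratios <- pmf_values[-1] / pmf_values[-length(pmf_values)]

# Probability covered by the first 10 trials and the mass left beyond them
covered <- sum(pmf_values)
tail_mass <- 1 - covered   # equals (1 - p)^10

print(round(data.frame(k = k_values, pmf = pmf_values), 4))
print(ratios)
print(c(covered = covered, tail = tail_mass))
```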

2: Calculate Cumulative Distribution Function (CDF) using pgeom()

The pgeom() function computes the cumulative probability. Since it also counts failures, pgeom(q - 1, prob) is the probability of getting the first success on or before the q-th trial. Let’s calculate and plot the CDF for q=1,2,…,10 with p=0.3.

```r
# Calculate CDF (reuses p and k_values from the PMF example)
cdf_values <- pgeom(k_values - 1, prob = p)

# Create a data frame for plotting
cdf_data <- data.frame(k = k_values, cdf = cdf_values)

# Plot the CDF
# linewidth requires ggplot2 >= 3.4.0; use size = 1 on older versions
ggplot(cdf_data, aes(x = k, y = cdf)) +
  geom_line(color = "green", linewidth = 1) +
  geom_point(color = "green", size = 2) +
  labs(title = "Geometric Distribution CDF (p = 0.3)",
       x = "Number of Trials (q)",
       y = "Cumulative Probability") +
  theme_minimal()
```

**Output:**

[Figure: Cumulative Distribution Function (CDF) calculated using pgeom()]

This plot shows the cumulative probability of getting the first success on or before each trial.
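The CDF also has a simple closed form: the first success happens within q trials unless all q trials fail, so P(X ≤ q) = 1 - (1 - p)^q. The sketch below compares pgeom() with this closed form and with a running sum of the PMF. The variable names are assumptions for illustration.

```r
p <- 0.3
q_values <- 1:10

cdf_builtin <- pgeom(q_values - 1, prob = p)          # built-in CDF
cdf_closed  <- 1 - (1 - p)^q_values                   # closed form
cdf_summed  <- cumsum(dgeom(q_values - 1, prob = p))  # running sum of the PMF

print(isTRUE(all.equal(cdf_builtin, cdf_closed)))
print(isTRUE(all.equal(cdf_builtin, cdf_summed)))

# Probability that the first success takes more than 5 trials, i.e. (1 - p)^5
more_than_five <- pgeom(5 - 1, prob = p, lower.tail = FALSE)
print(more_than_five)
```

Using lower.tail = FALSE returns the upper tail directly, which avoids computing 1 - pgeom(...) by hand.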

3: Calculate Quantile Function (Inverse CDF) using qgeom()

The qgeom() function returns the smallest number of failures at which the CDF reaches a given cumulative probability; adding 1 converts this to the number of trials q. Let’s compute the number of trials for cumulative probabilities 0.1, 0.2, …, 0.9 with success probability p=0.3.

```r
# Cumulative probabilities for the quantile function
prob_values <- seq(0.1, 0.9, by = 0.1)

# Calculate quantiles; qgeom() returns failures, so add 1 to get trials
quantile_values <- qgeom(prob_values, prob = p) + 1

# Create a data frame for plotting
quantile_data <- data.frame(p = prob_values, quantile = quantile_values)

# Plot the quantile function
# linewidth requires ggplot2 >= 3.4.0; use size = 1 on older versions
ggplot(quantile_data, aes(x = p, y = quantile)) +
  geom_line(color = "purple", linewidth = 1) +
  geom_point(color = "purple", size = 2) +
  labs(title = "Geometric Distribution Quantile Function (p = 0.3)",
       x = "Cumulative Probability (p)",
       y = "Number of Trials (q)") +
  theme_minimal()
```

**Output:**

[Figure: Quantile Function (Inverse CDF) calculated using qgeom()]

This plot shows the smallest number of trials needed for the cumulative probability of a first success to reach each level.
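The relationship between qgeom() and pgeom() can be checked with a round trip: the returned trial count should reach the target cumulative probability, while one trial fewer should fall short. The sketch below does this for three illustrative targets and also compares against the closed form obtained by solving 1 - (1 - p)^q ≥ u for q. The vector u and the other names are assumptions for this example.

```r
p <- 0.3
u <- c(0.25, 0.5, 0.9)   # illustrative target cumulative probabilities

# Smallest number of trials whose CDF reaches each target
trials_needed <- qgeom(u, prob = p) + 1

# CDF at that trial count, and at one trial fewer
reached  <- pgeom(trials_needed - 1, prob = p)
one_less <- pgeom(trials_needed - 2, prob = p)

# Closed form for these targets: ceiling(log(1 - u) / log(1 - p))
closed_form <- ceiling(log1p(-u) / log1p(-p))

print(data.frame(u, trials_needed, closed_form, reached, one_less))
```

For example, with p = 0.3 two trials are needed to reach a 50% chance of success, because one trial gives only 0.3 while two give 1 - 0.7^2 = 0.51.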

4: Random Sampling from Geometric Distribution using rgeom()

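The rgeom() function generates random samples from a geometric distribution. Each value it returns is the number of failures before the first success, so adding 1 gives the number of trials. Let’s generate 1000 random samples and visualize their distribution using a histogram.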

```r
# Generate random samples
set.seed(123)  # arbitrary seed so the histogram is reproducible
n <- 1000
# rgeom() returns failures before the first success; add 1 to get trials
random_samples <- rgeom(n, prob = p) + 1

# Create a data frame for plotting
random_data <- data.frame(samples = random_samples)

# Plot histogram of random samples
ggplot(random_data, aes(x = samples)) +
  geom_histogram(binwidth = 1, fill = "pink", color = "black") +
  labs(title = "Histogram of Random Samples from Geometric Distribution (p = 0.3)",
       x = "Number of Trials",
       y = "Frequency") +
  theme_minimal()
```

**Output:**

[Figure: Random sampling from the geometric distribution using rgeom()]

This histogram shows the frequency distribution of the number of trials required to get the first success in 1000 random samples.
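One way to judge whether such a sample looks right is to compare it with theory. For the number of trials until the first success, the mean is 1/p and the variance is (1 - p)/p^2. The sketch below compares these with the sample statistics and sets the observed frequencies for the first five trial counts against the exact PMF. The seed and variable names are arbitrary assumptions, and the sample values will differ slightly from the theoretical ones on every run.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility
p <- 0.3
n <- 1000

# Number of trials until the first success (rgeom() returns failures)
trials <- rgeom(n, prob = p) + 1

# Sample statistics next to their theoretical counterparts
sample_mean      <- mean(trials)
theoretical_mean <- 1 / p
sample_var       <- var(trials)
theoretical_var  <- (1 - p) / p^2

# Observed share of samples at 1 to 5 trials versus the exact PMF
empirical   <- as.numeric(table(factor(trials, levels = 1:5))) / n
theoretical <- dgeom(0:4, prob = p)

print(c(sample_mean = sample_mean, theoretical_mean = theoretical_mean))
print(c(sample_var = sample_var, theoretical_var = theoretical_var))
print(round(rbind(empirical, theoretical), 3))
```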