Empirical Distribution Function (EDF)

Last Updated : 23 Jul, 2025

The Empirical Distribution Function (EDF) is an estimate of the cumulative distribution function (CDF) built directly from a sample. At any value, it gives the proportion of data points in the sample that are less than or equal to that value. The theoretical CDF rests on assumptions about the population distribution, whereas the EDF is derived from observed data, which makes it flexible and widely applicable to real-world datasets. The EDF is defined as:

F_n(x) = \frac{\text{Number of points} \leq x}{n}

Where n is the total number of data points in the sample.

In simplified terms the EDF at any given point x represents the fraction of data points that are smaller than or equal to that value.
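The definition can be sketched directly in Python. This is a minimal illustration rather than a reference implementation; the function name `edf` and its arguments are chosen here for the example and do not come from any particular library.

```python
def edf(sample, x):
    """Return the fraction of values in `sample` that are <= x."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must contain at least one value")
    # Count the data points that are less than or equal to x.
    count = sum(1 for value in sample if value <= x)
    return count / n
```

Calling `edf(sample, x)` with an `x` below every data point gives 0.0, and with an `x` at or above the largest data point gives 1.0.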

Key Characteristics

Step-by-Step Process for Calculating the EDF

**1. Sort the Data:** Sort the values in increasing order.

**2. Compute the EDF:** For each sorted value x_i, compute the cumulative probability:

F_n(x_i) = \frac{i}{n}

Where i is the rank of the data point x_i in the sorted list, i.e., the number of values less than or equal to x_i.

**3. Graph the EDF:** The EDF can be plotted as a step function, with the sorted data values on the x-axis and the cumulative probability on the y-axis.

Example to show EDF Plotting
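The sorting and ranking steps can be sketched as a short Python function. The name `edf_points` is hypothetical. The rank is computed with `bisect_right` from the standard library, so that tied values all receive the count of values less than or equal to them, which is one reasonable way to handle repeated data points.

```python
from bisect import bisect_right


def edf_points(sample):
    """Return (sorted_values, cumulative_probabilities) for a sample."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must contain at least one value")
    # Step 1: sort the values in increasing order.
    ordered = sorted(sample)
    # Step 2: for each x_i, the rank i is the number of values <= x_i.
    # bisect_right returns exactly that count, including for ties.
    probabilities = [bisect_right(ordered, x) / n for x in ordered]
    return ordered, probabilities
```

The two returned lists are the x and y coordinates needed for the step plot in the third step.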

Example

**1. Data Sample:** Suppose we have a dataset of 5 values [2, 9, 12, 7, 5].

**2. Sort the Data:** Sort the data in ascending order [2, 5, 7, 9, 12].

**3. Compute the EDF:** Let's calculate the EDF for each sorted value, using n = 5:

F_n(2) = \frac{1}{5} = 0.2

F_n(5) = \frac{2}{5} = 0.4

F_n(7) = \frac{3}{5} = 0.6

F_n(9) = \frac{4}{5} = 0.8

F_n(12) = \frac{5}{5} = 1.0
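The same calculation for the five-value sample can be reproduced with a few lines of Python. This sketch assumes all values are distinct, as they are in this dataset, so the rank of each value is simply its position in the sorted list.

```python
data = [2, 9, 12, 7, 5]
n = len(data)

# Pair each sorted value with its rank i (starting at 1) and compute i / n.
edf_values = {x: rank / n for rank, x in enumerate(sorted(data), start=1)}

for x, p in edf_values.items():
    print(f"F_n({x}) = {p}")
# Prints:
# F_n(2) = 0.2
# F_n(5) = 0.4
# F_n(7) = 0.6
# F_n(9) = 0.8
# F_n(12) = 1.0
```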

**4. Plot the EDF:**

Representation of the Result of EDF on given dataset
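A plot of this kind could be produced along the following lines. This is only a sketch: it assumes matplotlib is installed, the helper names `step_coordinates` and `plot_edf` are made up for the example, and it is not necessarily how the original figure was generated. A starting point at height 0 is prepended so that the first jump is drawn.

```python
def step_coordinates(sample):
    """Return x and y coordinates for drawing the EDF as a step function."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must contain at least one value")
    # Start at height 0 at the smallest value so the first jump is visible.
    xs = [ordered[0]] + ordered
    ys = [0.0] + [i / n for i in range(1, n + 1)]
    return xs, ys


def plot_edf(sample):
    """Draw the EDF of `sample`; assumes matplotlib is available."""
    import matplotlib.pyplot as plt  # imported here so the helper above has no dependency

    xs, ys = step_coordinates(sample)
    fig, ax = plt.subplots()
    # where="post" keeps each level constant until the next data value.
    ax.step(xs, ys, where="post")
    ax.set_xlabel("Value")
    ax.set_ylabel("Cumulative probability")
    ax.set_title("Empirical distribution function")
    return ax
```

Calling `plot_edf([2, 9, 12, 7, 5])` followed by `matplotlib.pyplot.show()` should display a staircase rising from 0 to 1 in steps of 0.2.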

Applications of the EDF

Example to show EDF Comparison

Limitations of EDF