arviz.plot_ecdf - ArviZ dev documentation
arviz.plot_ecdf(values, values2=None, eval_points=None, cdf=None, difference=False, confidence_bands=False, ci_prob=None, num_trials=500, rvs=None, random_state=None, figsize=None, fill_band=True, plot_kwargs=None, fill_kwargs=None, plot_outline_kwargs=None, ax=None, show=None, backend=None, backend_kwargs=None, npoints=100, pointwise=False, fpr=None, pit=False, **kwargs)
Plot an ECDF or ECDF-difference plot with confidence bands.
Plots the empirical cumulative distribution function (ECDF) of an array. Optionally, a cdf argument representing a reference CDF may be provided for comparison using an ECDF-difference plot and/or confidence bands.
Alternatively, the probability integral transform (PIT) for a single dataset may be visualized.
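As a rough illustration of the two quantities described above, the sketch below computes an ECDF and an ECDF difference with NumPy alone. It is not ArviZ's implementation: the helper name `ecdf_at`, the Uniform(0, 1) data and the uniform reference CDF are all assumptions chosen for the example.

```python
import numpy as np


def ecdf_at(values, eval_points):
    """Fraction of `values` that are <= each evaluation point (hypothetical helper)."""
    sorted_values = np.sort(np.asarray(values))
    # side="right" also counts values equal to the evaluation point
    counts = np.searchsorted(sorted_values, eval_points, side="right")
    return counts / sorted_values.size


rng = np.random.default_rng(0)
sample = rng.uniform(0.0, 1.0, size=1000)  # assumed data: Uniform(0, 1) draws
eval_points = np.linspace(0.0, 1.0, 11)

ecdf = ecdf_at(sample, eval_points)
reference_cdf = np.clip(eval_points, 0.0, 1.0)  # CDF of Uniform(0, 1)
ecdf_difference = ecdf - reference_cdf  # the curve an ECDF-difference plot shows
```

If the sample really comes from the reference distribution, the difference curve should fluctuate around zero, which is usually easier to judge by eye than two nearly overlapping step functions.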
Parameters:
values : array_like
Values to plot from an unknown continuous or discrete distribution.
values2 : array_like, optional
Values to compare to the original sample.
Deprecated since version 0.18.0: Instead use cdf=scipy.stats.ecdf(values2).cdf.evaluate.
cdf : callable, optional
Cumulative distribution function of the distribution to compare the original sample against. The function must take as input a NumPy array of draws from the distribution.
difference : bool, default False
If True, plot an ECDF-difference plot; otherwise plot an ECDF plot.
confidence_bands : str or bool
- False: No confidence bands are plotted (default).
- True: Plot bands computed with the default algorithm (subject to change).
- “pointwise”: Compute the pointwise (i.e. marginal) confidence band.
- “optimized”: Use optimization to estimate a simultaneous confidence band.
- “simulated”: Use Monte Carlo simulation to estimate a simultaneous confidence band.
For simultaneous confidence bands to be correctly calibrated, provide eval_points that do not depend on the values.
ci_prob : float, default 0.94
The probability that the true ECDF lies within the confidence band. If confidence_bands is “pointwise”, this is the marginal probability instead of the joint probability.
eval_points : array_like, optional
The points at which to evaluate the ECDF. If None, npoints uniformly spaced points between the data bounds will be used.
rvs : callable, optional
A function that takes an integer ndraws and optionally the object passed to random_state and returns an array of ndraws samples from the same distribution as the original dataset. Required if confidence_bands is “simulated” and the variable is discrete.
random_state : int, numpy.random.Generator or numpy.random.RandomState, optional
num_trials : int, default 500
The number of random ECDFs to generate for constructing simultaneous confidence bands (if confidence_bands is “simulated”).
figsize : (float, float), optional
Figure size. If None, it will be defined automatically.
fill_band : bool, default True
If True, fill the area between the band limits to mark the confidence interval. Otherwise, plot only the border lines.
plot_kwargs : dict, optional
Additional kwargs passed to matplotlib.pyplot.step() or bokeh.plotting.figure.step().
fill_kwargs : dict, optional
Additional kwargs passed to matplotlib.pyplot.fill_between() or bokeh.plotting.Figure.varea().
plot_outline_kwargs : dict, optional
Additional kwargs passed to matplotlib.axes.Axes.plot() or bokeh.plotting.Figure.line().
ax : axes, optional
Matplotlib axes or bokeh figures.
show : bool, optional
Call backend show function.
backend : {“matplotlib”, “bokeh”}, default “matplotlib”
Select plotting backend.
backend_kwargs : dict, optional
These are kwargs specific to the backend being used, passed to matplotlib.pyplot.subplots() or bokeh.plotting.figure. For additional documentation check the plotting method of the backend.
npoints : int, default 100
The number of evaluation points for the ECDF or ECDF-difference plots, if eval_points is not provided or pit is True.
Deprecated since version 0.18.0: Instead specify eval_points=np.linspace(np.min(values), np.max(values), npoints) unless pit is True.
pointwise : bool, default False
Deprecated since version 0.18.0: Instead use confidence_bands="pointwise".
fpr : float, optional
Deprecated since version 0.18.0: Instead use ci_prob=1-fpr.
pit : bool, default False
If True, plot the ECDF or ECDF-difference of the PIT of the sample.
Deprecated since version 0.18.0: See the PIT example below instead.
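The deprecation notes above each name a replacement argument. A minimal sketch of that mapping follows, assuming a hypothetical helper `migrate_kwargs` that is not part of ArviZ. `empirical_cdf` is a NumPy-only stand-in for `scipy.stats.ecdf(values2).cdf.evaluate`, used here only to keep the example free of extra dependencies. The `pit` argument has no one-to-one replacement and is left out.

```python
import numpy as np


def empirical_cdf(sample):
    """Return a callable ECDF of `sample` (stand-in for scipy.stats.ecdf(sample).cdf.evaluate)."""
    sorted_sample = np.sort(np.asarray(sample))

    def cdf(x):
        return np.searchsorted(sorted_sample, x, side="right") / sorted_sample.size

    return cdf


def migrate_kwargs(values, values2=None, npoints=None, pointwise=False, fpr=None):
    """Translate deprecated plot_ecdf arguments into their replacements (hypothetical helper)."""
    new_kwargs = {}
    if values2 is not None:
        new_kwargs["cdf"] = empirical_cdf(values2)
    if npoints is not None:
        new_kwargs["eval_points"] = np.linspace(np.min(values), np.max(values), npoints)
    if pointwise:
        new_kwargs["confidence_bands"] = "pointwise"
    if fpr is not None:
        new_kwargs["ci_prob"] = 1 - fpr
    return new_kwargs


rng = np.random.default_rng(1)
values = rng.normal(size=200)
values2 = rng.normal(size=200)
new_kwargs = migrate_kwargs(values, values2=values2, npoints=50, pointwise=True, fpr=0.05)
# The result could then be forwarded as az.plot_ecdf(values, **new_kwargs).
```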
Returns:
axes : matplotlib Axes or Bokeh Figure
Notes
This plot computes the confidence bands with the simulation-based algorithm presented in [1].
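To see why a pointwise band and a simultaneous band differ, the following sketch simulates ECDFs of uniform samples and measures how often a pointwise band contains the whole curve. It is an illustration under assumed settings (100 draws, 2000 trials, 19 evaluation points), not the algorithm of [1] and not ArviZ's code.

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws, n_trials, ci_prob = 100, 2000, 0.94
probs = np.linspace(0.05, 0.95, 19)  # reference CDF values at the evaluation points

# Simulate ECDFs of Uniform(0, 1) samples at the evaluation points.
uniform_draws = rng.uniform(size=(n_trials, n_draws))
ecdfs = (uniform_draws[:, :, None] <= probs).mean(axis=1)  # shape (n_trials, 19)

# Pointwise band: central ci_prob interval taken separately at each point.
tail = (1 - ci_prob) / 2
lower, upper = np.quantile(ecdfs, [tail, 1 - tail], axis=0)

inside = (ecdfs >= lower) & (ecdfs <= upper)
marginal_coverage = inside.mean(axis=0)  # per-point coverage, roughly ci_prob or a bit above
joint_coverage = inside.all(axis=1).mean()  # whole-curve coverage, noticeably lower
```

Because the joint coverage of a pointwise band falls short of ci_prob, a simultaneous band has to be wider at every point to reach the same probability for the curve as a whole.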
References
[1] Säilynoja, T., Bürkner, P.C. and Vehtari, A. (2022). Graphical Test for Discrete Uniformity and its Applications in Goodness of Fit Evaluation and Multiple Sample Comparison. Statistics and Computing, 32(32).
Examples
In a future release, the default behaviour of plot_ecdf will change. To maintain the original behaviour you should do:
import arviz as az
import numpy as np
from scipy.stats import uniform, norm

sample = norm(0, 1).rvs(1000)
npoints = 100
az.plot_ecdf(sample, eval_points=np.linspace(sample.min(), sample.max(), npoints))
However, seeing this warning does not indicate that anything is wrong. If you are happy to get different behaviour as ArviZ improves and adds new algorithms, you can ignore it like so:
import warnings

warnings.filterwarnings("ignore", category=az.utils.BehaviourChangeWarning)
Plot an ECDF plot for a given sample evaluated at the sample points. This will become the new behaviour when eval_points is not provided:
az.plot_ecdf(sample, eval_points=np.unique(sample))
Plot an ECDF plot with confidence bands for comparing a given sample to a given distribution. We manually specify evaluation points independent of the values so that the confidence bands are correctly calibrated.
distribution = norm(0, 1)
eval_points = np.linspace(*distribution.ppf([0.001, 0.999]), 100)
az.plot_ecdf(
    sample, eval_points=eval_points,
    cdf=distribution.cdf, confidence_bands=True
)
Plot an ECDF-difference plot with confidence bands for comparing a given sample to a given distribution.
az.plot_ecdf(
    sample, cdf=distribution.cdf,
    confidence_bands=True, difference=True
)
Plot an ECDF plot with confidence bands for the probability integral transform (PIT) of a continuous sample. If drawn from the reference distribution, the PIT values should be uniformly distributed.
pit_vals = distribution.cdf(sample)
uniform_dist = uniform(0, 1)
az.plot_ecdf(
    pit_vals, cdf=uniform_dist.cdf, confidence_bands=True,
)
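The claim that PIT values of a correctly specified sample are uniform can be checked numerically without plotting. The sketch below assumes a standard normal reference and uses the standard library's `statistics.NormalDist` in place of `scipy.stats.norm`; the seed and grid are arbitrary choices for the example.

```python
from statistics import NormalDist

import numpy as np

rng = np.random.default_rng(7)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)

# PIT: push each draw through the CDF of the assumed reference distribution.
reference = NormalDist(mu=0.0, sigma=1.0)
pit_vals = np.array([reference.cdf(float(x)) for x in sample])

# ECDF of the PIT values on a grid that does not depend on the sample.
eval_points = np.linspace(0.0, 1.0, 101)
pit_ecdf = np.searchsorted(np.sort(pit_vals), eval_points, side="right") / pit_vals.size

# The Uniform(0, 1) CDF is the identity on [0, 1], so the difference should stay near 0.
max_abs_difference = np.max(np.abs(pit_ecdf - eval_points))
```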
Plot an ECDF-difference plot of PIT values.
az.plot_ecdf(
    pit_vals, cdf=uniform_dist.cdf,
    confidence_bands=True, difference=True
)
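For a discrete sample with confidence_bands set to “simulated”, the rvs parameter described above expects a callable that draws from the reference distribution. A minimal sketch of such a callable is shown below, assuming a Poisson(3) reference. The name `poisson_rvs` is hypothetical, and this version only handles an integer seed or a numpy.random.Generator as random_state.

```python
import numpy as np


def poisson_rvs(ndraws, random_state=None):
    """Draw `ndraws` samples from an assumed Poisson(3) reference (hypothetical example)."""
    # default_rng accepts None, an int seed, or an existing Generator
    rng = np.random.default_rng(random_state)
    return rng.poisson(lam=3.0, size=ndraws)


# The same seed gives the same draws, which keeps simulated bands reproducible.
draws_a = poisson_rvs(500, random_state=123)
draws_b = poisson_rvs(500, random_state=123)
```

Such a function would be passed as rvs=poisson_rvs, together with a matching cdf and a random_state.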