The fallacy of placing confidence in confidence intervals

Richard D Morey et al. Psychon Bull Rev. 2016 Feb.

Abstract

Interval estimates - estimates of parameters that include an allowance for sampling uncertainty - have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated samples, on average. The width of confidence intervals is thought to index the precision of an estimate; CIs are thought to be a guide to which parameter values are plausible or reasonable; and the confidence coefficient of the interval (e.g., 95 %) is thought to index the plausibility that the true parameter is included in the interval. We show in a number of examples that CIs do not necessarily have any of these properties, and can lead to unjustified or arbitrary inferences. For this reason, we caution against relying upon confidence interval theory to justify interval estimates, and suggest that other theories of interval estimation should be used instead.

Keywords: Bayesian inference and parameter estimation; Bayesian statistics; Statistical inference; Statistics.

Figures

Fig. 1

Submersible rescue attempts. Note that likelihood and CIs are depicted from bottom to top in the order in which they are described in the text. See text for details. An interactive version of this figure is available at http://learnbayes.org/redirects/CIshiny1.html

Fig. 2

Left: Possible locations of the first (y1) and second (y2) bubbles. Right: y2 − y1 plotted against the mean of y1 and y2. Shaded regions show the areas where the respective 50 % confidence interval contains the true value. The figures in the top row (a, b) show the sampling distribution interval; the middle row (c, d) shows the NP and UMP intervals; the bottom row (e, f) shows the Bayes interval. Points ‘a’ and ‘b’ represent the pairs of bubbles from Fig. 1a and b, respectively. An interactive version of this figure is available at http://learnbayes.org/redirects/CIshiny1.html
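
The coverage pattern described in this caption can be explored with a small simulation. The sketch below is illustrative only and rests on assumptions not stated on this page: each bubble is drawn uniformly within 5 meters of the hatch location (suggested by the Fig. 5 caption), and the interval studied is simply the span between the two bubbles, used here as one example of a 50 % procedure. The function name `simulate_coverage` and its parameters are hypothetical.

```python
import random


def simulate_coverage(theta=0.0, half_width=5.0, n=20000, seed=1):
    """Estimate how often the interval [min(y1, y2), max(y1, y2)] contains theta.

    Assumed model (not taken verbatim from the source): y1 and y2 are
    independent draws from Uniform(theta - half_width, theta + half_width).
    Returns (overall coverage, coverage among wide intervals, share of wide intervals).
    """
    rng = random.Random(seed)
    hits = 0        # intervals that contain theta
    wide = 0        # intervals wider than half_width
    wide_hits = 0   # wide intervals that contain theta
    for _ in range(n):
        y1 = rng.uniform(theta - half_width, theta + half_width)
        y2 = rng.uniform(theta - half_width, theta + half_width)
        lo, hi = min(y1, y2), max(y1, y2)
        covered = lo < theta < hi
        hits += covered
        if hi - lo > half_width:
            wide += 1
            wide_hits += covered
    return hits / n, wide_hits / wide, wide / n


if __name__ == "__main__":
    overall, wide_cov, wide_frac = simulate_coverage()
    print(f"overall coverage: {overall:.3f}")
    print(f"coverage when bubbles are more than 5 m apart: {wide_cov:.3f}")
    print(f"share of such samples: {wide_frac:.3f}")
```

Under these assumptions the long-run coverage is close to one half, yet any sample whose bubbles lie more than 5 meters apart must bracket the hatch, so the nominal 50 % figure does not describe what is known after seeing a particular sample.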

Fig. 3

The relationship between CI width and the uncertainty in the estimation of the hatch location for the four confidence procedures. SD: Sampling distribution procedure; NP: Nonparametric procedure; UMP: UMP procedure; B: Bayes procedure. Note that the NP and UMP procedures overlap when the width of the likelihood is >5. An interactive version of this figure is available at http://learnbayes.org/redirects/CIshiny1.html

Fig. 4

The probability that each confidence procedure includes false values θ′. T: Trivial procedure; SD: Sampling distribution procedure; NP: Nonparametric procedure; UMP: UMP procedure; B: Bayes procedure. The line for the sampling distribution procedure (dashed line) is between the lines for the Bayes procedure and the UMP procedure. An interactive version of this figure is available at http://learnbayes.org/redirects/CIshiny1.html

Fig. 5

Forming Bayesian credible intervals. Prior information (top) is combined with the likelihood information from the data (middle) to yield a posterior distribution (bottom). In the likelihood plots, the shaded regions show the locations within 5 meters of each bubble; the dark shaded regions show where these overlap, indicating the possible location of the hatch θ. In the posterior plots, the central 50 % region (shaded region within posterior) shows one possible 50 % credible interval, the central credible interval. An interactive version of this figure is available at http://learnbayes.org/redirects/CIshiny1.html
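
The prior-times-likelihood construction in this caption can be sketched numerically with a grid approximation. This is a minimal illustration, not the authors' code: it assumes the likelihood is flat wherever the hatch lies within 5 meters of both bubbles and zero elsewhere (as the caption's overlap region suggests), and the names `central_credible_interval` and `normal_prior`, along with the prior's mean and standard deviation, are hypothetical.

```python
import math


def normal_prior(t, mean=0.0, sd=2.0):
    """Hypothetical informative prior: an unnormalized normal density."""
    return math.exp(-0.5 * ((t - mean) / sd) ** 2)


def central_credible_interval(y1, y2, prior=None, half_width=5.0,
                              mass=0.5, steps=20001):
    """Grid approximation to a central credible interval for theta.

    Assumed likelihood: constant where theta is within half_width of both
    bubbles, zero elsewhere, so the posterior is the prior restricted to
    the overlap region and renormalized.
    """
    lo = max(y1, y2) - half_width   # left edge of the overlap region
    hi = min(y1, y2) + half_width   # right edge of the overlap region
    if lo >= hi:
        raise ValueError("bubbles are too far apart under the assumed model")
    if prior is None:
        prior = lambda t: 1.0       # flat prior
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    weights = [prior(t) for t in grid]
    total = sum(weights)
    tail = (1.0 - mass) / 2.0       # posterior mass left in each tail
    cumulative = 0.0
    lower = upper = None
    for t, w in zip(grid, weights):
        cumulative += w / total
        if lower is None and cumulative >= tail:
            lower = t
        if upper is None and cumulative >= 1.0 - tail:
            upper = t
    return lower, upper


if __name__ == "__main__":
    print(central_credible_interval(1.0, 2.0))                      # flat prior
    print(central_credible_interval(1.0, 2.0, prior=normal_prior))  # informative prior
```

With a flat prior the posterior is uniform over the overlap region, so the central 50 % interval is just the middle half of that region; an informative prior shifts and narrows it.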

Fig. 6

Building a confidence interval by inverting a significance test. A: Two noncentral F distributions, with true ω² = .1 (blue solid line) and true ω² = .2 (dashed gray line). When F(2,27)=5, the upper-tailed p values for these tests are .16 and .42, respectively. B: Two noncentral F distributions, with true ω² = .36 (red solid line) and true ω² = .2 (dashed gray line). When F(2,27)=5, the lower-tailed p values for these tests are .16 and .58, respectively.
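
The inversion logic in this caption (find the parameter values at which each one-tailed p value equals .16) can be sketched in code. To keep the example runnable with the Python standard library alone, the sketch swaps the noncentral F distribution for a normal model with known standard error; that substitution is an assumption made for illustration and is not the model in the figure. The function names and the observed values are hypothetical.

```python
from statistics import NormalDist


def upper_tailed_p(x_obs, mu0, se):
    """P(X >= x_obs) if the true mean were mu0 (normal model, known se)."""
    return 1.0 - NormalDist(mu0, se).cdf(x_obs)


def lower_tailed_p(x_obs, mu0, se):
    """P(X <= x_obs) if the true mean were mu0 (normal model, known se)."""
    return NormalDist(mu0, se).cdf(x_obs)


def solve(f, target, lo, hi, tol=1e-9):
    """Bisection for f(mu) == target, assuming f is monotone on [lo, hi]."""
    increasing = f(hi) > f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def invert_test(x_obs, se, tail_p=0.16, span=10.0):
    """Interval of means not rejected by either one-tailed test at tail_p.

    The lower bound is the mean whose upper-tailed p value equals tail_p;
    the upper bound is the mean whose lower-tailed p value equals tail_p.
    """
    lo, hi = x_obs - span * se, x_obs + span * se
    lower = solve(lambda m: upper_tailed_p(x_obs, m, se), tail_p, lo, hi)
    upper = solve(lambda m: lower_tailed_p(x_obs, m, se), tail_p, lo, hi)
    return lower, upper


if __name__ == "__main__":
    print(invert_test(x_obs=5.0, se=2.0))
```

With .16 left in each tail, the resulting interval has a confidence coefficient of 68 %, which matches the tail probabilities quoted in the caption.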

Fig. 7

Likelihoods, confidence intervals, and Bayesian credible intervals (highest posterior density, or HPD, intervals) for four hypothetical experimental results. In each figure, the top interval is Steiger’s (2004) confidence interval for ω²; the bottom interval is the Bayesian HPD. See text for details.
