Internal validation of predictive models: efficiency of some procedures for logistic regression analysis
E W Steyerberg et al. J Clin Epidemiol. 2001 Aug.
Abstract
The performance of a predictive model is overestimated when simply determined on the sample of subjects that was used to construct the model. Several internal validation methods are available that aim to provide a more accurate estimate of model performance in new subjects. We evaluated several variants of split-sample, cross-validation and bootstrapping methods with a logistic regression model that included eight predictors for 30-day mortality after an acute myocardial infarction. Random samples with a size between n = 572 and n = 9165 were drawn from a large data set (GUSTO-I; n = 40,830; 2851 deaths) to reflect modeling in data sets with between 5 and 80 events per variable. Independent performance was determined on the remaining subjects. Performance measures included discriminative ability, calibration and overall accuracy. We found that split-sample analyses gave overly pessimistic estimates of performance, with large variability. Cross-validation on 10% of the sample had low bias and low variability, but was not suitable for all performance measures. Internal validity could best be estimated with bootstrapping, which provided stable estimates with low bias. We conclude that split-sample validation is inefficient, and recommend bootstrapping for estimation of internal validity of a predictive logistic regression model.
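To illustrate the kind of bootstrap validation recommended above, the following is a minimal sketch of one common variant, the optimism-corrected bootstrap for the c-statistic (discriminative ability). It is not the authors' code: the abstract does not say which bootstrap variants were evaluated, the data are simulated rather than GUSTO-I, and the function names (`fit_logistic`, `c_statistic`, `bootstrap_corrected_c`, `simulate`), the three predictors and the gradient-ascent fitting routine are all assumptions made to keep the example self-contained.

```python
import math
import random


def predict_one(w, xi):
    """Predicted probability for one subject; w[0] is the intercept."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, iters=100, lr=1.0):
    """Fit logistic regression by gradient ascent on the mean log-likelihood.
    A deliberately simple stand-in for a real fitting routine (assumption)."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(iters):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_one(w, xi)
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def c_statistic(w, X, y):
    """Concordance (area under the ROC curve); ties count as one half."""
    preds = [predict_one(w, xi) for xi in X]
    pos = [p for p, yi in zip(preds, y) if yi == 1]
    neg = [p for p, yi in zip(preds, y) if yi == 0]
    conc = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return conc / (len(pos) * len(neg))


def bootstrap_corrected_c(X, y, n_boot=40, rng=None):
    """Return (apparent c, optimism-corrected c).
    Optimism per replicate = performance of the bootstrap model on its own
    bootstrap sample minus its performance on the original sample."""
    rng = rng or random.Random(0)
    n = len(X)
    apparent = c_statistic(fit_logistic(X, y), X, y)
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip resamples that contain only one outcome
            continue
        wb = fit_logistic(Xb, yb)
        optimism.append(c_statistic(wb, Xb, yb) - c_statistic(wb, X, y))
    if not optimism:
        return apparent, apparent
    return apparent, apparent - sum(optimism) / len(optimism)


def simulate(n, rng):
    """Simulated cohort (hypothetical): three standard-normal predictors,
    of which only the first two carry signal."""
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0, 1) for _ in range(3)]
        p = 1.0 / (1.0 + math.exp(-(-1.0 + 1.0 * xi[0] + 0.5 * xi[1])))
        X.append(xi)
        y.append(1 if rng.random() < p else 0)
    return X, y


if __name__ == "__main__":
    rng = random.Random(2001)
    X, y = simulate(150, rng)
    apparent, corrected = bootstrap_corrected_c(X, y, n_boot=40, rng=rng)
    print(f"apparent c = {apparent:.3f}, optimism-corrected c = {corrected:.3f}")
```

Unlike a split-sample analysis, this procedure uses every subject both to develop the model and to estimate its performance, which is one plausible reading of why the abstract calls split-sample validation inefficient. In practice the whole modeling strategy, including any variable selection, would be repeated inside each bootstrap replicate.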