Receiver Operating Characteristic Analysis of Biomarker Evaluation in Diagnostic Research

A Comparison of Parametric and Nonparametric Approaches to ROC Analysis of Quantitative Diagnostic Tests

Medical Decision Making, 1997

Receiver operating characteristic (ROC) analysis, which yields indices of accuracy such as the area under the curve (AUC), is increasingly being used to evaluate the performances of diagnostic tests that produce results on continuous scales. Both parametric and nonparametric ROC approaches are available to assess the discriminant capacity of such tests, but there are no clear guidelines as to the merits of each, particularly with non-binormal data. Investigators may worry that when data are non-Gaussian, estimates of diagnostic accuracy based on a binormal model may be distorted. The authors conducted a Monte Carlo simulation study to compare the bias and sampling variability in the estimates of the AUCs derived from parametric and nonparametric procedures. Each approach was assessed in data sets generated from various configurations of pairs of overlapping distributions; these included the binormal model and non-binormal pairs of distributions where one or both pair members were mixtures of Gaussian (MG) distributions with different degrees of departure from binormality. The biases in the estimates of the AUCs were found to be very small for both parametric and nonparametric procedures. The two approaches yielded very close estimates of the AUCs and of the corresponding sampling variability even when data were generated from non-binormal models. Thus, for a wide range of distributions, concern about bias or imprecision of the estimates of the AUC should not be a major factor in choosing between the nonparametric and parametric approaches.

Key words: ROC analysis; quantitative diagnostic test; comparison; parametric; binormal model; LABROC; nonparametric procedure; area under the curve (AUC). Med Decis Making 1997;17:94-102.

During the past ten years, receiver operating characteristic (ROC) analysis has become a popular method for evaluating the accuracy and performance of medical diagnostic tests. [1-3] The most attractive property of ROC analysis is that the accuracy indices derived from this technique are not distorted by fluctuations caused by the use of an arbitrarily chosen decision "criterion" or "cutoff." [4-8] One index available from an ROC analysis, the area under the curve (AUC), measures the ability of a diagnostic
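
The contrast the authors study can be illustrated with a small sketch on hypothetical simulated data: a rank-based (Mann-Whitney) estimate next to the parametric estimate under the binormal model, AUC = Phi(a / sqrt(1 + b^2)) with a = (mu1 - mu0)/sigma1 and b = sigma0/sigma1.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
diseased = rng.normal(1.0, 1.0, 500)  # simulated test values, diseased group
healthy = rng.normal(0.0, 1.0, 500)   # simulated test values, healthy group

def auc_nonparametric(x_dis, x_hea):
    """Mann-Whitney estimate of P(X > Y), ties counted as 1/2."""
    gt = (x_dis[:, None] > x_hea[None, :]).sum()
    eq = (x_dis[:, None] == x_hea[None, :]).sum()
    return (gt + 0.5 * eq) / (len(x_dis) * len(x_hea))

def auc_binormal(x_dis, x_hea):
    """Parametric estimate under the binormal model: Phi(a / sqrt(1 + b^2))."""
    a = (x_dis.mean() - x_hea.mean()) / x_dis.std(ddof=1)
    b = x_hea.std(ddof=1) / x_dis.std(ddof=1)
    z = a / math.sqrt(1.0 + b * b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

auc_np = auc_nonparametric(diseased, healthy)
auc_bn = auc_binormal(diseased, healthy)
```

With these population parameters the true AUC is Phi(1/sqrt(2)), about 0.76, and the two estimates typically agree closely, mirroring the paper's finding.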

Evaluation of strategies to combine multiple biomarkers in diagnostic testing

2012

A challenge in clinical medicine is that of correct diagnosis of disease. Medical researchers invest considerable time and effort to enhance accurate disease diagnosis. Diagnostic tests are important components in modern medical practice. The receiver operating characteristic (ROC) is a commonly used statistical tool for describing the discriminatory accuracy and performance of a diagnostic test. A popular summary index of discriminatory accuracy is the area under the ROC curve (AUC).
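
One classical strategy in the combination literature is a linear combination of markers in the Fisher-discriminant direction, which maximizes the AUC of the combined score under multivariate normality with a common covariance. A minimal sketch on purely illustrative simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
# two independent biomarkers, each only moderately discriminating on its own
healthy = rng.multivariate_normal([0.0, 0.0], np.eye(2), n)
diseased = rng.multivariate_normal([0.8, 0.8], np.eye(2), n)

def emp_auc(scores_dis, scores_hea):
    """Empirical AUC: proportion of diseased/healthy pairs ranked correctly."""
    return (scores_dis[:, None] > scores_hea[None, :]).mean()

# Fisher direction: w = pooled_covariance^{-1} (mean_diseased - mean_healthy)
pooled = 0.5 * (np.cov(healthy.T) + np.cov(diseased.T))
w = np.linalg.solve(pooled, diseased.mean(axis=0) - healthy.mean(axis=0))

auc_combined = emp_auc(diseased @ w, healthy @ w)
auc_single = emp_auc(diseased[:, 0], healthy[:, 0])
```

Here the combined score has true AUC about 0.79, against about 0.71 for either marker alone, which is the gain that motivates combining biomarkers.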

Generalized ROC curve inference for a biomarker subject to a limit of detection and measurement error

Statistics in Medicine, 2009

The receiver operating characteristic (ROC) curve is a tool commonly used to evaluate biomarker utility in clinical diagnosis of disease, especially during biomarker development research. Emerging biomarkers are often measured with random measurement error and subject to limits of detection that hinder their potential utility or mask an ability to discriminate by negatively biasing the estimates of ROC curves and subsequent area under the curve. Methods have been developed to correct the ROC curve for each of these types of sources of bias but here we develop a method by which the ROC curve is corrected for both simultaneously through replicate measures and maximum likelihood. Our method is evaluated via simulation study and applied to two potential discriminators of women with and without preeclampsia.
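
The downward bias from a limit of detection that the authors correct for can be reproduced in a toy simulation (hypothetical values): substituting the LOD for observations below it turns informative below-LOD comparisons into ties.

```python
import numpy as np

rng = np.random.default_rng(2)
diseased = rng.normal(1.0, 1.0, 1000)  # simulated biomarker, cases
healthy = rng.normal(0.0, 1.0, 1000)   # simulated biomarker, controls
lod = 1.0                              # assumed limit of detection

def auc_ties(x_dis, x_hea):
    """Mann-Whitney AUC with ties counted as 1/2."""
    gt = (x_dis[:, None] > x_hea[None, :]).mean()
    eq = (x_dis[:, None] == x_hea[None, :]).mean()
    return gt + 0.5 * eq

auc_full = auc_ties(diseased, healthy)
# left-censor both groups at the LOD (a common substitution practice)
auc_censored = auc_ties(np.maximum(diseased, lod), np.maximum(healthy, lod))
```

auc_censored falls below auc_full because every pair in which both subjects lie under the LOD is reduced to an uninformative tie, which is the negative bias described in the abstract.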

Biomarker selection for medical diagnosis using the partial area under the ROC curve

BMC Research Notes, 2014

Background: A biomarker is usually used as a diagnostic or assessment tool in medical research. Finding an ideal biomarker is not easy, and combining multiple biomarkers provides a promising alternative. Moreover, an optimal linear combination may still include biomarkers that lack sufficient discriminatory power. The aim of this study was therefore to identify the significant biomarkers within the optimal linear combination that maximizes the partial area under the ROC curve (pAUC).
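
A nonparametric sketch of the pAUC itself, using the identity that the pAUC over FPR <= t equals P(X > Y and Y above the (1 - t) quantile of the healthy distribution); the data are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
diseased = rng.normal(1.5, 1.0, 500)  # simulated biomarker, cases
healthy = rng.normal(0.0, 1.0, 500)   # simulated biomarker, controls

def partial_auc(x_dis, x_hea, fpr_max=0.2):
    """Empirical pAUC over the FPR range [0, fpr_max]."""
    q = np.quantile(x_hea, 1.0 - fpr_max)   # cutoff where FPR = fpr_max
    in_range = x_hea >= q                   # healthy values giving FPR <= fpr_max
    pairs = (x_dis[:, None] > x_hea[None, :]) & in_range[None, :]
    return pairs.mean()

pauc = partial_auc(diseased, healthy)  # lies in [0, 0.2]
pauc_std = pauc / 0.2                  # standardized to [0, 1] for interpretation
```

Restricting to a low-FPR band like this is what makes the pAUC attractive when only high-specificity operation is clinically acceptable.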

Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation

Caspian Journal of Internal Medicine, 2013

This review provides the basic principles and rationale for ROC analysis of rating and continuous diagnostic test results against a gold standard. The derived indices of accuracy, in particular the area under the curve (AUC), have a meaningful interpretation for classifying diseased versus healthy subjects. Methods for estimating and testing the AUC in single-test and comparative studies, the advantage of the ROC curve for determining optimal cutoff values, and the issues of bias and confounding are discussed.
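
The optimal-cutoff idea the review discusses is commonly operationalized with the Youden index, J = sensitivity + specificity - 1, maximized over candidate cutoffs; a minimal sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(4)
diseased = rng.normal(1.0, 1.0, 2000)  # simulated test results, diseased
healthy = rng.normal(0.0, 1.0, 2000)   # simulated test results, healthy

# candidate cutoffs: every observed value
cuts = np.unique(np.concatenate([diseased, healthy]))
sens = (diseased[:, None] >= cuts[None, :]).mean(axis=0)  # TPR at each cutoff
spec = (healthy[:, None] < cuts[None, :]).mean(axis=0)    # TNR at each cutoff
youden = sens + spec - 1.0

best = np.argmax(youden)
best_cut = cuts[best]  # empirical Youden-optimal cutoff
```

For two unit-variance Gaussians the population-optimal cutoff is the midpoint of the means (0.5 here); the empirical maximizer scatters around that value.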

COMPARING SEVERAL DIAGNOSTIC PROCEDURES USING THE INTRINSIC MEASURES OF ROC CURVE

Keywords: Diagnostic Procedure; ROC curve; AUC; Sensitivity

Comparison of diagnostic tests is essential in medicine. Test procedures for comparing two or more ROC curves are generally based on the measures d', the AUC, and the maximum likelihood estimates of binormal ROC curves. However, intrinsic measures such as sensitivity and specificity also play a pivotal role in assessing the performance of several diagnostic procedures. In this paper, a new methodology is proposed for comparing several diagnostic procedures using the intrinsic measures of the ROC curve.
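
Comparing procedures on an intrinsic measure typically means fixing one coordinate and comparing the other, for example sensitivity at a common specificity. A sketch with two hypothetical tests:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
healthy_a = rng.normal(0.0, 1.0, n)   # test A, controls
diseased_a = rng.normal(1.5, 1.0, n)  # test A, cases
healthy_b = rng.normal(0.0, 1.0, n)   # test B, controls
diseased_b = rng.normal(0.8, 1.0, n)  # test B, cases

def sens_at_spec(x_dis, x_hea, spec=0.9):
    """Sensitivity at the cutoff achieving the requested specificity."""
    cut = np.quantile(x_hea, spec)
    return (x_dis > cut).mean()

sens_a = sens_at_spec(diseased_a, healthy_a)
sens_b = sens_at_spec(diseased_b, healthy_b)
```

At a common 90% specificity, test A detects clearly more cases than test B here, which is the kind of direct intrinsic-measure comparison the paper advocates.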

A non-inferiority test for diagnostic accuracy based on the paired partial areas under ROC curves

Statistics in Medicine, 2008

Non-inferiority is a reasonable approach to assessing the diagnostic accuracy of a new diagnostic test if it is easier to administer or reduces costs. The area under the receiver operating characteristic (ROC) curve is one of the common measures for overall diagnostic accuracy. However, it may not differentiate the various shapes of ROC curves with different diagnostic significances. The partial area under the ROC curve (PAUROC) may present an alternative that provides additional and complementary information for diagnostic tests which require a false-positive rate that does not exceed a certain level. Non-parametric and maximum likelihood methods can be used for non-inferiority tests based on the difference in paired PAUROCs. However, their performance has not been investigated in finite samples. We propose to use the concept of the generalized p-value to construct a non-inferiority test for diagnostic accuracy based on the difference in paired PAUROCs. Simulation results show that the proposed non-inferiority test not only adequately controls the size at the nominal level but also is uniformly more powerful than the non-parametric methods. The proposed method is illustrated with a numerical example using published data.
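
The estimand here, a difference in paired pAUCs, can be sketched with a simple paired bootstrap (this is not the paper's generalized-p-value procedure, just an illustration on simulated paired scores):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
corr = [[1.0, 0.5], [0.5, 1.0]]
# paired design: every subject is scored by both tests (columns: test 1, test 2)
healthy = rng.multivariate_normal([0.0, 0.0], corr, n)
diseased = rng.multivariate_normal([1.4, 1.0], corr, n)

def pauc(x_dis, x_hea, fpr_max=0.2):
    """Empirical partial AUC over FPR in [0, fpr_max]."""
    q = np.quantile(x_hea, 1.0 - fpr_max)
    pairs = (x_dis[:, None] > x_hea[None, :]) & (x_hea >= q)[None, :]
    return pairs.mean()

def pauc_diff(dis, hea):
    return pauc(dis[:, 0], hea[:, 0]) - pauc(dis[:, 1], hea[:, 1])

observed = pauc_diff(diseased, healthy)
boot = []
for _ in range(500):
    # resample subjects, keeping each subject's pair of scores together
    d = diseased[rng.integers(0, n, n)]
    h = healthy[rng.integers(0, n, n)]
    boot.append(pauc_diff(d, h))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
```

A non-inferiority decision would then compare the lower confidence limit against a prespecified margin; the paper's contribution is an exact-style alternative to this kind of resampling.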

On the assessment of the added value of new predictive biomarkers

BMC Medical Research Methodology, 2013

Background: The surge in biomarker development calls for research on statistical evaluation methodology to rigorously assess emerging biomarkers and classification models. Recently, several authors reported the puzzling observation that, in assessing the added value of new biomarkers to existing ones in a logistic regression model, statistical significance of new predictor variables does not necessarily translate into a statistically significant increase in the area under the ROC curve (AUC). Vickers et al. concluded that this inconsistency is because AUC "has vastly inferior statistical properties," i.e., it is extremely conservative. This statement is based on simulations that misuse the DeLong et al. method. Our purpose is to provide a fair comparison of the likelihood ratio (LR) test and the Wald test versus diagnostic accuracy (AUC) tests. Discussion: We present a test to compare ideal AUCs of nested linear discriminant functions via an F test. We compare it with the LR test and the Wald test for the logistic regression model. The null hypotheses of these three tests are equivalent; however, the F test is an exact test whereas the LR test and the Wald test are asymptotic tests. Our simulation shows that the F test has the nominal type I error even with a small sample size. Our results also indicate that the LR test and the Wald test have inflated type I errors when the sample size is small, while the type I error converges to the nominal value asymptotically with increasing sample size as expected. We further show that the DeLong et al. method tests a different hypothesis and has the nominal type I error when it is used within its designed scope. Finally, we summarize the pros and cons of all four methods we consider in this paper. Summary: We show that there is nothing inherently less powerful or disagreeable about ROC analysis for showing the usefulness of new biomarkers or characterizing the performance of classification models. 
Each statistical method for assessing biomarkers and classification models has its own strengths and weaknesses. Investigators need to choose methods based on the assessment purpose, the biomarker development phase at which the assessment is being performed, the available patient data, and the validity of assumptions behind the methodologies.
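
The two regression-based tests being compared can be sketched with a minimal Newton-Raphson logistic fit in numpy (the data, effect sizes, and variable names are hypothetical):

```python
import math
import numpy as np

rng = np.random.default_rng(7)
n = 400
x_old = rng.normal(size=n)   # established biomarker
x_new = rng.normal(size=n)   # candidate biomarker with a real effect
eta = -0.5 + 1.0 * x_old + 0.8 * x_new
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit; returns coefficients, covariance, log-likelihood."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        info = Xd.T @ (Xd * (p * (1.0 - p))[:, None])  # Fisher information
        beta += np.linalg.solve(info, Xd.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-Xd @ beta))
    loglik = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return beta, np.linalg.inv(info), loglik

beta, cov, ll_full = fit_logistic(np.column_stack([x_old, x_new]), y)
_, _, ll_reduced = fit_logistic(x_old[:, None], y)

wald_z = beta[2] / math.sqrt(cov[2, 2])    # Wald test for the new biomarker
lr_stat = 2.0 * (ll_full - ll_reduced)     # LR statistic, chi-square with 1 df
p_lr = math.erfc(math.sqrt(lr_stat / 2.0)) # chi-square(1) tail probability
```

With a genuinely informative new biomarker, both tests reject here; the abstract's point is how their type I errors behave near the null in small samples, which this sketch does not exercise.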

Evaluating diagnostic tests: The area under the ROC curve and the balance of errors

Statistics in Medicine, 2010

Because accurate diagnosis lies at the heart of medicine, it is important to be able to evaluate the effectiveness of diagnostic tests. A variety of accuracy measures are used. One particularly widely used measure is the AUC, the area under the Receiver Operating Characteristic (ROC) curve. This measure has a well-understood weakness when comparing ROC curves which cross. However, it also has the more fundamental weakness of failing to balance different kinds of misdiagnosis effectively. This is not merely an aspect of the inevitable arbitrariness in choosing a performance measure, but is a core property of the way the AUC is defined. This property is explored, and an alternative, the H measure, is described.
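
The weakness can be made concrete with two binormal tests constructed to have the same AUC but crossing ROC curves (the constants are standard-normal quantiles, Phi^-1(0.8) ~ 0.8416 and Phi^-1(0.9) ~ 1.2816; the example is illustrative):

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z80, z90 = 0.8416, 1.2816  # Phi^-1(0.8), Phi^-1(0.9)

# healthy ~ N(0, 1) for both tests; diseased means/SDs chosen so AUC = 0.8
mu_a, sd_a = z80 * math.sqrt(2.0), 1.0  # test A: equal variances
mu_b, sd_b = z80 * math.sqrt(5.0), 2.0  # test B: larger diseased variance

auc_a = phi(mu_a / math.sqrt(1.0 + sd_a ** 2))
auc_b = phi(mu_b / math.sqrt(1.0 + sd_b ** 2))

# sensitivity at 90% specificity (cutoff = z90 for N(0, 1) healthy values)
sens_a = 1.0 - phi((z90 - mu_a) / sd_a)
sens_b = 1.0 - phi((z90 - mu_b) / sd_b)
```

Both tests have AUC 0.8, yet at 90% specificity test B detects noticeably more cases than test A: exactly the difference in the balance of errors that a single AUC value cannot reveal.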

Analysis of biomarker data: logs, odds ratios, and receiver operating characteristic curves

Current Opinion in HIV and AIDS, 2010

Purpose of review: We discuss two data analysis issues for studies that use binary clinical outcomes (whether or not an event occurred): choosing an appropriate scale and transformation when biomarkers are evaluated as explanatory factors in logistic regression, and assessing the ability of biomarkers to improve prediction accuracy for event risk.
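
On the transformation point, one useful contrast: a log transform of a skewed biomarker changes the logistic-regression fit, but the rank-based AUC is invariant to any monotone transform. An illustrative sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(8)
events = rng.lognormal(1.0, 0.5, 300)     # skewed biomarker, subjects with event
nonevents = rng.lognormal(0.5, 0.5, 300)  # subjects without event

def emp_auc(x_pos, x_neg):
    """Empirical AUC: proportion of event/non-event pairs ranked correctly."""
    return (x_pos[:, None] > x_neg[None, :]).mean()

auc_raw = emp_auc(events, nonevents)
auc_log = emp_auc(np.log(events), np.log(nonevents))  # log is monotone, so identical
```

This is why the choice of scale matters for odds ratios and model fit while leaving ROC-based discrimination summaries untouched.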