Estimating the reliability of repeatedly measured endpoints based on linear mixed-effects models. A tutorial
Related papers
Controlled Clinical Trials, 2004
Repeated measures are exploited to study reliability in the context of psychiatric health sciences. It is shown how test-retest reliability can be derived using linear mixed models when the scale is continuous or quasi-continuous. The advantage of this approach is that the full modeling power of mixed models can be used: repeated measures with a different mean structure can still be used to study reliability, correction for covariate effects is possible, and a complicated variance-covariance structure between measurements is allowed. In case the variance structure reduces to a random intercept (compound symmetry), classical methods are recovered. With more complex variance structures (e.g., including random slopes of time and/or serial correlation), time-dependent reliability functions are obtained. The methodology is motivated by and applied to data from five double-blind randomized clinical trials comparing the effects of risperidone to conventional antipsychotic agents for the treatment of chronic schizophrenia. Model assumptions are investigated through residual plots and by investigating the effect of influential observations.
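In the compound-symmetry (random-intercept) case mentioned in the abstract, test-retest reliability reduces to the classical intraclass correlation, the ratio of the between-subject variance to the total variance. A minimal sketch of that classical case, using simulated replicate measurements and one-way ANOVA moment estimators rather than the mixed-model fit described in the paper (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_subj, k = 200, 2                       # subjects and replicate measurements
sigma_b, sigma_e = 2.0, 1.0              # true-score and error SDs (illustrative)
b = rng.normal(0.0, sigma_b, n_subj)     # subject true scores
y = b[:, None] + rng.normal(0.0, sigma_e, (n_subj, k))  # observed replicates

# One-way random-effects ANOVA moment estimators of the variance components
grand = y.mean()
msb = k * ((y.mean(axis=1) - grand) ** 2).sum() / (n_subj - 1)
msw = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (n_subj * (k - 1))
icc = (msb - msw) / (msb + (k - 1) * msw)  # estimates sigma_b^2 / (sigma_b^2 + sigma_e^2)
print(round(icc, 3))                       # close to the true value 4 / (4 + 1) = 0.8
```

A mixed-model fit (e.g. REML) would estimate the same two variance components and allow covariates in the mean structure, which is the generalization the paper develops.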
A Measure for the Reliability of a Rating Scale Based on Longitudinal Clinical Trial Data
Psychometrika, 2007
A new measure for reliability of a rating scale is introduced, based on the classical definition of reliability, as the ratio of the true score variance and the total variance. Clinical trial data can be employed to estimate the reliability of the scale in use, whenever repeated measurements are taken. The reliability is estimated from the covariance parameters obtained from a linear mixed model. The method provides a single number to express the reliability of the scale, but allows for the study of the reliability’s time evolution. The method is illustrated using a case study in schizophrenia.
Generalized reliability estimation using repeated measurements
British Journal of Mathematical and Statistical Psychology, 2006
Reliability can be studied in a generalized way using repeated measurements. Linear mixed models are used to derive generalized test-retest reliability measures. The method allows for repeated measures with a different mean structure due to correction for covariate effects. Furthermore, different variance-covariance structures between measurements can be implemented. When the variance structure reduces to a random intercept (compound symmetry), classical methods are recovered. With more complex variance structures (e.g. including random slopes of time and/or serial correlation), time-dependent reliability functions are obtained. The effect of time lag between measurements on reliability estimates can be evaluated. The methodology is applied to a psychiatric scale for schizophrenia.
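With a random intercept and a random slope of time, the true-score variance at time t becomes z(t)' D z(t) with z(t) = (1, t) and D the random-effects covariance matrix, which yields a time-dependent reliability function R(t) = z(t)' D z(t) / (z(t)' D z(t) + sigma^2). A small numerical sketch with hypothetical covariance parameters (not values from the paper):

```python
import numpy as np

# Hypothetical covariance matrix D of (random intercept, random slope)
# and residual variance -- illustrative values only, not fitted estimates
D = np.array([[4.0, 0.5],
              [0.5, 0.25]])
sigma_e2 = 1.0

def reliability(t):
    z = np.array([1.0, t])       # random-effects design vector at time t
    true_var = z @ D @ z         # true-score variance z' D z
    return true_var / (true_var + sigma_e2)

for t in (0.0, 1.0, 2.0):
    print(t, round(reliability(t), 3))  # 0.8, 0.84, 0.875
```

Here reliability increases with time because the random slope adds between-subject variance at later occasions; with other values of D the function can just as well decrease.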
Marginal Correlation in Longitudinal Binary Data Based on Generalized Linear Mixed Models
Communications in Statistics - Theory and Methods, 2010
This work aims at investigating marginal correlation within and between longitudinal data sequences. Useful and intuitive approximate expressions are derived based on generalized linear mixed models. Data from four double-blind randomized clinical trials are used to estimate the intra-class coefficient of reliability for a binary response. Additionally, the correlation between such a binary response and a continuous response is derived to evaluate the criterion validity of the binary response variable and the established continuous response variable.
Statistics in Medicine, 1999
Subjects often drop out of longitudinal studies prematurely, yielding unbalanced data with unequal numbers of measures for each subject. A simple and convenient approach to analysis is to develop summary measures for each individual and then regress the summary measures on between-subject covariates. We examine properties of this approach in the context of the linear mixed effects model when the data are not missing completely at random, in the sense that drop-out depends on the values of the repeated measures after conditioning on fixed covariates. The approach is compared with likelihood-based approaches that model the vector of repeated measures for each individual. Methods are compared by simulation for the case where repeated measures over time are linear and can be summarized by a slope and intercept for each individual. Our simulations suggest that summary measures analysis based on the slopes alone is comparable to full maximum likelihood when the data are missing completely at random but is markedly inferior when the data are not missing completely at random. Analysis discarding the incomplete cases is even worse, with large biases and very poor confidence coverage.
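The two-stage summary-measures approach the abstract describes can be sketched in a few lines: fit an OLS slope per subject, then relate the slopes to a between-subject covariate. This is a simulation under complete, balanced data (the benign case for the method), with all parameter values illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, times = 100, np.arange(5.0)                    # subjects, measurement occasions
group = rng.integers(0, 2, n)                     # between-subject covariate
true_slopes = 1.0 + 0.5 * group + rng.normal(0, 0.2, n)
intercepts = rng.normal(10.0, 1.0, n)
y = (intercepts[:, None] + true_slopes[:, None] * times
     + rng.normal(0, 0.5, (n, times.size)))

# Stage 1: summarize each subject's profile by an OLS slope
slopes_hat = np.array([np.polyfit(times, yi, 1)[0] for yi in y])

# Stage 2: relate the summary measure to the between-subject covariate
effect = slopes_hat[group == 1].mean() - slopes_hat[group == 0].mean()
print(round(effect, 2))  # near the true group difference of 0.5
```

Under informative drop-out, later observations would be missing for subjects with extreme trajectories, biasing the stage-1 slopes; that is the failure mode the paper's simulations quantify.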
A family of measures to evaluate scale reliability in a longitudinal setting
Journal of The Royal Statistical Society Series A-statistics in Society, 2009
The concept of reliability denotes one of the most important psychometric properties of a measurement scale. Reliability refers to the capacity of the scale to discriminate between subjects in a given population. In classical test theory, it is often estimated using the intraclass correlation coefficient based on two replicate measurements. However, the modelling framework used in this theory is often too narrow when applied in practical situations. Generalizability theory has extended reliability theory to a much broader framework, but is confronted with some limitations when applied in a longitudinal setting. In this paper, we explore how the definition of reliability can be generalized to a setting where subjects are measured repeatedly over time. Based on four defining properties for the concept of reliability, we propose a family of reliability measures, which circumscribes the area in which reliability measures should be sought. It is shown how different members assess different aspects of the problem and that the reliability of the instrument can depend on the way it is used. The methodology is motivated by and illustrated on data from a clinical study on schizophrenia. Based on this study, we estimate and compare the reliabilities of two different rating scales to evaluate the severity of the disorder.
The Estimation of Reliability in Longitudinal Models
International Journal of Behavioral Development, 1998
Despite the increasing attention devoted to the study and analysis of longitudinal data, relatively little consideration has been directed toward understanding the issues of reliability and measurement error. Perhaps one reason for this neglect has been that traditional methods of estimation (e.g. generalisability theory) require assumptions that are often not tenable in longitudinal designs. This paper first examines applications of generalisability theory to the estimation of measurement error and reliability in longitudinal research, and notes how factors such as missing data, correlated errors, and true score instability prohibit traditional variance component estimation. Next, we discuss how estimation methods using restricted maximum likelihood can account for these factors, thereby providing many advantages over traditional estimation methods. Finally, we provide a substantive example illustrating these advantages, and include brief discussions of programming and software.
The multilevel approach to repeated measures for complete and incomplete data
Quality and Quantity, 2003
Repeated measurements are often analyzed by multivariate analysis of variance (MANOVA). An alternative approach is provided by multilevel analysis, also called the hierarchical linear model (HLM), which makes use of random coefficient models. This paper is a tutorial which shows that the HLM can be specified in many different ways, corresponding to different sets of assumptions about the covariance matrix of the repeated measurements. The possible assumptions range from the very restrictive compound symmetry model to the unrestricted multivariate model. Thus, the HLM can be used to steer a useful middle road between the two traditional methods for analyzing repeated measurements. Another important advantage of the multilevel approach to repeated measures is that it can easily be used even when the data are incomplete. It thus provides a way to achieve a fully multivariate analysis of repeated measures with incomplete data.
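The two endpoints of the covariance-assumption spectrum described above can be written down explicitly. Compound symmetry implies one common variance and one common covariance across all occasions, while the unrestricted model estimates every entry freely. A quick illustration with made-up variance components:

```python
import numpy as np

sigma_b2, sigma_e2, k = 4.0, 1.0, 3   # illustrative variance components, 3 occasions

# Compound symmetry: equal variances on the diagonal, equal covariances at all lags
cs = sigma_b2 * np.ones((k, k)) + sigma_e2 * np.eye(k)
print(cs)
# [[5. 4. 4.]
#  [4. 5. 4.]
#  [4. 4. 5.]]
```

This structure needs only two parameters; an unstructured model for the same data would estimate all k*(k+1)/2 = 6 entries, and intermediate HLM specifications (random slopes, autoregressive errors) sit between those extremes.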
Reliability Measurements: Methods and Estimation in Healthcare Research
European Chemical Bulletin, 2023
Reliability refers to the extent to which a test, process, or instrument produces similar results under similar conditions. Reliability is crucial for tests intended to be stable over time. Although reliability cannot be determined exactly, various techniques exist to assess it. This article focuses on methods for computing reliability for quantitative data, including ratio and interval data. The primary purpose of this paper is to discuss the idea of reliability and to present the calculation of reliability for commonly used research instruments in simple language, with examples. The article presents methods and measures of statistical reliability, including stability, internal consistency, and equivalence. The authors estimated the stability of the instrument with Karl Pearson's coefficient of correlation using the test-retest method. Internal consistency of the instrument was estimated by the Spearman-Brown prophecy, Kuder-Richardson 20, Kuder-Richardson 21, and Cronbach's alpha formulas. The Cohen kappa and Fleiss kappa coefficients estimated the equivalence of the instrument. It is concluded that young and inexperienced researchers should know the significance of reliability, its measures, and how to ascertain it correctly. A greater understanding of score reliability will help them avoid misunderstandings and write and discuss reliability estimates with due caution.
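Two of the measures listed above, Cronbach's alpha (internal consistency) and the test-retest Pearson correlation (stability), are short computations. A sketch on simulated parallel items, with all sample sizes and variances illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 150, 4                                 # respondents and items (illustrative)
trait = rng.normal(0.0, 1.0, n)               # latent true score
items = trait[:, None] + rng.normal(0.0, 0.8, (n, k))  # k parallel items

# Internal consistency: Cronbach's alpha from item and total-score variances
item_vars = items.var(axis=0, ddof=1)
total_var = items.sum(axis=1).var(ddof=1)
alpha = k / (k - 1) * (1.0 - item_vars.sum() / total_var)

# Stability: Pearson test-retest correlation between total scores
# from two administrations of the same instrument
retest = trait[:, None] + rng.normal(0.0, 0.8, (n, k))
r = np.corrcoef(items.sum(axis=1), retest.sum(axis=1))[0, 1]
print(round(alpha, 2), round(r, 2))
```

For parallel items both quantities target the same population reliability, which is why alpha and the test-retest correlation come out close here; with non-parallel items or true-score change over time they diverge.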
Reliability of a Longitudinal Sequence of Scale Ratings
Psychometrika, 2009
Reliability captures the influence of error on a measurement and, in the classical setting, is defined as one minus the ratio of the error variance to the total variance. Laenen, Alonso, and Molenberghs (Psychometrika 73:443-448, 2007) proposed an axiomatic definition of reliability and introduced the R_T coefficient, a measure of reliability extending the classical approach to a more general longitudinal scenario. The R_T coefficient can be interpreted as the average reliability over different time points and can also be calculated for each time point separately. In this paper, we introduce a new and complementary measure, the so-called R_Λ, which implies a new way of thinking about reliability. In a longitudinal context, each measurement brings additional knowledge and leads to more reliable information. The R_Λ captures this intuitive idea and expresses the reliability of the entire longitudinal sequence, in contrast to an average or occasion-specific measure. We study the measure's properties using both theoretical arguments and simulations, establish its connections with previous proposals, and elucidate its performance in a real case study.