A comparison of imputation methods in a longitudinal randomized clinical trial

Missing data in longitudinal studies: Comparison of multiple imputation methods in a real clinical setting

Journal of Evaluation in Clinical Practice, 2020

Rationale, aims and objectives. Missing data represent a challenge in longitudinal studies. The aim of the study is to compare the performance of the multivariate normal imputation and the fully conditional specification methods, using a real dataset with missing data that were partially completed two years later. Method. The data used came from an ongoing randomized controlled trial with five-year follow-up. At a certain time, we observed a number of patients with missing data and a number of patients whose data were unobserved because they were not yet eligible for a given follow-up. Both unobserved and missing data were imputed. The imputed unobserved data were compared with the corresponding real information obtained two years later. Results. Both imputation methods showed similar performance on the accuracy measures and produced minimally biased estimates. Conclusion. Despite the large number of repeated measures with intermittent missing data and the non-normal multivariate distribution of the data, both methods performed well, and it was not possible to determine which was better.

Estimating treatment effects from longitudinal clinical trial data with missing values: comparative analyses using different methods

Psychiatry Research, 2004

The selection of a method for estimating treatment effects in an intent-to-treat analysis from clinical trial data with missing values often depends on the field of practice. The last observation carried forward (LOCF) analysis assumes that the responses do not change after dropout. Such an assumption is often unrealistic. Analysis with completers only requires that missing values occur completely at random (MCAR). Ignorable maximum likelihood (IML) and multiple imputation (MI) methods require that data are missing at random (MAR). We applied these four methods to a randomized clinical trial comparing anti-depressant effects in an elderly depressed group of patients using a mixed model to describe the course of the treatment effects. Results from an explanatory approach showed a significant difference between the treatments using LOCF and IML methods. Statistical tests indicate violation of the MCAR assumption favoring the flexible IML and MI methods. IML and MI methods were repeated under the pragmatic approach, using data collected after termination of protocol treatment and compared with previously reported results using piecewise splines and rescue (treatment adjustment) pragmatic analysis. No significant treatment differences were found. We conclude that attention to the missing-data mechanism should be an integral part in analysis of clinical trial data.
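The LOCF assumption described above (responses do not change after dropout) can be made concrete with a minimal sketch. The function name `locf` and the toy scores are hypothetical and are not taken from the trial; missing visits are represented as `None`.

```python
from typing import List, Optional


def locf(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing visit (None) with the last observed value.

    Visits before the first observation stay None, because there is
    nothing to carry forward yet.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None  # most recent observed value so far
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical depression scores for one patient who drops out after visit 3.
scores = [24.0, 20.0, 17.0, None, None]
print(locf(scores))  # [24.0, 20.0, 17.0, 17.0, 17.0]
```

The filled trajectory is flat after dropout, which is exactly the assumption the abstract calls often unrealistic.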

Accuracy versus convenience: A simulation-based comparison of two continuous imputation models for incomplete ordinal longitudinal clinical trials data

Statistics and Its Interface, 2009

Multiple imputation has become an increasingly utilized principled tool in dealing with incomplete data in recent years, and reasons for its popularity are well documented. In this work, we compare the performances of two continuous imputation models via simulated examples that mimic the characteristics of a real data set from psychiatric research. The two imputation approaches under consideration are based on multivariate normality and linear mixed-effects models. Our research goal is oriented towards identifying the relative performances of these methods in the context of continuous as well as ordinalized versions of a clinical trials data set in a longitudinal setting. Our results appear to be only marginally different across these two methods, which motivates our recommendation that practitioners who are not computationally sophisticated enough to utilize more appropriate imputation techniques may resort to the simpler normal imputation method under ignorability when the fraction of missing information is relatively small.

On comparative performance of multiple imputation methods for moderate to large proportions of missing data in clinical trials: a simulation study

Background: A longitudinal clinical trial takes measurements at successive occasions, and the unavailability of a patient at a scheduled visit causes missingness in the expected full sequence of measurements. Missing data are a major concern during the conduct of a clinical trial. It has been noted that missing data are not handled properly during the final analysis, which may considerably bias the results of the analysis, reduce the power of the study and lead to invalid conclusions. A promising approach to handle this problem is to impute the missing values. Methods: Multiple imputation (MI) methods provide a useful strategy for dealing with data sets with missing values, in which missing values are filled in with estimates and the resulting data sets are analyzed by complete-data methods. Statistical methods to address missingness have been actively pursued in recent years. This paper describes missing data mechanisms and various imputation techniques for missing data analysis in longitudinal clinical trials. Further, the appropriateness of multiple imputation methods is discussed under moderate to large proportions of missingness in simulated clinical trial data, by comparing various performance measures derived through an intensive simulation procedure. Results: For moderate proportions (~20% and 30%) of missingness, the MI-regression method achieved the minimum bias and MSE as the sample size increased. However, the other methods did not improve much despite the increased sample size. For a large proportion (50%) of missing data, the MI-regression and MI-propensity score methods were close in performance, but the MI-regression method performed significantly well with an increased number of subjects in the dataset. Conclusions: The present investigation showed that the MI-regression method is the most appropriate for the analysis of data in the presence of missingness, given the sample sizes and missingness mechanism discussed. Overall, the study findings will help researchers with limited knowledge of statistical methodology to choose a multiple imputation method, so that the resulting estimates will be more precise. Keywords: Missing data, missing mechanism, longitudinal data, multiple imputation
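The bias and MSE used as performance measures in simulation studies of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names and the toy estimates are hypothetical, and `truth` stands for the known parameter value used to generate the simulated data.

```python
from statistics import mean
from typing import Sequence


def bias(estimates: Sequence[float], truth: float) -> float:
    """Average estimate across simulation replicates minus the true value."""
    return mean(estimates) - truth


def mse(estimates: Sequence[float], truth: float) -> float:
    """Mean squared distance between each replicate's estimate and the truth."""
    return mean([(e - truth) ** 2 for e in estimates])


# Hypothetical treatment-effect estimates from three simulated trials,
# each analysed after imputing its missing values.
replicate_estimates = [1.0, 2.0, 3.0]
true_effect = 2.0
print(bias(replicate_estimates, true_effect))
print(mse(replicate_estimates, true_effect))
```

A method can have zero bias and still a large MSE if its estimates vary widely across replicates, which is why both measures are usually reported together.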

A review of the handling of missing longitudinal outcome data in clinical trials

Trials, 2014

The aim of this review was to establish the frequency with which trials take into account missingness, and to discover what methods trialists use for adjustment in randomised controlled trials with longitudinal measurements. Failing to address the problems that can arise from missing outcome data can result in misleading conclusions. Missing data should be addressed as a means of a sensitivity analysis of the complete case analysis results. One hundred publications of randomised controlled trials with longitudinal measurements were selected randomly from trial publications from the years 2005 to 2012. Information was extracted from these trials, including whether reasons for dropout were reported, what methods were used for handling the missing data, whether there was any explanation of the methods for missing data handling, and whether a statistician was involved in the analysis. The main focus of the review was on missing data post dropout rather than missing interim data. Of all the papers in the study, 9 (9%) had no missing data. More than half of the papers included in the study failed to make any attempt to explain the reasons for their choice of missing data handling method. Of the papers with clear missing data handling methods, 44 papers (50%) used adequate methods of missing data handling, whereas 30 (34%) of the papers used missing data methods which may not have been appropriate. In the remaining 17 papers (19%), it was difficult to assess the validity of the methods used. An imputation method was used in 18 papers (20%). Multiple imputation methods were introduced in 1987 and are an efficient way of accounting for missing data in general, and yet only 4 papers used these methods. Out of the 18 papers which used imputation, only 7 displayed the results as a sensitivity analysis of the complete case analysis results. 61% of the papers that used an imputation explained the reasons for their chosen method. Just under a third of the papers made no reference to reasons for missing outcome data. There was little consistency in reporting of missing data within longitudinal trials.

A Multiple-Imputation-Based Approach to Sensitivity Analyses and Effectiveness Assessments in Longitudinal Clinical Trials

Journal of Biopharmaceutical Statistics, 2014

It is important to understand the effects of a drug as actually taken (effectiveness) and when taken as directed (efficacy). The primary objective of this investigation was to assess the statistical performance of a method referred to as placebo multiple imputation (pMI) as an estimator of effectiveness and as a worst reasonable case sensitivity analysis in assessing efficacy. The pMI method assumes that the statistical behavior of placebo- and drug-treated patients after dropout is the statistical behavior of placebo-treated patients. Thus, in the effectiveness context pMI assumes no pharmacological benefit of the drug after dropout. In the efficacy context pMI is a specific form of a missing not at random analysis expected to yield a conservative estimate of efficacy. In a simulation study with 18 scenarios the pMI approach generally provided unbiased estimates of effectiveness and conservative estimates of efficacy. However, the confidence interval coverage was consistently greater than the nominal coverage rate. In contrast, LOCF and BOCF were conservative in some scenarios and anti-conservative in others with respect to efficacy and effectiveness. As expected, direct likelihood (DL) and standard multiple imputation (MI) yielded unbiased estimates of efficacy and tended to overestimate effectiveness in those scenarios where a drug effect existed. However, in scenarios with no drug effect, where the true values for both efficacy and effectiveness were therefore zero, DL and MI yielded unbiased estimates of efficacy and effectiveness.

Multiple imputation validation study: addressing unmeasured survey data in a longitudinal design

BMC Medical Research Methodology, 2021

Background: Questionnaires used in longitudinal studies may have questions added or removed over time for numerous reasons. Data missing completely at a follow-up survey is a unique issue for longitudinal studies. While such excluded questions lack information at one follow-up survey, they are collected at other follow-up surveys, and covariances observed at other follow-up surveys may allow for the recovery of the missing data. This study utilized data from a large longitudinal cohort study to assess the efficiency and feasibility of using multiple imputation (MI) to recover this type of information. Methods: Millennium Cohort Study participants completed the 9-item Patient Health Questionnaire (PHQ) depression module at 2 time points (2004, 2007). The suicidal ideation item in the module was set to missing for the 2007 assessment. Several single-level MI models using different sets of predictors and forms of suicidal ideation were used to compare self-reported values and imputed val...

Analysis of longitudinal binary outcomes in clinical trials with low percentage of missing values

In interventional or observational longitudinal studies, missing values are one of the main issues that should be investigated. The researcher's main concerns are the impact of missing data on the final results of the study and the appropriate methods for handling missing values. Different methods of analysis have been proposed, depending on the role and scale of the variable in which missing values have occurred and on the structure of the missing values. In this article, the impact of missing values on a binary response variable in a longitudinal clinical trial with three follow-up sessions has been investigated. Propensity score, predictive model-based and Mahalanobis imputation strategies, together with complete case and available data methods, have been used to deal with missing values in this study. Three models (random intercept, marginal GEE and marginalized random effects models) were implemented to evaluate the effect of covariates. The percentage of missing responses in each of the treatment groups, throughout the course of the study, ranged from 6.8 to 14.1. Although the estimates of the variance component in the random intercept and marginalized random effects models were highly significant (p < 0.001), the same results were obtained for the effect of the independent variables on the response variable with the different imputation strategies. In our study, given the low percentage of missing values, there were no considerable differences between the different methods used for handling missing data.

A New Imputation Strategy for Incomplete Longitudinal Data

International Journal of Statistics and Applications, 2013

Longitudinal studies are very common in public health and medical sciences. Missing values are not uncommon in longitudinal studies. Ignoring the missing values in the analysis of longitudinal data leads to biased estimates. Valid inference about longitudinal data must incorporate the missing data model into the analysis. Several approaches have been proposed to obtain valid inference in the presence of missing values. One of these approaches is imputation. Imputation techniques range from single imputation (the missing value is imputed by a single observation) to multiple imputation, where the missing value is imputed by a fixed number of observations. In this article we propose a new imputation strategy to handle missing values in longitudinal data. The new strategy depends on imputing the missing values with donors from the observed values. The donors represent quantiles of the observed data. This imputation strategy is applicable if the missing data mechanism is missing not at random. The proposed technique is applied to a real data set from an antidepressant clinical trial. Also, a simulation study is conducted to evaluate the proposed strategy.

Using multiple imputation to incorporate cases with missing items in a mental health services study

Health Services and Outcomes Research Methodology, 2000

When data analysis tools require that every variable be observed on each case, then missing items on a subset of variables force investigators either to leave potentially interesting variables out of analysis models or to include these variables but drop incomplete cases from the analysis. For example, in a study considered here, mental health patients were interviewed at two time points about a variety of topics that reflect successful adaptation to outpatient treatment, such as support from family and friends and avoidance of legal problems, although not all patients were successfully interviewed at the second time point. In a previous analysis of these data, logistic regression models were developed to relate baseline patient characteristics and recent treatment cost history to binary outcomes capturing aspects of adaptation. In these models, years of education was omitted as a covariate because it was incompletely observed at baseline. Here, we carry out analyses that include information from partially observed cases. Specifically, we use a multivariate model to produce multiple plausible imputed values for each missing item, and we combine results from separate logistic regression analyses on the completed data sets using the multiple imputation inference technique. Although the majority of inferences about specific regression coefficients paralleled those from the original study, some differences are noted. We discuss the implications of having flexible analysis tools for incomplete data in health services research and comment on issues related to model choice.
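Combining results from separate analyses of the completed data sets is commonly done with Rubin's rules; the abstract does not spell out its formulas, so the following is a minimal sketch under that assumption. The function name `pool_rubin` and the toy coefficients are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one coefficient across m >= 2 imputed data sets.

    estimates: the coefficient estimated in each completed data set.
    variances: its squared standard error in each completed data set.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log-odds ratios from logistic regressions on three imputed data sets.
coefs = [0.5, 0.7, 0.6]
squared_ses = [0.04, 0.05, 0.045]
pooled, total_var = pool_rubin(coefs, squared_ses)
print(pooled, total_var ** 0.5)  # pooled coefficient and its standard error
```

The between-imputation term is what distinguishes this from simply averaging standard errors: it inflates the total variance to reflect uncertainty about the missing values themselves.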