MODELLING LONGITUDINAL COUNT DATA WITH NON-IGNORABLE DROPOUTS. AN APPLICATION TO CD4+ COUNT IN HIV-INFECTED PATIENTS

Handling drop-out in longitudinal studies

Statistics in Medicine, 2004

Drop-out is a prevalent complication in the analysis of data from longitudinal studies, and remains an active area of research for statisticians and other quantitative methodologists. This tutorial is designed to synthesize and illustrate the broad array of techniques that are used to address outcome-related dropout, with emphasis on regression-based methods. We begin with a review of important assumptions underlying likelihood-based and semi-parametric models, followed by an overview of models and methods used to draw inferences from incomplete longitudinal data. The majority of the tutorial is devoted to detailed analysis of two studies with substantial rates of drop-out, designed to illustrate the use of effective methods that are relatively easy to apply: in the first example, we use both semi-parametric and fully parametric models to analyse repeated binary responses from a clinical trial of smoking cessation interventions; in the second, pattern mixture models are used to analyse longitudinal CD4 counts from an observational cohort study of HIV-infected women. In each example, we describe exploratory analyses, model formulation, estimation methodology and interpretation of results. Analysis of incomplete data requires making unverifiable assumptions, and these are discussed in detail within the context of each application. Relevant SAS code is provided.

Accounting for dropout reason in longitudinal studies with nonignorable dropout

Statistical methods in medical research, 2015

Dropout is a common problem in longitudinal cohort studies and clinical trials, often raising concerns of nonignorable dropout. Selection, frailty, and mixture models have been proposed to account for potentially nonignorable missingness by relating the longitudinal outcome to time of dropout. In addition, many longitudinal studies encounter multiple types of missing data or reasons for dropout, such as loss to follow-up, disease progression, treatment modifications and death. When clinically distinct dropout reasons are present, it may be preferable to control for both dropout reason and time to gain additional clinical insights. This may be especially interesting when the dropout reason and dropout times differ by the primary exposure variable. We extend a semi-parametric varying-coefficient method for nonignorable dropout to accommodate dropout reason. We apply our method to untreated HIV-infected subjects recruited to the Acute Infection and Early Disease Research Program HIV co...

Impact of missing data due to drop-outs on estimators for rates of change in longitudinal studies: a simulation study

Statistics in Medicine, 2001

Many cohort studies and clinical trials are designed to compare rates of change over time in one or more disease markers in several groups. One major problem in such longitudinal studies is missing data due to patient drop-out. The bias and efficiency of six different methods to estimate rates of change in longitudinal studies with incomplete observations were compared: generalized estimating equation estimates (GEE) proposed by ; unweighted average of ordinary least squares estimates (OLSE) of individual rates of change (UWLS); weighted average of OLSE (WLS); conditional linear model estimates (CLE), a covariate-type estimate proposed by ; random effect (RE), and joint multivariate RE (JMRE) estimates. The latter method combines a linear RE model for the underlying pattern of the marker with a log-normal survival model for the informative drop-out process. The performance of these methods in the presence of data missing completely at random (MCAR), missing at random (MAR) and non-ignorably missing (NIM) was compared in simulation studies. Data for the disease marker were generated under the linear random effects model with parameter values derived from realistic examples in HIV infection. Rates of drop-out, assumed to increase over time, were allowed to be independent of marker values or to depend either only on previous marker values or on both previous and current marker values. Under MCAR all six methods yielded unbiased estimates of both group mean rates and between-group difference. However, the cross-sectional view of the data in the GEE method resulted in seriously biased estimates under MAR and NIM drop-out processes. The bias in the estimates ranged from 30 per cent to 50 per cent. The degree of bias in the GEE estimates increases with the severity of non-randomness and with the proportion of MAR data. Under MCAR and MAR all the other five methods performed relatively well. RE and JMRE estimates were more efficient (that is, had smaller variance) than UWLS, WLS and CL estimates. Under NIM, WLS and particularly RE estimates tended to underestimate the average rate of marker change (bias ≈ 10 per cent). Under NIM, UWLS, CL and JMRE performed better in terms of bias (3-5 per cent) with the JMRE giving the most efficient estimates. Given that markers are key variables related to disease progression, missing marker data are likely to be at least MAR. Thus, the GEE method may not be appropriate for analysing
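
The simulation design described in this abstract can be illustrated with a small sketch. The Python below is not the authors' code: the function names (`simulate_marker`, `uwls_slope`), the logistic drop-out model and every parameter value are assumptions chosen for illustration only. It generates a marker from a linear random effects model, applies monotone drop-out that is independent of the marker (MCAR), depends on the previous value (MAR) or depends on the current, possibly unobserved value (NIM), and then computes the unweighted average of per-subject OLS slopes (UWLS).

```python
import numpy as np

def simulate_marker(n_subjects, times, rng, mechanism="MCAR",
                    beta0=6.0, beta1=-0.5, sd_int=1.0, sd_slope=0.3,
                    sd_err=0.5, base_logit=-2.0, gamma=1.0):
    """Simulate a linear random effects marker with monotone drop-out.

    All default values are hypothetical. Returns (y, observed), both of
    shape (n_subjects, len(times)); y holds the full data, observed flags
    which entries would actually be seen.
    """
    times = np.asarray(times, dtype=float)
    intercepts = beta0 + rng.normal(0.0, sd_int, n_subjects)
    slopes = beta1 + rng.normal(0.0, sd_slope, n_subjects)
    noise = rng.normal(0.0, sd_err, (n_subjects, len(times)))
    y = intercepts[:, None] + slopes[:, None] * times[None, :] + noise

    observed = np.ones_like(y, dtype=bool)  # first visit is always observed
    for j in range(1, len(times)):
        if mechanism == "MCAR":
            driver = np.zeros(n_subjects)      # unrelated to the marker
        elif mechanism == "MAR":
            driver = beta0 - y[:, j - 1]       # previous (observed) value
        elif mechanism == "NIM":
            driver = beta0 - y[:, j]           # current (unobserved) value
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        p_drop = 1.0 / (1.0 + np.exp(-(base_logit + gamma * driver)))
        drops = rng.random(n_subjects) < p_drop
        # Monotone drop-out: a subject who has left never returns.
        observed[:, j] = observed[:, j - 1] & ~drops
    return y, observed

def uwls_slope(y, observed, times):
    """Unweighted mean of per-subject OLS slopes (subjects with >= 2 visits)."""
    times = np.asarray(times, dtype=float)
    slopes = [np.polyfit(times[o], yi[o], 1)[0]
              for yi, o in zip(y, observed) if o.sum() >= 2]
    return float(np.mean(slopes))

TIMES = [0, 1, 2, 3, 4]
estimates = {}
for mech in ("MCAR", "MAR", "NIM"):
    rng = np.random.default_rng(0)
    y, observed = simulate_marker(2000, TIMES, rng, mechanism=mech)
    estimates[mech] = uwls_slope(y, observed, TIMES)
    print(mech, "UWLS slope estimate:", round(estimates[mech], 3),
          "(true mean slope: -0.5)")
```

Comparing the three printed estimates with the true mean slope gives a rough feel for how each mechanism affects this one estimator. The sketch makes no attempt to reproduce the bias figures reported in the abstract, which come from the authors' own design and parameter values.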

Missing Covariates in Longitudinal Data with Informative Dropouts: Bias Analysis and Inference

Biometrics, 2005

We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values;

Joint modeling of longitudinal data and informative dropout time in the presence of multiple changepoints

Statistics in Medicine, 2011

In longitudinal studies of patients with the human immunodeficiency virus (HIV), objectives of interest often include modeling of individual-level trajectories of HIV ribonucleic acid (RNA) as a function of time. Such models can be used to predict the effects of different treatment regimens or to classify subjects into subgroups with similar trajectories. Empirical evidence, however, suggests that individual trajectories often possess multiple points of rapid change, which may vary from subject to subject. Additionally, some individuals may end up dropping out of the study and the tendency to drop out may be related to the level of the biomarker. Modeling of individual viral RNA profiles is challenging in the presence of these changes, and currently available methods do not address all the issues such as multiple changes, informative dropout, clustering, etc. in a single model.

A Bayesian model for longitudinal count data with non-ignorable dropout

Journal of the Royal Statistical Society: Series C (Applied Statistics), 2008

Asthma is an important chronic disease of childhood. An intervention programme for managing asthma was designed on principles of self-regulation and was evaluated by a randomized longitudinal study. The study focused on several outcomes, and, typically, missing data remained a pervasive problem. We develop a pattern-mixture model to evaluate the outcome of intervention on the number of hospitalizations with non-ignorable dropouts. Pattern-mixture models are not generally identifiable as no data may be available to estimate a number of model parameters. Sensitivity analyses are performed by imposing structures on the unidentified parameters. We propose a parameterization which permits sensitivity analyses on clustered longitudinal count data that have missing values due to non-ignorable missing data mechanisms. This parameterization is expressed as ratios between event rates across missing data patterns and the observed data pattern and thus measures departures from an ignorable missing data mechanism. Sensitivity analyses are performed within a Bayesian framework by averaging over different prior distributions on the event ratios. This model has the advantage of providing an intuitive and flexible framework for incorporating the uncertainty of the missing data mechanism in the final analysis.
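
The rate-ratio parameterization mentioned in this abstract can be sketched numerically. The following Python is a deliberately simplified, non-Bayesian illustration and not the authors' model: the function name `marginal_event_rate`, the three drop-out patterns and all numbers are hypothetical. Each pattern's event rate is written as the observed-pattern rate multiplied by a ratio, so a ratio of 1 corresponds to an ignorable mechanism, and the marginal rate is the pattern-probability-weighted mixture.

```python
import numpy as np

def marginal_event_rate(pattern_probs, observed_rate, rate_ratios):
    """Mix pattern-specific event rates into one marginal rate.

    pattern_probs : proportion of subjects in each missing-data pattern
    observed_rate : event rate estimated from the observed-data pattern
    rate_ratios   : assumed ratio (pattern rate / observed rate) per pattern;
                    these are sensitivity parameters, not estimated from data
    """
    pattern_probs = np.asarray(pattern_probs, dtype=float)
    rate_ratios = np.asarray(rate_ratios, dtype=float)
    if not np.isclose(pattern_probs.sum(), 1.0):
        raise ValueError("pattern_probs must sum to 1")
    pattern_rates = observed_rate * rate_ratios
    return float(np.sum(pattern_probs * pattern_rates))

# Hypothetical inputs: completers, late drop-outs, early drop-outs.
PATTERN_PROBS = [0.7, 0.2, 0.1]
OBSERVED_RATE = 2.0  # events per person-year among completers (made up)

# Sensitivity analysis over a grid of ratios shared by both drop-out patterns.
for r in (0.5, 1.0, 1.5, 2.0):
    rate = marginal_event_rate(PATTERN_PROBS, OBSERVED_RATE, [1.0, r, r])
    print(f"ratio = {r:.1f} -> marginal rate = {rate:.2f}")
```

A Bayesian version, as the abstract describes, would place prior distributions on the ratios and average over them rather than stepping through a fixed grid.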

The Practical Use of Different Strategies to Handle Dropout in Longitudinal Studies

Therapeutic Innovation & Regulatory Science, 2001

In the presence of dropout, valid statistical inferences based on longitudinal data can, in general, only be obtained from modeling the measurement process and the dropout process simultaneously. Many models have been proposed in the statistical literature, most of which have been formulated within the framework of selection models or pattern-mixture models. In this paper, we will use continuous data from a longitudinal clinical trial with a 24% dropout rate to illustrate some of the models frequently used in practice. We emphasize the underlying implicit assumptions made by the different approaches, and the sensitivity of the results with respect to these assumptions. The merits and drawbacks of the procedures are extensively discussed and compared from a practical point of view.

A sensitivity approach to modeling longitudinal bivariate ordered data subject to informative dropouts

Health Services and Outcomes Research Methodology, 2006

Incomplete data abound in epidemiological and clinical studies. When the missing data process is not properly investigated, inferences may be misleading. An increasing number of models that incorporate nonrandom incomplete data have become available. At the same time, however, serious doubts have arisen about the validity of these models, known to rely on strong and unverifiable assumptions. A common conclusion emerging from the current literature is the clear need for a sensitivity analysis. We propose in this paper a detailed sensitivity analysis using graphical and analytical techniques to understand the impact of missing-data assumptions on inferences. Specifically, we explore the influence of perturbing a missing at random model locally in the direction of non-random dropout models. Data from a psychiatric trial are used to illustrate the methodology.

Keywords: Bivariate ordinal outcomes · Latent variable · Maximum marginal likelihood · Non-ignorable dropout · Non-parametric density estimation · Repeated measures · Sensitivity analysis · Shared random effects · Threshold crossing model

1 Introduction

The problem of missing data and specifically that of dropouts is common throughout statistical work and is almost ever present in the analysis of longitudinal data. In

Shared parameter and copula models for analysis of semicontinuous longitudinal data with nonrandom dropout and informative censoring

Statistical Methods in Medical Research, 2021

Analysis of longitudinal semicontinuous data characterized by subjects’ attrition triggered by nonrandom dropout is complex and requires accounting for the within-subject correlation and modeling the dropout process. While methods that address the within-subject correlation and missing data are available, approaches that incorporate the nonrandom dropout, also referred to as informative right censoring, in the modeling step are scarce due to the computational intensity and possibly intractable integration needed for their implementation. Appreciating the complexity of this problem and the need for a new methodology that is feasible for implementation, we propose to extend a framework of likelihood-based marginalized two-part models to account for informative right censoring. The censoring process is modeled using two approaches: (1) Poisson censoring for the count of visits before dropout and (2) survival time to dropout. Novel consideration was given to the proposed joint modeling a...