Joint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited

Standard error estimation using the EM algorithm for the joint modeling of survival and longitudinal data

Biostatistics, 2014

Joint modeling of survival and longitudinal data has been studied extensively in the recent literature. The likelihood approach is one of the most popular estimation methods employed within the joint modeling framework. Typically, the parameters are estimated by maximum likelihood, with computation performed by the expectation maximization (EM) algorithm. However, one drawback of this approach is that standard error (SE) estimates are not automatically produced by the EM algorithm. Many procedures have been proposed for obtaining the asymptotic covariance matrix of the parameters when the number of parameters is small. In the joint modeling context, however, there may be an infinite-dimensional parameter, the baseline hazard function, which greatly complicates the problem, so that existing methods cannot be readily applied. The profile likelihood and bootstrap methods overcome the difficulty to some extent; however, they can be computationally int...
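As a minimal illustration of the generic problem the abstract describes (not the paper's own proposal), once an EM algorithm has converged to the MLE, one classical route to SEs is to differentiate the observed log-likelihood numerically at the maximum. The sketch below uses a toy exponential survival model with simulated data, so the numerical answer can be checked against the closed form; all values are hypothetical.

```python
import numpy as np

# Toy model: exponential survival times with rate lambda.
rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=200)   # simulated data, true rate 0.5

def loglik(lam):
    # Exponential log-likelihood: n*log(lam) - lam * sum(x)
    return x.size * np.log(lam) - lam * x.sum()

lam_hat = x.size / x.sum()                 # MLE of the rate

# Observed information = minus the second derivative at the MLE,
# approximated here by central differences.
h = 1e-5
info = -(loglik(lam_hat + h) - 2 * loglik(lam_hat) + loglik(lam_hat - h)) / h**2
se = 1.0 / np.sqrt(info)                   # analytic value: lam_hat / sqrt(n)
```

In the joint-modeling setting this simple recipe breaks down precisely because the baseline hazard makes the parameter infinite-dimensional, which is the gap the paper addresses.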

Joint Modelling of Longitudinal and Survival Data: A Comparison of Joint and Independent Models

2011

In recent years, interest in longitudinal data analysis has grown rapidly through the development of new methods and the increase in computational power to aid and further develop this field of research. One such method is the joint modelling of longitudinal and survival data. In medical longitudinal studies it is common for both repeated measures and time-to-event data to be collected. These processes are typically correlated, with both types of data associated through unobserved random effects. Due to this association, joint models were developed to model both processes simultaneously and more accurately. When these processes are correlated, the use of independent models can cause biased estimates [3, 6, 7], with joint models resulting in a reduction in the standard error of estimates. Thus, with more accurate parameter estimates, valid inferences concerning the effect of covariates on the longitudinal and survival processes can be obtai...

Semi-Parametric Joint Modeling of Survival and Longitudinal Data: The R Package JSM

Journal of Statistical Software

This paper is devoted to the R package JSM, which performs joint statistical modeling of survival and longitudinal data. In biomedical studies it has become increasingly common to collect both baseline and longitudinal covariates along with a possibly censored survival time. Instead of analyzing the survival and longitudinal outcomes separately, joint modeling approaches have attracted substantial attention in the recent literature and have been shown to correct biases from separate modeling approaches and enhance information. Most existing approaches adopt a linear mixed effects model for the longitudinal component and the Cox proportional hazards model for the survival component. We extend the Cox model to a more general class of transformation models for the survival process, where the baseline hazard function is completely unspecified, leading to semiparametric survival models. We also offer a non-parametric multiplicative random effects model for the longitudinal process in JSM in addition to the linear mixed effects model. In this paper, we present the joint modeling framework that is implemented in JSM, as well as the standard error estimation methods, and illustrate the package with two real data examples: a liver cirrhosis dataset and a Mayo Clinic primary biliary cirrhosis dataset.

Comparing crossing hazard rate functions by joint modelling of survival and longitudinal data

Journal of Statistical Computation and Simulation, 2019

Comparing two hazard rate functions to evaluate a treatment effect is an important issue in survival analysis. It is quite common for the two hazard rate functions to cross each other at one or more unknown time points, representing temporal changes in the treatment effect. In certain applications, besides survival data, related longitudinal data are also available on some time-dependent covariates. In such cases, a joint model that accommodates both types of data allows us to infer the association between the survival and longitudinal data and to assess the treatment effect better. In this paper, we propose a modeling approach for comparing two crossing hazard rate functions by jointly modeling survival and longitudinal data. The parameters of the proposed joint model are estimated by maximum likelihood via the EM algorithm. Asymptotic properties of the maximum likelihood estimators are studied. To illustrate the virtues of the proposed method, we compare its performance with several existing methods in a simulation study. The proposed method is also demonstrated using a real dataset obtained from an HIV clinical trial.
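To make the crossing-hazards setting concrete (this is an illustration only, not the paper's model), two Weibull hazards with different shape parameters must cross exactly once; the crossing point can be located numerically and checked against the closed form. All parameter values below are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

# Two Weibull hazards with unit scale, h(t) = p * t**(p - 1):
# a decreasing hazard (p = 0.7) versus an increasing one (p = 1.4).
def h_a(t):
    return 0.7 * t ** (0.7 - 1.0)

def h_b(t):
    return 1.4 * t ** (1.4 - 1.0)

# The hazards cross where h_a(t) = h_b(t); here t^0.7 = 0.5, so
# the closed form is t* = 0.5 ** (1 / 0.7).
t_cross = brentq(lambda t: h_a(t) - h_b(t), 0.01, 10.0)
```

Before the crossing the first group has the higher hazard; after it, the second group does, which is exactly the "temporal change of treatment effect" the abstract refers to.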

A modified two-stage approach for joint modelling of longitudinal and time-to-event data

Journal of Statistical Computation and Simulation, 2018

Joint models for longitudinal and time-to-event data have been applied in many different fields of statistics and clinical studies. However, the main difficulty these models face is computational: the requirement for numerical integration becomes severe as the dimension of the random effects increases. In this paper, a modified two-stage approach is proposed to estimate the parameters in joint models. In the first stage, linear mixed-effects models and best linear unbiased predictors are applied to estimate the parameters in the longitudinal submodel. In the second stage, an approximation of the full joint log-likelihood is constructed using the estimated values of these parameters from the longitudinal submodel, and the survival parameters are estimated by maximizing this approximation. Simulation studies show that the approach performs well, especially when the dimension of the random effects increases. Finally, we apply this approach to AIDS data.
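The two-stage idea can be sketched under strong simplifications (this is not the paper's estimator): per-subject least-squares lines stand in for the mixed-model fits and BLUPs of the first stage, there is no censoring, and the second stage maximizes a Cox partial likelihood with the fitted trajectory value plugged in as a time-dependent covariate. All data and parameter values are simulated and hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
n = 60
obs_times = np.linspace(0.0, 2.0, 5)

# Simulated subject-specific linear trajectories m_i(t) = b0_i + b1_i * t
b0 = rng.normal(0.0, 1.0, n)
b1 = rng.normal(0.5, 0.2, n)

# Stage 1: per-subject least-squares fits (a crude stand-in for LMM + BLUPs).
X = np.column_stack([np.ones_like(obs_times), obs_times])
coef = np.empty((n, 2))
for i in range(n):
    y_i = b0[i] + b1[i] * obs_times + rng.normal(0.0, 0.1, obs_times.size)
    coef[i] = np.linalg.lstsq(X, y_i, rcond=None)[0]

# Event times (no censoring, for simplicity) with risk tied to b0_i.
T = rng.exponential(scale=np.exp(-0.7 * b0))

# Stage 2: maximize the Cox partial likelihood, plugging the Stage-1 fitted
# trajectory value m_hat_i(t) in as a time-dependent covariate.
order = np.argsort(T)

def neg_partial_loglik(alpha):
    ll = 0.0
    for k in range(n):
        t_k = T[order[k]]
        risk = order[k:]                       # subjects still at risk at t_k
        eta = alpha * (coef[risk, 0] + coef[risk, 1] * t_k)
        ll += eta[0] - np.log(np.exp(eta).sum())
    return -ll

res = minimize_scalar(neg_partial_loglik, bounds=(-5.0, 5.0), method="bounded")
alpha_hat = res.x
```

The paper's modification concerns how the second-stage approximation to the full joint log-likelihood is built; the sketch only shows the plug-in structure that both naive and modified two-stage methods share.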

Revisiting Methods For Modeling Longitudinal and Survival Data: The Framingham Heart Study

2020

Background: Statistical methods for modeling longitudinal and time-to-event data have received much attention in medical research and are becoming increasingly useful. In clinical studies, such as those of cancer and AIDS, longitudinal biomarkers are used to monitor disease progression and to predict survival. These longitudinal measures are often missing at failure times and may be prone to measurement error. More importantly, time-dependent survival models that include the raw longitudinal measurements may lead to biased results. In previous studies these two types of data are frequently analyzed separately, where a mixed effects model is used for the longitudinal data and a survival model is applied to the event outcome. Methods: In this paper we compare joint maximum likelihood methods, a two-step approach, and a time-dependent covariate method that link longitudinal data to survival data, with emphasis on using longitudinal measures to predict survival. We apply a Bayesian semi-parametric joi...

Joint modelling of longitudinal and survival data: Incorporating delayed entry and an assessment of model misspecification

A now common goal in medical research is to investigate the inter-relationships between a repeatedly measured biomarker, measured with error, and the time to an event of interest. This form of question can be tackled with a joint longitudinal-survival model, with the most common approach combining a longitudinal mixed effects model with a proportional hazards survival model, where the models are linked through shared random effects. In this article, we look at incorporating delayed entry (left truncation), which has received relatively little attention. The extension to delayed entry requires a second layer of numerical integration, beyond that required in a standard joint model. We therefore implement two sets of fully adaptive Gauss-Hermite quadrature with nested Gauss-Kronrod quadrature (to allow time-dependent association structures), conducted simultaneously, to evaluate the likelihood. We evaluate fully adaptive quadrature compared with previously proposed non-adaptive quadrature through a simulation study, showing substantial improvements both in minimising bias and in reducing computation time. We further investigate, through simulation, the consequences of misspecifying the longitudinal trajectory and its impact on estimates of association. In our scenarios the current-value association structure was very robust, whereas the rate-of-change association was highly sensitive, showing that assuming a simpler trend when the truth is more complex can lead to substantial bias. With emphasis on flexible parametric approaches, we generalise previous models by proposing the use of polynomials or splines to capture the longitudinal trend and restricted cubic splines to model the baseline log hazard function.
The methods are illustrated on a dataset of breast cancer patients, modelling mammographic density jointly with survival, where we show how to incorporate density measurements prior to the at-risk period, to make use of all the available information. User-friendly Stata software is provided.
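The quadrature at the heart of this likelihood evaluation can be sketched for a scalar normal random effect (a simplified illustration, not the paper's implementation): plain Gauss-Hermite quadrature integrates against the random-effects density directly, while the adaptive variant recentres and rescales the nodes at an estimated mode before undoing the Gaussian weight analytically. The integrand, sigma, and the recentring values below are hypothetical.

```python
import numpy as np

def gh_expectation(f, sigma, n_nodes=20):
    # Plain Gauss-Hermite: E[f(b)] for b ~ N(0, sigma^2) equals
    # (1/sqrt(pi)) * sum_k w_k * f(sqrt(2)*sigma*x_k).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    return (w * f(np.sqrt(2.0) * sigma * x)).sum() / np.sqrt(np.pi)

def adaptive_gh_expectation(f, sigma, mu_hat, tau_hat, n_nodes=20):
    # Adaptive variant: place the nodes at mu_hat + sqrt(2)*tau_hat*x_k
    # (mode and curvature-based scale of the integrand), then reweight so the
    # rule still targets the integral of f(b) * N(b; 0, sigma^2).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    b = mu_hat + np.sqrt(2.0) * tau_hat * x
    dens = np.exp(-b ** 2 / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return (np.sqrt(2.0) * tau_hat * w * np.exp(x ** 2) * f(b) * dens).sum()

sigma = 0.8
exact = np.exp(sigma ** 2 / 2.0)   # closed form for E[exp(b)], b ~ N(0, sigma^2)
plain = gh_expectation(np.exp, sigma)
# For f = exp the integrand's mode is at b = sigma^2, so recentre there.
adaptive = adaptive_gh_expectation(np.exp, sigma, mu_hat=sigma ** 2, tau_hat=sigma)
```

The payoff of the adaptive rule is that far fewer nodes are needed when the integrand is sharply peaked away from zero, which is exactly the situation in subject-level joint-model likelihood contributions.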

A General Implementation of TMLE for Longitudinal Data Applied to Causal Inference in Survival Analysis

The International Journal of Biostatistics, 2012

In many randomized controlled trials the outcome of interest is a time to event, and one measures on each subject baseline covariates and time-dependent covariates until the subject either drops out, the time to event is observed, or the end of study is reached. The goal of such a study is to assess the causal effect of the treatment on the survival curve. We present a targeted maximum likelihood estimator of the causal effect of treatment on survival fully utilizing all the available covariate information, resulting in a doubly robust, locally efficient substitution estimator that will be consistent and asymptotically linear if either the censoring mechanism is consistently estimated, or if the maximum likelihood based estimator is already consistent. In particular, under the independent censoring assumption assumed by current methods, this TMLE is always consistent and asymptotically linear, so that it provides valid confidence intervals and tests. Furthermore, we show that when both the censoring mechanism and the initial maximum likelihood based estimator are mis-specified, and thus inconsistent, the TMLE exhibits stability when inverse probability weighted estimators and doubly robust estimating equation based methods break down. The TMLE is used to analyze the Tshepo study, a study designed to evaluate the efficacy, tolerability, and development of drug resistance of six different first-line antiretroviral therapies. Most importantly, this paper presents a general algorithm that may be used to create targeted maximum likelihood estimators of a large class of parameters of interest for general longitudinal data structures.

Joint modeling of longitudinal and survival data

The joint modelling of longitudinal and survival data has received remarkable attention in the methodological literature over the past decade; however, the availability of software to implement the methods lags behind. The most common form of joint model assumes that the association between the survival and longitudinal processes is induced by shared random effects. As a result, computationally intensive numerical integration techniques such as adaptive Gauss-Hermite quadrature are required to evaluate the likelihood. We describe a new user-written command, stjm, which allows the user to jointly model a continuous longitudinal response and the time to an event of interest. We assume a linear mixed effects model for the longitudinal submodel, allowing flexibility through the use of fixed and/or random fractional polynomials of time. Four choices are available for the survival submodel; namely the exponential, Weibull or Gompertz proportional hazards models, and the flexible parametric model (stpm2). Flexible parametric models are fitted on the log cumulative hazard scale, which has direct computational benefits as it avoids the use of numerical integration to evaluate the cumulative hazard. We describe the features of stjm through application to a dataset investigating the effect of serum bilirubin level on time to death from any cause, in 312 patients with primary biliary cirrhosis.
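The computational benefit of working on the log cumulative hazard scale can be shown with the simplest special case (a sketch, not the stpm2 implementation): a Weibull model, where log H(t) is linear in log t. Models like stpm2 replace the straight line with restricted cubic splines, but the key identities are the same; the parameter values below are purely illustrative.

```python
import numpy as np

# Weibull special case of a model on the log cumulative hazard scale:
# log H(t) = g0 + g1 * log(t).  Illustrative parameters only.
g0, g1 = np.log(0.1), 1.5

def cum_hazard(t):
    return np.exp(g0 + g1 * np.log(t))

def survival(t):
    # S(t) = exp(-H(t)): available directly, no numerical integration needed.
    return np.exp(-cum_hazard(t))

def hazard(t):
    # h(t) = dH/dt = H(t) * g1 / t by the chain rule; in the spline version
    # g1 is replaced by the spline's derivative in log t.
    return cum_hazard(t) * g1 / t
```

Because the cumulative hazard is modelled directly, the survival function is a closed-form transform of the linear predictor; a model on the log hazard scale would instead have to integrate h(t) numerically at every likelihood evaluation.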

Modeling Longitudinal Data with Nonparametric Multiplicative Random Effects Jointly with Survival Data

Biometrics, 2008

In clinical studies, longitudinal biomarkers are often used to monitor disease progression and failure time. Joint modeling of longitudinal and survival data has certain advantages and has emerged as an effective way to mutually enhance information. Typically, a parametric longitudinal model is assumed to facilitate the likelihood approach. However, the choice of a proper parametric model turns out to be more elusive than in standard longitudinal studies in which no survival endpoint occurs. In this article, we propose a nonparametric multiplicative random effects model for the longitudinal process, which has many applications and leads to a flexible yet parsimonious nonparametric random effects model. A proportional hazards model is then used to link the biomarkers and event time. We use B-splines to represent the nonparametric longitudinal process, and select the number of knots and the degree based on a version of the Akaike information criterion (AIC). Unknown model parameters are estimated by maximizing the observed joint likelihood via the Monte Carlo Expectation Maximization (MCEM) algorithm. Due to the simplicity of the model structure, the proposed approach has good numerical stability and compares well with the competing parametric longitudinal approaches. The new approach is illustrated with primary biliary cirrhosis (PBC) data, aiming to capture nonlinear patterns of serum bilirubin time courses and their relationship with survival time of PBC patients.
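The B-spline representation with AIC-based knot selection can be sketched in isolation (the data, knot grid, and candidate range below are hypothetical, and the real method selects within a joint likelihood rather than a simple regression): build a cubic B-spline basis, fit by least squares for each candidate number of interior knots, and keep the AIC minimizer.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 120)
y = np.sin(t) + rng.normal(0.0, 0.2, t.size)   # noisy biomarker trajectory

def bspline_design(t, n_interior, degree=3):
    # Clamped knot vector with equally spaced interior knots; evaluating a
    # BSpline with identity coefficients yields the full basis matrix.
    interior = np.linspace(t.min(), t.max(), n_interior + 2)[1:-1]
    knots = np.r_[[t.min()] * (degree + 1), interior, [t.max()] * (degree + 1)]
    n_basis = len(knots) - degree - 1
    return BSpline(knots, np.eye(n_basis), degree)(t)

def aic(t, y, n_interior):
    # Gaussian-regression AIC: n * log(RSS/n) + 2 * (number of basis functions)
    B = bspline_design(t, n_interior)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    rss = ((y - B @ coef) ** 2).sum()
    return t.size * np.log(rss / t.size) + 2 * B.shape[1]

best = min(range(1, 9), key=lambda m: aic(t, y, m))
```

In the paper's multiplicative random effects model the same basis represents the shared population curve, with subject-specific scalar random effects multiplying it, which is what keeps the model parsimonious.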