Semiparametric GEE analysis in partially linear single-index models for longitudinal data

2018

In this article, we study a partially linear single-index model for longitudinal data under a general framework which includes both the sparse and dense longitudinal data cases. A semiparametric estimation method based on a combination of local linear smoothing and generalized estimating equations (GEE) is introduced to estimate the two parameter vectors as well as the unknown link function. Under some mild conditions, we derive the asymptotic properties of the proposed parametric and nonparametric estimators in different scenarios, from which we find that the convergence rates and asymptotic variances of the proposed estimators for sparse longitudinal data would be substantially different from those for dense longitudinal data. We also discuss the estimation of the covariance (or weight) matrices involved in the semiparametric GEE method. Furthermore, we provide some numerical studies including Monte Carlo simulation and an empirical application to illustrate our methodology an...
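
The local linear smoothing step mentioned in this abstract can be illustrated in isolation. The sketch below is not the paper's estimator: it fits a weighted straight line around a single point for a scalar covariate, whereas the single-index model would smooth over an estimated index. The function name `local_linear`, the Gaussian kernel, and the bandwidth value are assumptions made for illustration.

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local linear estimate of E[y | x = x0] using a Gaussian kernel with bandwidth h."""
    d = x - x0
    w = np.exp(-0.5 * (d / h) ** 2)             # kernel weights centred at x0
    X = np.column_stack([np.ones_like(d), d])   # local design: intercept and slope
    WX = X * w[:, None]
    # Weighted least squares: solve (X' W X) b = X' W y
    b = np.linalg.solve(X.T @ WX, WX.T @ y)
    return b[0], b[1]                           # fitted value and first derivative at x0

# Simulated data (hypothetical): a smooth curve observed with noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(200)
fit, slope = local_linear(x, y, 0.25, 0.05)
```

A local linear fit reproduces exactly linear data regardless of the bandwidth, which gives a quick sanity check on the implementation.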

Semiparametric Estimation of Covariance Matrixes for Longitudinal Data

Journal of the American Statistical Association, 2008

Estimation of longitudinal data covariance structure poses significant challenges because the data are usually collected at irregular time points. A viable semiparametric model for covariance matrices was proposed in Fan, that allows one to estimate the variance function nonparametrically and to estimate the correlation function parametrically via aggregating information from irregular and sparse data points within each subject. However, the asymptotic properties of their quasi-maximum likelihood estimator (QMLE) of parameters in the covariance model are largely unknown. In the current work, we address this problem in the context of more general models for the conditional mean function, including parametric, nonparametric, and semiparametric models. We also consider the possibility of a rough mean regression function and introduce the difference-based method to reduce biases in the context of varying-coefficient partially linear mean regression models. This provides a more robust estimator of the covariance function under a wider range of situations. Under some technical conditions, consistency and asymptotic normality are obtained for the QMLE of the parameters in the correlation function. Simulation studies and a real data example are used to illustrate the proposed approach.

New Estimation and Model Selection Procedures for Semiparametric Modeling in Longitudinal Data Analysis

Journal of the American Statistical Association, 2004

Semiparametric regression models are very useful for longitudinal data analysis. The complexity of semiparametric models and the structure of longitudinal data pose new challenges to parametric inferences and model selection that frequently arise from longitudinal data analysis. In this article, two new approaches are proposed for estimating the regression coefficients in a semiparametric model. The asymptotic normality of the resulting estimators is established. An innovative class of variable selection procedures is proposed to select significant variables in the semiparametric models. The proposed procedures are distinguished from others in that they simultaneously select significant variables and estimate unknown parameters. Rates of convergence of the resulting estimators are established. With a proper choice of regularization parameters and penalty functions, the proposed variable selection procedures are shown to perform as well as an oracle estimator. A robust standard error formula is derived using a sandwich formula and is empirically tested. Local polynomial regression techniques are used to estimate the baseline function in the semiparametric model.
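
The sandwich formula mentioned in this abstract has a generic form that is easy to sketch. The example below is an assumption-laden illustration, not the paper's formula for its penalized estimators: it uses ordinary least squares and treats each subject as a cluster, so the "bread" is the inverse of X'X and the "meat" sums the outer products of per-subject score contributions. The function name `cluster_sandwich` and the simulated data are hypothetical.

```python
import numpy as np

def cluster_sandwich(X, y, groups):
    """OLS coefficients with a cluster-robust (sandwich) covariance estimate."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        idx = groups == g
        s = X[idx].T @ resid[idx]     # score contribution of subject g
        meat += np.outer(s, s)
    cov = bread @ meat @ bread        # bread * meat * bread
    return beta, cov

# Simulated longitudinal data (hypothetical): 50 subjects, 4 visits each,
# with a shared random effect inducing within-subject correlation.
rng = np.random.default_rng(1)
n_subj, n_obs = 50, 4
groups = np.repeat(np.arange(n_subj), n_obs)
X = np.column_stack([np.ones(n_subj * n_obs), rng.standard_normal(n_subj * n_obs)])
b_i = np.repeat(rng.standard_normal(n_subj), n_obs)
y = X @ np.array([1.0, 2.0]) + b_i + 0.5 * rng.standard_normal(n_subj * n_obs)
beta, cov = cluster_sandwich(X, y, groups)
se = np.sqrt(np.diag(cov))            # robust standard errors
```

The point of the construction is that `se` remains valid under within-subject correlation even though the least-squares fit itself ignores that correlation.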

Efficient Semiparametric Marginal Estimation for Longitudinal/Clustered Data

Journal of the American Statistical Association, 2005

We consider marginal generalized semiparametric partially linear models for clustered data. Lin and Carroll derived the semiparametric efficient score function for this problem in the multivariate Gaussian case, but they were unable to construct a semiparametric efficient estimator that ...

On The Estimation of a Semiparametric Generalized Linear Model

2015

In this article, estimation methods of the semiparametric generalized linear model known as the generalized partial linear model (GPLM) are reviewed. These methods are based on using kernel smoothing functions in the estimation of the nonparametric component of the model. We derive the algorithms for the estimation process and develop these algorithms for the generalized partial linear model (GPLM) with a binary response.

A New Mixed Estimator in Nonparametric Regression for Longitudinal Data

Journal of Mathematics

We introduce a new method for estimating the nonparametric regression curve for longitudinal data. This method combines two estimators: truncated spline and Fourier series. This estimation is completed by minimizing the penalized weighted least squares and weighted least squares. This paper also provides the properties of the new mixed estimator, which are biased and linear in the observations. The best model is selected using the smallest value of generalized cross-validation. The performance of the new method is demonstrated by a simulation study with a variety of time points. Then, the proposed approach is applied to a stroke patient dataset. The results show that simulated data and real data yield consistent findings.

On The Estimation of a Semiparametric Generalized Linear Model

2000

In this article, estimation methods of the semiparametric generalized linear model known as the generalized partial linear model (GPLM) are reviewed. These methods are based on using kernel smoothing functions in the estimation of the nonparametric component of the model. We derive the algorithms for the estimation process and develop these algorithms for the generalized partial linear model (GPLM) with

Nonparametric Regression of Covariance Structures in Longitudinal Studies

2009

In this paper we propose a nonparametric data-driven approach to model covariance structures for longitudinal data. Based on a modified Cholesky decomposition, the within-subject covariance matrix is decomposed into a unit lower triangular matrix involving generalized autoregressive coefficients and a diagonal matrix involving innovation variances. Local polynomial smoothing estimation is proposed to model the nonparametric smoothing functions of the mean, generalized autoregressive coefficients and (log) innovation variances, simultaneously. We provide theoretical justification of consistency of the fitted smoothing curves in the mean, generalized autoregressive parameters and (log) innovation variances. Two real data sets are analyzed for illustration. Simulation studies are conducted to evaluate the efficacy of the proposed method.
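
The modified Cholesky decomposition described in this abstract can be computed from an ordinary Cholesky factor. The following is a minimal sketch under the usual convention T Σ T' = D, where T is unit lower triangular (its below-diagonal entries are the negatives of the generalized autoregressive coefficients) and D holds the innovation variances. The function name `modified_cholesky` and the AR(1) example matrix are assumptions for illustration; the paper's smoothing of these quantities is not shown.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular and T @ sigma @ T.T = diag(D)."""
    L = np.linalg.cholesky(sigma)   # sigma = L L'
    d_sqrt = np.diag(L)
    C = L / d_sqrt                  # scale each column: sigma = C diag(D) C'
    T = np.linalg.inv(C)            # inverse of a unit lower triangular matrix
    D = d_sqrt ** 2                 # innovation variances
    return T, D

# AR(1) covariance with correlation 0.6 across 4 time points (hypothetical example).
rho = 0.6
t = np.arange(4)
sigma = rho ** np.abs(t[:, None] - t[None, :])
T, D = modified_cholesky(sigma)
```

For an AR(1) covariance the first subdiagonal of `T` equals `-rho` and every innovation variance after the first equals `1 - rho**2`, which makes the link to autoregression explicit.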

Selection of covariance patterns for longitudinal data in semi-parametric models

Statistical Methods in Medical Research, 2010

The use of patterned covariance structures in the parametric analysis of longitudinal data is both elegant and efficient. However, this strategy has not been well studied for semi-parametric models for analysing such data. We propose to estimate the covariance matrix in the semi-parametric model by rearranging the non-parametric component as a profiled linear function of the data and using a local smoothing technique. This results in a parametric regression formulation that enables us to construct likelihood functions and use various information criteria to select the best fitting covariance matrix. We apply our method to reanalyse data from a two-armed clinical trial for scleroderma patients and show that our method is more efficient for estimating the parametric components in the semi-parametric model.

Efficient parameter estimation in longitudinal data analysis using a hybrid GEE method

Biostatistics, 2009

The method of generalized estimating equations (GEEs) provides consistent estimates of the regression parameters in a marginal regression model for longitudinal data, even when the working correlation model is misspecified (Liang and Zeger, 1986). However, the efficiency of a GEE estimate can be seriously affected by the choice of the working correlation model. This study addresses this problem by proposing a hybrid method that combines multiple GEEs based on different working correlation models, using the empirical likelihood method (Qin and Lawless, 1994). Analyses show that this hybrid method is more efficient than a GEE using a misspecified working correlation model. Furthermore, if one of the working correlation structures correctly models the within-subject correlations, then this hybrid method provides the most efficient parameter estimates. In simulations, the hybrid method's finite-sample performance is superior to a GEE under any of the commonly used working correlation models and is almost fully efficient in all scenarios studied. The hybrid method is illustrated using data from a longitudinal study of the respiratory infection rates in 275 Indonesian children.
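
The role of the working correlation model in a GEE can be seen in a small sketch. This is plain GEE with a fixed working correlation, not the hybrid empirical-likelihood method of the abstract. It assumes an identity link, a balanced design, and a known correlation parameter, so the estimating equation has a closed-form solution. The names `exchangeable` and `gee_linear` and the simulated data are hypothetical.

```python
import numpy as np

def exchangeable(m, rho):
    """Exchangeable working correlation matrix of size m."""
    return (1 - rho) * np.eye(m) + rho * np.ones((m, m))

def gee_linear(X_list, y_list, R):
    """Solve sum_i X_i' R^{-1} (y_i - X_i beta) = 0 (identity link, common R)."""
    R_inv = np.linalg.inv(R)
    A = sum(X.T @ R_inv @ X for X in X_list)
    b = sum(X.T @ R_inv @ y for X, y in zip(X_list, y_list))
    return np.linalg.solve(A, b)

# Simulated data (hypothetical): 40 subjects, 3 visits each, with a
# subject-level random intercept that induces exchangeable correlation.
rng = np.random.default_rng(2)
beta_true = np.array([1.0, -0.5])
X_list = [np.column_stack([np.ones(3), rng.standard_normal(3)]) for _ in range(40)]
y_list = [X @ beta_true + rng.standard_normal() + 0.3 * rng.standard_normal(3)
          for X in X_list]

beta_exch = gee_linear(X_list, y_list, exchangeable(3, 0.5))  # exchangeable working model
beta_ind = gee_linear(X_list, y_list, np.eye(3))              # independence working model
```

Both estimates are consistent for `beta_true`, in line with the abstract; they differ in efficiency, which is the motivation for combining several working correlation models.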