A Bayesian approach to analyse overdispersed longitudinal count data
Related papers
Computational Statistics & Data Analysis, 2013
In sets of count data, the sample variance is often considerably larger or smaller than the sample mean, a phenomenon known as over- or underdispersion. The focus is on hierarchical Bayesian modeling of such longitudinal count data. Two different models are considered. The first one assumes a Poisson distribution for the count data and includes a subject-specific intercept, which is assumed to follow a normal distribution, to account for subject heterogeneity. However, such a model does not fully address the potential problem of extra-Poisson dispersion. The second model, therefore, also includes random subject- and time-dependent parameters, assumed to be gamma distributed for reasons of conjugacy. To compare the performance of the two models, a simulation study is conducted in which the mean squared error, relative bias, and variance of the posterior means are compared.
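The difference between the two models can be seen in a small simulation. The sketch below is illustrative only and is not the authors' simulation design: the function names, the intercept-only linear predictor, the parameter values, and the unit-mean gamma parameterization are all assumptions made here.

```python
import numpy as np

def simulate_counts(n_subjects=500, n_times=6, beta0=1.0, sigma_b=0.5,
                    gamma_shape=None, seed=0):
    """Simulate longitudinal counts (hypothetical helper).

    gamma_shape=None  -> Poisson model with a normal random intercept.
    gamma_shape=a > 0 -> additionally multiplies each mean by a
                         Gamma(a, rate a) effect (mean 1, variance 1/a)
                         that varies by subject and time.
    """
    rng = np.random.default_rng(seed)
    # Subject-specific normal intercepts, shared across time points.
    b = rng.normal(0.0, sigma_b, size=(n_subjects, 1))
    mu = np.exp(beta0 + b) * np.ones((1, n_times))
    if gamma_shape is not None:
        # Subject- and time-dependent gamma effects (assumed unit mean).
        theta = rng.gamma(shape=gamma_shape, scale=1.0 / gamma_shape,
                          size=(n_subjects, n_times))
        mu = mu * theta
    return rng.poisson(mu)

def dispersion_index(y):
    """Sample variance divided by sample mean; 1 for a pure Poisson."""
    return y.var() / y.mean()

y_poisson_normal = simulate_counts(seed=1)
y_with_gamma = simulate_counts(gamma_shape=2.0, seed=1)
print(dispersion_index(y_poisson_normal), dispersion_index(y_with_gamma))
```

Under these assumed settings both data sets are overdispersed relative to a Poisson distribution, and the gamma effects add further dispersion on top of what the random intercept induces.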
A Bayesian model for longitudinal count data with non-ignorable dropout
Journal of the Royal Statistical Society: Series C (Applied Statistics), 2008
Asthma is an important chronic disease of childhood. An intervention programme for managing asthma was designed on principles of self-regulation and was evaluated by a randomized longitudinal study. The study focused on several outcomes, and, typically, missing data remained a pervasive problem. We develop a pattern-mixture model to evaluate the outcome of intervention on the number of hospitalizations with non-ignorable dropouts. Pattern-mixture models are not generally identifiable as no data may be available to estimate a number of model parameters. Sensitivity analyses are performed by imposing structures on the unidentified parameters. We propose a parameterization which permits sensitivity analyses on clustered longitudinal count data that have missing values due to non-ignorable missing data mechanisms. This parameterization is expressed as ratios between event rates across missing data patterns and the observed data pattern and thus measures departures from an ignorable missing data mechanism. Sensitivity analyses are performed within a Bayesian framework by averaging over different prior distributions on the event ratios. This model has the advantage of providing an intuitive and flexible framework for incorporating the uncertainty of the missing data mechanism in the final analysis.
A Bayesian nonparametric model for count functional data
2012
Count functional data arise in a variety of applications, including longitudinal, spatial and imaging studies measuring functional count responses for each subject under study. The literature on statistical models for dependent count data is dominated by models built from hierarchical Poisson components. The Poisson assumption is not warranted in many applications, and hierarchical Poisson models make restrictive assumptions about over-dispersion in marginal distributions. This article discusses a class of nonparametric Bayes count functional data models introduced in Canale and Dunson, which are constructed through rounding real-valued underlying processes. Computational algorithms are developed using Markov chain Monte Carlo and the methods are illustrated through application to asthma inhaler usage.
An extended random-effects approach to modeling repeated, overdispersed count data
Lifetime Data Analysis, 2007
Non-Gaussian outcomes are often modeled using members of the so-called exponential family. The Poisson model for count data falls within this tradition. The family in general, and the Poisson model in particular, are at the same time convenient since mathematically elegant, but in need of extension since often somewhat restrictive. Two of the main rationales for existing extensions are (1) the occurrence of overdispersion, in the sense that the variability in the data is not adequately captured by the model's prescribed mean-variance link, and (2) the accommodation of data hierarchies owing to, for example, repeatedly measuring the outcome on the same subject, recording information from various members of the same family, etc. There is a variety of overdispersion models for count data, such as, for example, the negative-binomial model. Hierarchies are often accommodated through the inclusion of subject-specific, random effects. Though not always, one conventionally assumes such random effects to be normally distributed. While both of these issues may occur simultaneously, models accommodating them at once are less than common. This paper proposes a generalized linear model, accommodating overdispersion and clustering through two separate sets of random effects, of gamma and normal type, respectively. This is in line with the proposal by Booth et al. (Stat Model 3:179-181, 2003). The model extends both classical overdispersion models for count data
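As a worked illustration of how two sets of random effects shape the mean-variance relationship, consider a simplified special case. The notation, the restriction to a single normal random intercept, and the unit-mean gamma effect are assumptions made here for illustration, not the paper's general formulation.

```latex
\begin{aligned}
Y_{ij} \mid b_i, \theta_{ij} &\sim \mathrm{Poisson}\!\left(\theta_{ij}\,\kappa_{ij}\right),
  \qquad \kappa_{ij} = \exp\!\left(x_{ij}^{\top}\beta + b_i\right), \\
b_i &\sim N(0, \sigma^2), \qquad
  \theta_{ij} \sim \mathrm{Gamma}(\alpha, \alpha)
  \ \text{(shape, rate)}, \quad
  \mathrm{E}[\theta_{ij}] = 1,\ \mathrm{Var}[\theta_{ij}] = 1/\alpha, \\
\mu_{ij} = \mathrm{E}[Y_{ij}] &= \exp\!\left(x_{ij}^{\top}\beta + \sigma^2/2\right), \\
\mathrm{Var}[Y_{ij}] &= \mu_{ij} + \mu_{ij}^{2}
  \left[\left(1 + \tfrac{1}{\alpha}\right) e^{\sigma^{2}} - 1\right].
\end{aligned}
```

Letting \alpha \to \infty removes the gamma effect and leaves the Poisson-normal variance \mu_{ij} + \mu_{ij}^2 (e^{\sigma^2} - 1), while setting \sigma^2 = 0 gives the negative-binomial form \mu_{ij} + \mu_{ij}^2 / \alpha.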
Nonparametric Bayes modelling of count processes
Biometrika, 2013
Data on count processes arise in a variety of applications, including longitudinal, spatial and imaging studies measuring count responses. The literature on statistical models for dependent count data is dominated by models built from hierarchical Poisson components. The Poisson assumption is not warranted in many applications, and hierarchical Poisson models make restrictive assumptions about over-dispersion in marginal distributions. This article proposes a class of nonparametric Bayes count process models, which are constructed through rounding real-valued underlying processes. The proposed class of models accommodates applications in which one observes separate count-valued functional data for each subject under study. Theoretical results on large support and posterior consistency are established, and computational algorithms are developed using Markov chain Monte Carlo. The methods are evaluated via simulation studies and illustrated through application to longitudinal tumor counts and asthma inhaler usage.
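The idea of building a count process by rounding a real-valued process can be sketched in a few lines. This is a minimal illustration under assumptions made here: the latent process is a Gaussian process with a squared-exponential covariance, and the rounding rule sends values at or below 0 to 0 and values in (j-1, j] to j. The function names and default parameters are hypothetical, and nothing below reproduces the paper's priors or its MCMC algorithm.

```python
import numpy as np

def round_to_counts(latent):
    """Map real values to counts: x <= 0 -> 0, x in (j-1, j] -> j.

    The thresholds are an assumption chosen for illustration.
    """
    return np.maximum(0, np.ceil(latent)).astype(int)

def simulate_count_process(times, mean=2.0, scale=1.5, length=1.0, seed=0):
    """Draw one latent Gaussian-process path and round it to counts."""
    rng = np.random.default_rng(seed)
    t = np.asarray(times, dtype=float)
    diff = t[:, None] - t[None, :]
    # Squared-exponential covariance with a small jitter for stability.
    cov = scale**2 * np.exp(-0.5 * (diff / length) ** 2)
    cov += 1e-6 * np.eye(len(t))
    latent = rng.multivariate_normal(np.full(len(t), mean), cov)
    return round_to_counts(latent)

counts = simulate_count_process(np.arange(10))
print(counts)
```

Because the dispersion of the rounded counts is governed by the latent process rather than tied to the mean, a construction of this kind is not bound by the Poisson mean-variance restriction.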
Computational Statistics & Data Analysis, 2008
In this paper we present the results of a simulation study to explore the ability of Bayesian parametric and nonparametric models to provide an adequate fit to count data, of the type that would routinely be analyzed parametrically either through fixed-effects or random-effects Poisson models. The context of the study is a randomized controlled trial with two groups (treatment and control). Our nonparametric approach utilizes several modeling formulations based on Dirichlet process priors. We find that the nonparametric models are able to flexibly adapt to the data, to offer rich posterior inference, and to provide, in a variety of settings, more accurate predictive inference than parametric models.
A Bayesian Approach to Account for Misclassification and Overdispersion in Count Data
Count data are subject to considerable sources of what is often referred to as non-sampling error. Errors such as misclassification, measurement error and unmeasured confounding can lead to substantially biased estimators. It is strongly recommended that epidemiologists not only acknowledge these sorts of errors in data, but incorporate sensitivity analyses into part of the total data analysis. We extend previous work on Poisson regression models that allow for misclassification by thoroughly discussing the basis for the models and allowing for extra-Poisson variability in the form of random effects. Via simulation we show the improvements in inference that are brought about by accounting for both the misclassification and the overdispersion.
A novel Bayesian regression model for counts with an application to health data
Journal of Applied Statistics, 2017
Discrete data are collected in many application areas and are often characterised by highly-skewed distributions. An example of this, which is considered in this paper, is the number of visits to a specialist, often taken as a measure of demand in healthcare. A discrete Weibull regression model was recently proposed for regression problems with a discrete response and it was shown to possess desirable properties. In this paper, we propose the first Bayesian implementation of this model. We consider a general parametrization, where both parameters of the discrete Weibull distribution can be conditioned on the predictors, and show theoretically how, under a uniform non-informative prior, the posterior distribution is proper with finite moments. In addition, we consider closely the case of Laplace priors for parameter shrinkage and variable selection. Parameter estimates and their credible intervals can be readily calculated from their full posterior distribution. A simulation study and the analysis of four real datasets of medical records show promise for the wide applicability of this approach to the analysis of count data. The method is implemented in the R package BDWreg.
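For readers unfamiliar with the discrete Weibull distribution, the sketch below evaluates its probability mass function in the commonly used type I form, P(Y = y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ..., with 0 < q < 1 and beta > 0. The abstract does not state the formula, so this parameterization is an assumption, and the Python function names are hypothetical; they are not part of the BDWreg package.

```python
import numpy as np

def dweibull_pmf(y, q, beta):
    """Type I discrete Weibull pmf (assumed parameterization)."""
    y = np.asarray(y, dtype=float)
    return q ** (y ** beta) - q ** ((y + 1.0) ** beta)

def dweibull_loglik(y, q, beta):
    """Log-likelihood of i.i.d. counts y under the pmf above."""
    return float(np.sum(np.log(dweibull_pmf(y, q, beta))))

support = np.arange(200)
# The pmf telescopes, so the sum over a long enough range is close to 1.
print(dweibull_pmf(support, q=0.7, beta=1.2).sum())
```

With beta = 1 this form reduces to the geometric distribution, P(Y = y) = q^y (1 - q). In a regression setting, q and beta would be linked to predictors, which the sketch does not attempt.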
Hierarchical Bayesian models for multiple count data Australian
2002
The aim of this paper is to develop a model for analyzing multiple count responses that can take into account complex correlation structures. The model is specified hierarchically in several layers and can be used for sparse data, as shown in the second part of the paper. It is a discrete multivariate response approach with respect to the left-hand side of the model equations. Markov chain Monte Carlo techniques are needed to extract inferential results. The possible correlation between different counts is more general than the one used in the repeated-measurements or longitudinal-studies framework.
The combined model: A tool for simulating correlated counts with overdispersion
Communications in Statistics - Simulation and Computation, 2014
The combined model, as introduced by Molenberghs et al. (2007, 2010), has been shown to be an appealing tool for modeling not only correlated or overdispersed data but also data that exhibit both of these features. Unlike techniques available in the literature prior to the combined model, which use a single random-effects vector to capture correlation and/or overdispersion, the combined model allows for the correlation and overdispersion features to be modeled by two sets of random effects. In the context of count data, for example, the combined model naturally reduces to the Poisson-normal model, an instance of the generalized linear mixed model, in the absence of overdispersion, and it also reduces to the negative-binomial model in the absence of correlation. Here, a Poisson model is specified as the parent distribution of the data conditional on a normally distributed random effect at the subject or cluster level and/or a gamma-distributed random effect at the observation level. Importantly, the development of the combined model and surrounding derivations have relevance well beyond mere data analysis. It so happens that the combined model can also be used to simulate correlated data. If a researcher is interested in comparing marginal models via Monte Carlo simulations, a need to generate suitable correlated count data arises. One option is to induce correlation via random effects, but calculation of such quantities as the bias is then not straightforward. Since overdispersion and correlation are simultaneous features of longitudinal count data, the combined model presents an appealing