Generalized fiducial inference on the mean of zero-inflated Poisson and Poisson hurdle models

Properties of the zero-and-one inflated Poisson distribution and likelihood-based inference methods

Statistics and Its Interface, 2016

To model count data with excess zeros and excess ones, Melkersson and Olsson (1999), in an unpublished manuscript, extended the zero-inflated Poisson distribution to a zero-and-one-inflated Poisson (ZOIP) distribution. However, the distributional theory and corresponding properties of the ZOIP have not yet been explored, and likelihood-based inference methods for parameters of interest were not well developed. In this paper, we extensively study the ZOIP distribution by first constructing five equivalent stochastic representations for the ZOIP random variable and then deriving other important distributional properties. Maximum likelihood estimates of parameters are obtained by both the Fisher scoring and expectation-maximization algorithms. Bootstrap confidence intervals for parameters of interest and large-sample hypothesis tests are provided. Simulation studies are performed and five real data sets are used to illustrate the proposed methods.
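As a rough illustration of the distribution this abstract studies, the sketch below evaluates a ZOIP probability mass function. It assumes the usual three-component mixture form (extra mass phi0 at zero, extra mass phi1 at one, and weight 1 - phi0 - phi1 on an ordinary Poisson with mean lam); the function name, argument names and parameter values are hypothetical and are not taken from the paper.

```python
import math

def zoip_pmf(y, phi0, phi1, lam):
    """P(Y = y) under an assumed zero-and-one-inflated Poisson mixture."""
    phi2 = 1.0 - phi0 - phi1  # weight of the ordinary Poisson component
    if min(phi0, phi1, phi2) < 0:
        raise ValueError("mixing weights must be non-negative and sum to 1")
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    # Extra point mass is added only at y = 0 and y = 1.
    extra = phi0 if y == 0 else phi1 if y == 1 else 0.0
    return extra + phi2 * poisson

# Hypothetical parameter values, chosen only for illustration.
probs = [zoip_pmf(y, phi0=0.3, phi1=0.2, lam=2.0) for y in range(60)]
total = sum(probs)                               # should be numerically 1
mean = sum(y * p for y, p in enumerate(probs))   # phi1 + phi2 * lam under this form
```

Under this assumed form the mean is phi1 + (1 - phi0 - phi1) * lam, which gives a quick sanity check on the implementation.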

Statistical model for overdispersed count outcome with many zeros : an approach for marginal inference

2016

Marginalised models are in great demand among researchers in the life sciences, particularly in clinical trials, epidemiology, health economics, surveys and many other areas, since they allow generalisation of inference to the entire population under study. For count data, standard procedures such as Poisson regression and the negative binomial model provide population-average inference for model parameters. However, the occurrence of excess zero counts and lack of independence in empirical data have necessitated their extension to accommodate these phenomena. These extensions, though useful, complicate the interpretation of effects. For example, the zero-inflated Poisson model accounts for the presence of excess zeros, but its parameter estimates do not have the direct marginal interpretation that those of its base model, the Poisson model, have. Marginalisations due to the presence of excess zeros are underdeveloped, though demand for them is high. The aim of this paper, therefore, is to deve...
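The interpretive gap described in this abstract can be made concrete with a small numerical sketch. It assumes a conventional ZIP regression with a logit link for the structural-zero probability p and a log link for the Poisson mean psi; the coefficient values and all names are hypothetical. The overall (marginal) mean is (1 - p) * psi, so the ratio of marginal means between two covariate values is generally not exp(beta1).

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def zip_marginal_mean(x, gamma, beta):
    """Overall mean of a ZIP outcome at covariate value x (assumed link functions)."""
    p = expit(gamma[0] + gamma[1] * x)      # probability of a structural zero
    psi = math.exp(beta[0] + beta[1] * x)   # mean of the Poisson component
    return (1.0 - p) * psi

# Hypothetical coefficients, for illustration only.
gamma = (-0.5, 1.0)   # zero-inflation part (logit scale)
beta = (0.2, 0.7)     # count part (log scale)

ratio_marginal = zip_marginal_mean(1, gamma, beta) / zip_marginal_mean(0, gamma, beta)
ratio_latent = math.exp(beta[1])  # rate ratio within the Poisson component only
```

Because the covariate also raises the structural-zero probability in this example, the marginal ratio is smaller than exp(beta1); a marginalised model would instead parameterise the overall mean directly.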

Statistical model for overdispersed count outcome with many zeros: an approach for direct marginal inference

2015

Marginalized models are in great demand among most researchers in the life sciences, particularly in clinical trials, epidemiology, health economics, surveys and many other areas, since they allow generalization of inference to the entire population under study. For count data, standard procedures such as Poisson regression and the negative binomial model provide population-average inference for model parameters. However, the occurrence of excess zero counts and lack of independence in empirical data have necessitated their extension to accommodate these phenomena. These extensions, though useful, complicate the interpretation of effects. For example, the zero-inflated Poisson model accounts for the presence of excess zeros, but its parameter estimates do not have the direct marginal interpretation that those of its base model, the Poisson model, have. Marginalizations due to the presence of excess zeros are underdeveloped, though demand for them is high. The aim of this paper is to develop a margina...

Modern Bayesian Inference in Zero-Inflated Poisson Models

46th Scientific Meeting of the Italian Statistical Society, 2012

The Zero-Inflated Poisson (ZIP) distribution, typically assumed for modeling count data with excess of zeros, assumes that with probability p the only possible observation is zero, and with probability 1 − p a Poisson(ψ) random variable is observed. Both the probability p and the mean ψ may depend on covariates. In this paper we discuss and apply Bayesian inference based on matching priors and on higher-order asymptotics to perform accurate inference on ψ only, even for small sample sizes.
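Written out, the mixture described in this abstract gives the following probability mass function and moments. This is the standard ZIP form in the abstract's own symbols p and ψ, with covariate dependence omitted; it is a reference sketch rather than a formula quoted from the paper.

```latex
\begin{align*}
P(Y = 0) &= p + (1 - p)\,e^{-\psi}, \\
P(Y = y) &= (1 - p)\,\frac{e^{-\psi}\psi^{y}}{y!}, \qquad y = 1, 2, \dots, \\
\mathbb{E}[Y] &= (1 - p)\,\psi, \qquad
\operatorname{Var}(Y) = (1 - p)\,\psi\,(1 + p\,\psi).
\end{align*}
```

The variance exceeds the mean whenever p > 0, which is why excess zeros and overdispersion tend to appear together.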

Zero-Modified Poisson-Lindley distribution with applications in zero-inflated and zero-deflated count data

arXiv (Cornell University), 2017

The main objective of this article is to present an extension of the zero-inflated Poisson-Lindley distribution, called the zero-modified Poisson-Lindley distribution. The additional parameter π of the zero-modified Poisson-Lindley distribution has a natural interpretation as the proportion of zero deflation or zero inflation. Inference is carried out using the likelihood approach. In particular, the maximum likelihood estimators of the distribution's parameters are compared in small and large samples. We also consider an alternative bias-correction mechanism based on Efron's bootstrap resampling. The model is applied to real data sets and found to perform better than other competing models.

A zero-inflated overdispersed hierarchical Poisson model

Statistical Modelling, 2014

Count data are collected repeatedly over time in many applications, such as biology, epidemiology, and public health. Such data are often characterized by the following three features. First, correlation due to the repeated measures is usually accounted for using subject-specific random effects, which are assumed to be normally distributed. Second, the sample variance may exceed the mean, and hence, the theoretical mean-variance relationship is violated, leading to overdispersion. This is usually accommodated through a hierarchical approach, combining a Poisson model with gamma-distributed random effects. Third, an excess of zeros beyond what standard count distributions can predict is often handled by either the hurdle or the zero-inflated model. A zero-inflated model assumes two processes as sources of zeros and combines a count distribution with a discrete point mass as a mixture, while the hurdle model handles zero observations and positive counts separately, using a zero-truncated count distribution for the non-zero state. In practice, however, all three of these features can appear simultaneously. Hence, a modeling framework that incorporates all three is necessary, and this presents challenges for the data analysis. Such models, when conditionally specified, will naturally have a subject-specific interpretation. However, adopting their purposefully modified marginalized versions leads to a direct marginal or population-averaged interpretation for parameter estimates of covariate effects, which is the primary interest in many applications. In this paper, we present a marginalized hurdle model and a marginalized zero-inflated model for correlated and overdispersed count data with excess zero observations and then illustrate these further with two case studies.
The first dataset focuses on the Anopheles mosquito density around a hydroelectric dam, while adolescents' involvement in work, to earn money and support their families or themselves, is studied in the second example. Sub-models, which result from omitting the zero-inflation and/or overdispersion features, are also considered for comparison purposes. Analysis of the two datasets showed that accounting for the correlation, overdispersion, and excess zeros simultaneously resulted in a better fit to the data and, more importantly, that omitting any of them led to incorrect marginal inference and erroneous conclusions about covariate effects.
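The distinction this abstract draws between the two ways of handling zeros can be sketched in a few lines. The example below uses a plain Poisson count distribution with no random effects or covariates, so it is far simpler than the models in the paper; the function names and parameter values are hypothetical.

```python
import math

def poisson_pmf(y, lam):
    return math.exp(-lam) * lam**y / math.factorial(y)

def zero_inflated_pmf(y, pi_zero, lam):
    """Mixture: point mass at zero with weight pi_zero, Poisson otherwise."""
    base = (1.0 - pi_zero) * poisson_pmf(y, lam)
    # Zeros come from two sources: the point mass and the Poisson component.
    return pi_zero + base if y == 0 else base

def hurdle_pmf(y, pi_zero, lam):
    """Two parts: P(Y = 0) = pi_zero; positives follow a zero-truncated Poisson."""
    if y == 0:
        return pi_zero
    return (1.0 - pi_zero) * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))

# Hypothetical parameter values, for illustration only.
pi_zero, lam = 0.4, 1.5
zi_zero = zero_inflated_pmf(0, pi_zero, lam)  # pi_zero plus Poisson zeros
hu_zero = hurdle_pmf(0, pi_zero, lam)         # exactly pi_zero
```

With the same parameter values, the zero-inflated model puts more mass at zero than the hurdle model, because its Poisson component also generates zeros. In the hurdle model the zero probability is governed by pi_zero alone.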

The analysis of zero-inflated count data: Beyond zero-inflated Poisson regression

British Journal of Mathematical and Statistical Psychology, 2012

Infrequent count data in psychological research are commonly modelled using zero-inflated Poisson regression. This model can be viewed as a latent mixture of an "always-zero" component and a Poisson component. Hurdle models are an alternative class of two-component models that are seldom used in psychological research, but clearly separate the zero counts and the non-zero counts by using a left-truncated count model for the latter. In this tutorial we revisit both classes of models, and discuss model comparisons and the interpretation of their parameters. As illustrated with an example from relational psychology, both types of models can easily be fitted using the R package pscl.

Comparison of Statistical Models in Modeling Over-Dispersed Count Data with Excess Zeros

2019

Generalised linear models such as the Poisson and negative binomial models are routinely used to model count data. However, the assumptions of these models are violated when the data exhibit over-dispersion and zero-inflation. Over-dispersion arises as a result of excess zeros in the data. Several extensions of the negative binomial and Poisson models, such as zero-inflated and hurdle models, have been proposed for data with these characteristics. Our study focuses on identifying the best-fitting model(s) for count data in the presence of over-dispersion and excess zeros. We simulate data sets with varying proportions of zeros and varying levels of dispersion, then fit Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, Poisson hurdle and negative binomial hurdle models to the data. Model selection is based on AIC, log-likelihood, Vuong statistics and box plots. The results suggest that the negative binomial hurdle model performed well in most scenarios compared with the other models and is therefore the best-fitting model for overdispersed count data with excess zeros.
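A much-reduced version of the simulate, fit and compare workflow described in this abstract might look like the sketch below. It covers only two of the six models (Poisson and zero-inflated Poisson, both without covariates), fits the ZIP by a simple EM iteration, and compares them by AIC. The sampler, the EM routine, the seed and the simulation settings are illustrative assumptions, not the study's actual design.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def poisson_loglik(ys, lam):
    return sum(-lam + y * math.log(lam) - math.lgamma(y + 1) for y in ys)

def zip_loglik(ys, p, lam):
    total = 0.0
    for y in ys:
        if y == 0:
            total += math.log(p + (1 - p) * math.exp(-lam))
        else:
            total += math.log(1 - p) - lam + y * math.log(lam) - math.lgamma(y + 1)
    return total

def fit_zip_em(ys, iters=200):
    """Intercept-only ZIP fit by EM; returns (p_hat, lam_hat)."""
    n, total = len(ys), sum(ys)
    n_zero = sum(1 for y in ys if y == 0)
    p, lam = 0.5, max(total / n, 1e-6)
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero
        w = p / (p + (1 - p) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean
        p = n_zero * w / n
        lam = total / (n - n_zero * w)
    return p, lam

# Simulate zero-inflated counts (hypothetical settings: 40% structural zeros, mean 3).
rng = random.Random(1)
ys = [0 if rng.random() < 0.4 else rpois(3.0, rng) for _ in range(500)]

lam_pois = sum(ys) / len(ys)        # Poisson maximum likelihood estimate
p_hat, lam_hat = fit_zip_em(ys)

aic_pois = 2 * 1 - 2 * poisson_loglik(ys, lam_pois)
aic_zip = 2 * 2 - 2 * zip_loglik(ys, p_hat, lam_hat)
```

On data generated with this much zero inflation, the ZIP model should attain the lower AIC. Extending the comparison to negative binomial and hurdle variants, or to Vuong tests, would require additional likelihoods not shown here.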

A New Regression Model for the Analysis of Overdispersed and Zero-Modified Count Data

Entropy, 2021

Count datasets are traditionally analyzed using the ordinary Poisson distribution. However, this model has limited applicability, as it can be too restrictive to handle specific data structures. In such cases, the need arises for alternative models that accommodate, for example, overdispersion and zero modification (inflation/deflation at the frequency of zeros). In practical terms, these are the most prevalent structures ruling the nature of discrete phenomena nowadays. Hence, this paper’s primary goal was to jointly address these issues by deriving a fixed-effects regression model based on the hurdle version of the Poisson–Sujatha distribution. In this framework, the zero modification is incorporated by considering that a binary probability model determines which outcomes are zero-valued, and a zero-truncated process is responsible for generating positive observations. Posterior inferences for the model parameters were obtained from a fully Bayesian approach ba...