Analysis Of Overdispersed Count Data By Poisson Model

Zero Inflated Poisson Regression Analysis in Maternal Death Cases on Java Island

Pattimura International Journal of Mathematics (PIJMath)

The basic regression model for count data is Poisson regression. However, the Poisson regression model is unsuitable for data with excess zeros, because excess zeros can cause overdispersion, in which the variance of the data is greater than its mean. One extension of the Poisson regression model that addresses this condition is Zero-Inflated Poisson (ZIP) regression. In the health sector, the death of pregnant women on Java island is still a rare event and forms an excess-zero data structure. However, maternal mortality cases have not been studied in depth using ZIP regression. In this article, maternal mortality cases in Java were modelled using ZIP regression to identify the variables that had a significant effect. The initial analysis indicated overdispersion due to excess zeros, with 52% zero values in the data. The ZIP regression applied in this research provides enhancements to the Poiss...
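To illustrate why excess zeros push the variance above the mean, here is a minimal sketch of the standard ZIP distribution, which mixes a point mass at zero (weight `pi`) with a Poisson(`lam`) count. The function names and the parameter values are illustrative assumptions and are not taken from the article.

```python
import math

def zip_pmf(k, pi, lam):
    """P(Y = k) for a zero-inflated Poisson: structural zero with
    probability pi, otherwise an ordinary Poisson(lam) count."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    return pi * (k == 0) + (1 - pi) * poisson

def zip_mean_var(pi, lam):
    """Closed-form mean and variance of the ZIP distribution."""
    mean = (1 - pi) * lam
    var = mean * (1 + pi * lam)  # exceeds the mean whenever pi > 0
    return mean, var

# Hypothetical parameter values, chosen only for illustration.
pi, lam = 0.4, 2.0
mean, var = zip_mean_var(pi, lam)
print(f"P(Y = 0) = {zip_pmf(0, pi, lam):.3f}")
print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

Because the variance equals the mean multiplied by `1 + pi * lam`, any positive share of structural zeros makes the data overdispersed relative to a plain Poisson model.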

Detecting overdispersion in count data: A zero-inflated Poisson regression analysis

Journal of Physics: Conference Series, 2017

This study focuses on analysing count data on butterfly communities in Jasin, Melaka. For a count dependent variable, the Poisson regression model is known as the benchmark model for regression analysis. Continuing from previous literature that used Poisson regression analysis, this study uses zero-inflated Poisson (ZIP) regression to gain greater precision in analysing the count data on butterfly communities in Jasin, Melaka. When extra zeros are present, Poisson regression should be abandoned in favour of count data models that account for the extra zeros explicitly, and one of the most popular of these is the ZIP regression model. The butterfly data, referred to in this study as the number of subjects, were collected in Jasin, Melaka and consist of 131 subject visits. The data set covers five butterfly families, which represent the five variables (types of subjects) involved in the analysis. The ZIP analysis used the SAS procedure for overdispersion to analyse the zero values, and the main purpose of continuing the previous study is to determine which model performs better when zero values exist in the observed count data. The analysis used AIC, BIC and the Vuong test at the 5% significance level to achieve these objectives. The findings indicate the presence of overdispersion when analysing zero values. The ZIP regression model is better than the Poisson regression model when zero values exist.
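The kind of AIC and BIC comparison described above can be sketched for the simplest case, an intercept-only model with no covariates. This is not the SAS procedure used in the study: the counts are hypothetical, the ZIP fit uses a basic EM iteration chosen for brevity, and the Vuong test is omitted.

```python
import math

def poisson_loglik(counts, lam):
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)

def zip_loglik(counts, pi, lam):
    total = 0.0
    for k in counts:
        if k == 0:
            total += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            total += (math.log(1 - pi) + k * math.log(lam)
                      - lam - math.lgamma(k + 1))
    return total

def fit_zip(counts, iterations=500):
    """Intercept-only ZIP fit by EM (an assumed, simple fitting method)."""
    n = len(counts)
    n_zero = sum(1 for k in counts if k == 0)
    total = sum(counts)
    pi, lam = 0.5, total / n
    for _ in range(iterations):
        # E-step: probability that an observed zero is a structural zero
        w = pi / (pi + (1 - pi) * math.exp(-lam))
        structural = n_zero * w
        # M-step: update the mixing weight and the Poisson rate
        pi = structural / n
        lam = total / (n - structural)
    return pi, lam

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * loglik

# Hypothetical zero-heavy counts (not the butterfly data).
counts = [0] * 60 + [1] * 10 + [2] * 12 + [3] * 10 + [4] * 5 + [5] * 3
n = len(counts)

lam_pois = sum(counts) / n                 # Poisson MLE is the sample mean
pi_hat, lam_hat = fit_zip(counts)

aic_poisson = aic(poisson_loglik(counts, lam_pois), 1)
aic_zip = aic(zip_loglik(counts, pi_hat, lam_hat), 2)
bic_poisson = bic(poisson_loglik(counts, lam_pois), 1, n)
bic_zip = bic(zip_loglik(counts, pi_hat, lam_hat), 2, n)

print(f"Poisson: AIC = {aic_poisson:.1f}, BIC = {bic_poisson:.1f}")
print(f"ZIP:     AIC = {aic_zip:.1f}, BIC = {bic_zip:.1f}")
```

The model with the smaller AIC and BIC is preferred. With this many zeros, the extra ZIP parameter is expected to pay for itself.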

Alternative Models in Overcoming the Problem of Overdispersion in Poisson Regression

Jurnal TAMBORA

This study aims to compare alternative models for overcoming the problem of overdispersion in Poisson regression modeling. The models compared are the Generalized Poisson, Negative Binomial, and Generalized Negative Binomial models. They are applied to modeling the number of poor people in Central Java in 2021, with unemployment, HDI, and GRDP as independent variables. The Generalized Poisson model performs better than the Negative Binomial and Generalized Negative Binomial models because of its smaller AIC and BIC values and larger R2. The simultaneous test shows that unemployment, HDI, and GRDP together significantly affect the number of poor people. In the partial tests, only unemployment and HDI affect the number of poor people in Central Java; there is not enough evidence that GRDP affects the number of poor people. Comprehensive and relevant policies are needed to reduce the number of poor people in an area.

Applying negative binomial regression analysis to overcome the overdispersion of Poisson regression model for malnutrition cases in Indonesia

Bulletin of Applied Mathematics and Mathematics Education

Indonesia is one of the developing countries that is struggling to eradicate the malnutrition problem. Malnutrition that occurs over a long period of time can lead to the death of sufferers and decrease human quality of life. This study aims to model the cases of malnutrition that occurred in the provinces of Indonesia during 2015, and to identify the main factors that cause the malnutrition problem. The variables studied consist of Malnutrition (Y), Vitamin A consumption (X1), Exclusive breastfeeding (X2), Immunization (X3), Water quality (X4), Healthcare center (X5), and Poverty level (X6). Based on the Kolmogorov-Smirnov test, the malnutrition data for the provinces of Indonesia in 2015 do not follow a Poisson distribution because of overdispersion. The presence of overdispersion in the Poisson regression model leads to inappropriate inferences. An alternative model that can accommodate this case is the negative binomial regression model. By using this model, fa...

Application of Negative Binomial Regression Analysis to Overcome the Overdispersion of Poisson Regression Model for Malnutrition Cases in Indonesia

Parameter: Journal of Statistics

Indonesia is one of the developing countries that is struggling to eradicate the malnutrition problem. Malnutrition that occurs over a long period of time can lead to the death of sufferers and decrease human quality of life. This study aims to model the cases of malnutrition that occurred in the provinces of Indonesia during 2015 and to identify the main factors that cause the malnutrition problem. The variables studied consist of Malnutrition (Y), Vitamin A consumption (X1), Exclusive breastfeeding (X2), Immunization (X3), Water quality (X4), Healthcare center (X5), and Poverty level (X6). Based on the Kolmogorov-Smirnov test, the malnutrition data for the provinces of Indonesia in 2015 do not follow a Poisson distribution because of overdispersion. The presence of overdispersion in the Poisson regression model leads to inappropriate inferences. An alternative model that accommodates this case is the negative binomial regression model. By using this model, f...

Poisson Regression Modeling Generalized in Maternal Mortality Cases in Aceh Tamiang Regency

BAREKENG: Jurnal Ilmu Matematika dan Terapan

Maternal Mortality Rate (MMR) is the number of maternal deaths due to pregnancy, childbirth, and the postpartum period, and is used as an indicator of women's health status. The number of maternal deaths in Aceh Tamiang Regency in 2021 is a discrete random variable with a Poisson distribution. The purpose of this study is to determine the Generalized Poisson regression model for MMR in Aceh Tamiang Regency in 2021 and the factors that affect MMR in Aceh Tamiang Regency in 2021. The research data were obtained from the Aceh Tamiang District Health Office. This is a quantitative study using the Generalized Poisson Regression method. The data used are maternal mortality rates and data on factors affecting MMR in Aceh Tamiang Regency in 2021. The influencing factors are the percentage of K1 visits by pregnant women, the percentage of K4 visits by pregnant women, the percentage of maternity assistance by health workers, TT immunization of pregnant women, pregnant...

Poisson Regression Models for Count Data: Use in the Number of Deaths in the Santo Angelo (Brazil)

When speaking about data, one presupposes that it is of good quality; otherwise the accuracy of the information is affected, which leads to false interpretations. In health statistics, data are obtained through surveys presented in their simplest form, by taking advantage of existing records, by making inquiries, or by means of experiments. The rational organization of the data makes it possible to characterize priority issues and thus to establish health programs. To analyze mortality data it is necessary to consider the mortality rate of particular age groups, in order to find which groups account for most deaths. The analysis of the data is followed by the formulation of Poisson regression models, where each age group is represented by a count over time. The Poisson regression model is a specific type of Generalized Linear Model (GLM) and is non-linear. According to [1], its main features are: a) it provides, in general, a satisfactory description of experimental data whose variance is proportional to the mean; b) it can be deduced theoretically from first principles with a minimum of restrictions; c) if events occur independently and randomly in time with a constant average rate of occurrence, the model determines the number of events in a specified time. The analysis of the data showed that the age group from 70 to 79 years old has the highest incidence of deaths, with 21.1%, followed by the 60 to 69 age group, with a mortality rate of 20%. These figures were recorded for the period from January 2000 to December 2004. The death rate averaged 52.27 with a variance of 102.43 in the city of Santo Angelo (Brazil). The data therefore showed overdispersion, that is, a variance greater than the mean. As a result it was necessary to remove the overdispersion in order to find the appropriate model. With the model found, some short-term forecasts were made.
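The overdispersion reported in this abstract can be checked with a one-line ratio. The sketch below assumes that the reported 52.27 is the sample mean of the death counts; under a Poisson model the variance-to-mean ratio would be close to 1.

```python
# Figures quoted in the abstract; treating 52.27 as the sample mean
# is an assumption made for this illustration.
mean_deaths = 52.27
variance_deaths = 102.43

# A Poisson model implies variance == mean, so a ratio well above 1
# signals overdispersion.
dispersion_ratio = variance_deaths / mean_deaths
print(f"variance / mean = {dispersion_ratio:.2f}")
```

The ratio comes out at roughly 1.96, so the variance is nearly twice what a Poisson model would imply.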

Analyzing Overdispersed Antenatal Care Count Data in Bangladesh: Mixed Poisson Regression with Individual-Level Random Effects

Austrian Journal of Statistics, 2021

Poisson regression (PR), with its restrictive equidispersion property, is commonly used as the base model for analyzing count data. However, overdispersed count data are very common in health sciences. In such cases, PR produces misleading inferences and hence gives incorrect interpretations of the results. Mixed Poisson regression with individual-level random effects (MPR_ILRE) is a further improvement for analyzing such data. We compare MPR_ILRE with PR, quasi-Poisson regression (Q_PR) and negative binomial regression (NBR) for modelling overdispersed antenatal care (ANC) count data extracted from the latest Bangladesh Demographic and Health Survey (BDHS) 2014. MPR_ILRE is found to be the best choice because it has the minimum Akaike information criterion (AIC) value and also models the overdispersion in the data very well. Study findings reveal that on average, women attended less than three ANC visits and only 6.5% women received the World Health Organization (...

Comparison between Poisson, Quasi-Poisson, and negative binomial regression in analyzing under-five children malnutrition cases in East Java

INTERNATIONAL CONFERENCE ON STATISTICS AND DATA SCIENCE 2021

The most straightforward regression for count data is Poisson regression. A problem often encountered in Poisson regression is overdispersion. Alternative regressions that can be used for overdispersed count data include quasi-Poisson and negative binomial regression. This study identifies the most suitable regression for modelling the number of malnutrition cases among children under five in East Java, which is overdispersed count data. The data were obtained from the 2018 East Java Health Profile Book. Poisson, quasi-Poisson, and negative binomial regression are compared based on a prediction plot, a mean-variance plot, and a comparison plot of observation weights in the IWLS algorithm. The comparison shows that quasi-Poisson regression is more suitable for modelling the number of malnutrition cases among children under five in East Java. Hypothesis testing at the 10% significance level shows that the percentage of children under five who receive exclusive breastfeeding, the percentage of children under five who receive health services at least 8 times, and the percentage of the population with proper sanitation access are factors that significantly affect the number of malnutrition cases among children under five in East Java. Based on the three significant factors, 37 regions in East Java were then grouped into three clusters with their own characteristics.
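The three models compared above differ mainly in how the variance is assumed to grow with the mean, which is what a mean-variance plot examines. The sketch below writes out the standard variance functions and the Pearson estimate of the quasi-Poisson dispersion for an intercept-only fit. The counts, the function names, and the parameter names `phi` and `theta` are illustrative assumptions, not values from the study.

```python
def variance_poisson(mu):
    """Poisson: variance equals the mean."""
    return mu

def variance_quasi_poisson(mu, phi):
    """Quasi-Poisson: variance is a constant multiple phi of the mean."""
    return phi * mu

def variance_negative_binomial(mu, theta):
    """Negative binomial (NB2): variance grows quadratically in the mean."""
    return mu + mu**2 / theta

def pearson_dispersion(counts, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    return chi2 / (len(counts) - n_params)

# Hypothetical regional case counts (not the East Java data).
counts = [2, 15, 7, 30, 1, 22, 4, 40, 9, 12]
mu_hat = sum(counts) / len(counts)          # intercept-only fitted mean
phi_hat = pearson_dispersion(counts, [mu_hat] * len(counts), n_params=1)

print(f"fitted mean = {mu_hat:.1f}, estimated dispersion = {phi_hat:.2f}")
print(f"quasi-Poisson variance at the mean = "
      f"{variance_quasi_poisson(mu_hat, phi_hat):.1f}")
```

An estimated dispersion well above 1 indicates that a plain Poisson model understates the variance. Whether the linear (quasi-Poisson) or quadratic (negative binomial) form fits better is what the study's mean-variance comparison is meant to decide.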