Detecting Overdispersion in Count Data: Comparison of Tests

Alam, Nadia

URI: http://dspace.ewubd.edu/handle/2525/1939

Date: 2015-08-13

Abstract:

When data appear more dispersed than expected under a reference model, the situation is termed overdispersion. For modelling a count variable in terms of independent predictor variables, the simplest and theoretically best-established reference model is the Poisson regression model. Under the standard Poisson regression model, the variance equals the mean and there is no extra dispersion parameter. In practice, however, the variance estimated from data often exceeds the mean, and the data are then considered overdispersed. Two common approaches to the overdispersion problem are (i) fitting a more general parametric distribution and (ii) assuming a different form of mean-variance relationship without fully specifying the distribution. Both approaches include overdispersion parameters to be estimated from the data. When there is no overdispersion, however, the Poisson regression model is preferred for its simplicity, interpretability and theoretical basis. A robust test for the significance of the overdispersion parameter is therefore important before turning to an alternative to Poisson regression. In this work, we investigate tests for detecting overdispersion when a Poisson model is used for count data. The tests discussed are derived from the partial score and are applicable against negative binomial or, more generally, mixed Poisson alternatives. These tests do not require fitting alternative models that incorporate overdispersion in order to check for its absence; only the Poisson model needs to be fitted. Four test statistics are illustrated, together with their distributional approximations for computing significance levels. The test statistics are analyzed and compared on the basis of the assumptions made in deriving them, their limiting distributions and their applicability for different numbers of observations in the sample.
A simulation study was carried out to check the adequacy of the distributional assumption for the three statistics that approximately follow a normal distribution. The study involved generating samples of the statistics and tabulating the proportion of times each exceeded the standard normal upper 20%, 10%, 5% and 1% points. The results show that the normal approximation for one of the statistics is good for large samples but less accurate for small ones. Another of the statistics was found to follow the standard normal distribution almost exactly even for small samples. Some comparisons and recommendations relating to the applicability and assumptions of the statistics are also presented.
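
As a rough illustration of the kind of simulation check described above, the following Python sketch generates Poisson samples, computes a score-type statistic from the Poisson fit alone, and tabulates how often it exceeds the standard normal upper 20%, 10%, 5% and 1% points. It is not taken from the thesis. The statistic used, sum((y - mu)^2 - y) / sqrt(2 * sum(mu^2)), is one well-known score-type statistic against negative binomial alternatives and is assumed here purely as an example; the abstract does not say which four statistics were studied. The intercept-only model (where the fitted mean is simply the sample mean), the Poisson mean of 5, the sample sizes and all function names are likewise hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def score_statistic(y, mu):
    """Assumed example statistic: sum((y - mu)^2 - y) / sqrt(2 * sum(mu^2))."""
    num = sum((yi - mi) ** 2 - yi for yi, mi in zip(y, mu))
    den = math.sqrt(2.0 * sum(mi ** 2 for mi in mu))
    return num / den


def exceedance_rates(n, lam, reps, levels=(0.20, 0.10, 0.05, 0.01), seed=0):
    """Proportion of simulated statistics above each standard normal upper point.

    Data are generated under the null hypothesis (pure Poisson, no overdispersion),
    so rates close to the nominal levels suggest the normal approximation is adequate.
    """
    rng = random.Random(seed)
    cutoffs = {a: NormalDist().inv_cdf(1.0 - a) for a in levels}
    hits = {a: 0 for a in levels}
    used = 0
    for _ in range(reps):
        y = [poisson_draw(rng, lam) for _ in range(n)]
        mu_hat = sum(y) / n  # Poisson MLE for an intercept-only model
        if mu_hat == 0:
            continue  # statistic undefined when every count is zero
        t = score_statistic(y, [mu_hat] * n)
        used += 1
        for a, c in cutoffs.items():
            if t > c:
                hits[a] += 1
    return {a: (hits[a] / used if used else 0.0) for a in levels}


if __name__ == "__main__":
    for n in (10, 100):  # hypothetical small and large sample sizes
        rates = exceedance_rates(n=n, lam=5.0, reps=2000, seed=1)
        print(n, {a: round(r, 3) for a, r in rates.items()})
```

Only the Poisson fit enters the computation, which mirrors the point made in the abstract that no overdispersed alternative model has to be fitted. With covariates, the constant fitted mean would be replaced by the fitted values from a Poisson regression.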

Description:

This thesis was submitted in partial fulfillment of the requirements for the degree of Master of Science in Applied Statistics of East West University, Dhaka, Bangladesh.
