Dear Allstaters,

A colleague and I are trying to model motor vehicle-related injury event
rates in a cohort study using Poisson regression (PROC GENMOD in SAS).
When individual data from all 10,525 participants (145 events) are used,
the models appear
severely underdispersed, e.g. deviance/degrees of freedom = 0.14.  Despite
this, the effect estimates and confidence intervals produced by this method
are very similar to those produced by Cox regression.  Rescaling using the
DSCALE option in GENMOD produces extremely narrow - and probably
implausible - confidence intervals.  The degrees of freedom (about 10,000)
and plots of the deviance residuals suggest that the model is trying to fit
all of the thousands of zero counts for participants who didn't have events.
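
As a rough check on whether a ratio this low could arise from sparse data
alone, the sketch below computes the deviance of an intercept-only Poisson
model for 10,525 individual records with 145 events. It assumes (none of
this is stated above) that no participant has more than one event, that
follow-up is equal for everyone, and that no covariates are fitted. Python
is used purely for illustration in place of PROC GENMOD.

```python
import math


def poisson_deviance(counts, fitted):
    """Poisson deviance: 2 * sum(y*log(y/mu) - (y - mu)).

    The term y*log(y/mu) is taken as 0 when y == 0.
    """
    total = 0.0
    for y, mu in zip(counts, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


# Figures from the post; everything else here is an assumption.
n_participants = 10525
n_events = 145

# ASSUMPTION: each event belongs to a different participant (counts are 0 or 1).
counts = [1] * n_events + [0] * (n_participants - n_events)

# ASSUMPTION: intercept-only model with equal follow-up, so the fitted
# mean is the same for everyone and equals the overall event proportion.
mu_hat = n_events / n_participants
fitted = [mu_hat] * n_participants

deviance = poisson_deviance(counts, fitted)
df = n_participants - 1  # one parameter (the intercept) estimated
ratio = deviance / df

print(f"deviance = {deviance:.1f}, df = {df}, deviance/df = {ratio:.3f}")
```

Under these assumptions the ratio comes out at roughly 0.12, close to the
0.14 seen in the real models. This suggests that, with individual records
that are almost all zeros, a small deviance/df may reflect the sparseness
of the data rather than genuine underdispersion.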

To eliminate the excessive number of zero counts I have tried collapsing
the dataset into the smallest possible number of categories where each
category has a unique combination of covariates (using PROC SUMMARY).
Poisson regression using these data shows moderate overdispersion, e.g.
deviance/degrees of freedom = 1.5.  However, this approach doesn't
completely solve the problem because it can only be applied to models that
have no more than two or three covariates.  The reason for this is that the
number of categories increases multiplicatively as each new covariate is
added, meaning the problem with multiple zero counts soon reappears.
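
The collapsing step and the multiplicative growth in categories can be
sketched as follows. This is only an illustration in Python, not the PROC
SUMMARY code that was actually run: the covariate names, the toy records,
the person_years field and the numbers of levels are all hypothetical.

```python
from collections import defaultdict


def collapse(records, covariates):
    """Sum events and person-years within each unique covariate combination."""
    cells = defaultdict(lambda: {"events": 0, "person_years": 0.0})
    for rec in records:
        key = tuple(rec[c] for c in covariates)
        cells[key]["events"] += rec["events"]
        cells[key]["person_years"] += rec["person_years"]
    return dict(cells)


def max_cells(level_counts):
    """Maximum number of cells after adding each covariate in turn."""
    growth, total = [], 1
    for n_levels in level_counts:
        total *= n_levels
        growth.append(total)
    return growth


# HYPOTHETICAL individual records (one row per participant).
records = [
    {"sex": "F", "age_group": "<40", "person_years": 5.0, "events": 0},
    {"sex": "F", "age_group": "<40", "person_years": 4.0, "events": 1},
    {"sex": "M", "age_group": "<40", "person_years": 6.0, "events": 0},
    {"sex": "M", "age_group": "40+", "person_years": 3.0, "events": 1},
    {"sex": "M", "age_group": "40+", "person_years": 2.0, "events": 0},
]

cells = collapse(records, ["sex", "age_group"])
for key, totals in sorted(cells.items()):
    print(key, totals)

# HYPOTHETICAL numbers of levels for five categorical covariates.
growth = max_cells([2, 4, 4, 3, 3])
print(growth)
```

With these made-up level counts the maximum number of cells grows from 2
to 288 after five covariates. Since there are only 145 events, at most 145
cells could then be non-zero, so a large share of the occupied cells would
again be zero counts.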

Could the apparent underdispersion of the Poisson models using individual
participant data merely be artefactual?   If so, is there a way around the
problem that would permit inclusion of more than two or three covariates in
a model?

Thanks in advance for your help.

Derrick Bennett
Dr Derrick Bennett,
Clinical Trials Research Unit,
University of Auckland,
Private Bag 92 019,
New Zealand.

Ph : 64 9 373 7599 x4724
fax: 64 9 373 1710