Many thanks to those of you who replied with answers to my question given
below:

"I am analysing a binary outcome from a cohort study, adjusting for
continuous and categorical covariates including two stratification
variables. Given the prospective nature of the study, I would prefer to
estimate relative risks, rather than using logistic regression to obtain
odds ratios, and hence have tried Poisson and binomial (log link) modeling.
I get very similar results with both, but both show underdispersion
(defined as deviance/df) and non-normally distributed deviance residuals.

My questions are

a) whether the Poisson model can be used for a binary outcome (I have seen
this done in the past) or whether binomial modeling is strictly more correct

b) whether evidence of underdispersion and non-normally distributed
deviance residuals is indicative of a poor fit in these two models, or
whether it is simply an artifact of the binary outcome."

---------------------------------------------------------------------------

Here is a summary of the responses I received:

"I'm not an expert, but I would have said that a binomial
 outcome variable would not be grounds to reject the
 assumption of an underlying Poisson distribution
 - you'll always get back to binomial if you reduce
 the time period enough."

Jon Heron, PhD
Research Statistician
Avon Longitudinal Study of Parents and Children

"This is discussed regularly. A paper was published earlier this year
in Statistics in Medicine (I think) indicating that you can use Poisson
regression in this situation provided you use the robust variance
estimator, because the error term is not truly Poisson.

My own experience is that using a GLM with a log link and binomial
error term works well in SAS but does tend to come unstuck in
Stata. In most situations the binomial model and the Poisson model
(with the robust variance estimator) give very similar answers."

Patrick McElduff
Lecturer
The Medical School
The University of Manchester

"There is an article by G Zou in the American Journal of Epidemiology
(Vol. 159, No. 7, 2004) about a robust variance estimator for Poisson
regression for binomial data."

Angelika Schaffrath Rosario, Statistician
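
To make the robust variance idea in the two replies above concrete, here is
a minimal sketch for the simplest case: one binary exposure and no other
covariates. The function name and the counts are hypothetical, chosen only
for illustration. It builds the Poisson information matrix and the
"sandwich" correction by hand and compares the two variances of the log
relative risk.

```python
def log_rr_variances(a, n1, c, n0):
    """Return (rr, model_var, robust_var) for a single binary exposure.

    a / n1: events / total among the exposed
    c / n0: events / total among the unexposed
    Hypothetical helper: Poisson model with log link, design x = (1, exposed).
    """
    p1, p0 = a / n1, c / n0          # fitted risks in each group
    rr = p1 / p0                     # exp(beta_1) from the log-link model

    # "Bread": Poisson information A = sum(mu_i * x_i x_i^T)
    A = [[a + c, a], [a, a]]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A_inv = [[A[1][1] / det, -A[0][1] / det],
             [-A[1][0] / det, A[0][0] / det]]

    # "Meat": B = sum((y_i - mu_i)^2 * x_i x_i^T) over individual 0/1 outcomes
    s1 = a * (1 - p1) ** 2 + (n1 - a) * p1 ** 2   # exposed
    s0 = c * (1 - p0) ** 2 + (n0 - c) * p0 ** 2   # unexposed
    B = [[s1 + s0, s1], [s1, s1]]

    model_var = A_inv[1][1]          # naive Poisson variance of log RR
    row = A_inv[1]
    robust_var = sum(row[i] * B[i][j] * row[j]
                     for i in range(2) for j in range(2))
    return rr, model_var, robust_var


# Illustrative counts only: 20/200 events in exposed, 10/200 in unexposed.
rr, model_var, robust_var = log_rr_variances(20, 200, 10, 200)
print(rr, round(model_var, 4), round(robust_var, 4))  # 2.0 0.15 0.14
```

In this two-group case the naive Poisson variance is 1/a + 1/c, while the
sandwich version works out to 1/a - 1/n1 + 1/c - 1/n0, the usual variance
of a log relative risk. The naive version is therefore too large for binary
data, which is why the robust estimator is recommended.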

"Poisson and binomial regression are equivalent in analysing the dataset
you are referring to, as long as you include the right terms in the Poisson
model.

Underdispersion/overdispersion is a common source of problems in Poisson
regression. There are many ways to tackle this problem, one of which is
to adjust the standard errors of the estimated coefficients by the
appropriate factor. You can find more information about this problem in
one of the many standard textbooks on the topic. I would recommend
"Regression Analysis of Count Data" by Cameron"

Dr D N Lambrou
Statistician
Athens-Greece
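
As a rough sketch of the "adjust the standard errors by the appropriate
factor" suggestion above: a common choice is to estimate a dispersion
factor as a fit statistic (Pearson chi-squared or deviance) divided by the
residual degrees of freedom, and multiply each standard error by its square
root. The function name and the numbers below are hypothetical and for
illustration only.

```python
import math


def scale_standard_errors(std_errors, fit_statistic, df_resid):
    """Rescale model-based standard errors by sqrt(dispersion).

    fit_statistic: Pearson chi-squared or deviance from the fitted model
    df_resid:      residual degrees of freedom
    Hypothetical helper, not taken from any particular package.
    """
    dispersion = fit_statistic / df_resid
    factor = math.sqrt(dispersion)
    return dispersion, [se * factor for se in std_errors]


# Illustrative values only: a statistic of 640 on 1000 df (underdispersion).
dispersion, scaled = scale_standard_errors([0.5, 0.25], 640.0, 1000)
print(dispersion, scaled)
```

With a dispersion below 1 the standard errors shrink. Whether the deviance
is a trustworthy dispersion estimate for a purely binary outcome is a
separate question, which other replies in this summary address.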

"With reference to the second question: in Poisson regression when the mean
is small, you can get apparent underdispersion. See:

Wood GR. Assessing goodness of fit for Poisson and negative binomial
models with low mean. Communications in Statistics - Theory and Methods
2002;31:1977-2001."

David Scott
Department of Statistics, Tamaki Campus
The University of Auckland

"1  If the mean probability is low, Poisson and binomial distributions look
very similar.

2  If you fit a Poisson model conditional on marginal means, then it is
equivalent to a binomial-logistic model.  The likelihood equations are the
same.  Many textbooks will show the theory on this.

3  If you are analysing a binary outcome, then the residual deviance will
NOT give you a valid test of under- or over-dispersion.  David Williams'
equations become undefined for n=1.

4  Similarly, since the observed values can only take the values 0 or 1, the
individual residuals will always look odd, and should certainly not be
expected to form a normal distribution.

For binary outcomes, 3 and 4 will be true regardless of what model you fit.

In your case, I wouldn't be too worried about the difference between
relative risks and odds ratios, particularly if the outcome is relatively
rare, but you can always convert between the two measures."

Brian Miller, PhD, CStat
Director of Research Operations
IOM
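
On the last point in the reply above (converting between the two
measures), here is a minimal sketch. It uses the algebraic identity
RR = OR / (1 - p0 + p0 * OR), where p0 is the risk in the reference group.
The function names and numbers are hypothetical. The identity is exact only
for a single unadjusted comparison; applying it to covariate-adjusted odds
ratios is an approximation.

```python
def or_to_rr(odds_ratio, p0):
    """Convert an odds ratio to a relative risk given baseline risk p0."""
    return odds_ratio / (1 - p0 + p0 * odds_ratio)


def rr_to_or(relative_risk, p0):
    """Convert a relative risk to an odds ratio given baseline risk p0."""
    p1 = relative_risk * p0
    if not 0 < p1 < 1:
        raise ValueError("relative_risk * p0 must lie strictly between 0 and 1")
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# Rare outcome: baseline risk 5%, RR of 2 gives an OR close to 2.
print(round(rr_to_or(2.0, 0.05), 3))   # 2.111
# Common outcome: baseline risk 40%, the same RR gives a much larger OR.
print(round(rr_to_or(2.0, 0.40), 3))   # 6.0
```

The two measures agree closely when the outcome is rare and diverge as the
baseline risk grows, which matches the remark about rare outcomes.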

"I've come across the sort of problems you're facing and written a few
things about it.
1) It's OK to use the Poisson even though you cannot get a count greater
than 1. If you write out the deviances you'll see why. However, binomial
with log link should be fine, although it is not a canonical link and so
fitting is trickier.
2) The underdispersion is common, but is not often remarked upon. I think it
occurs when the number of events is low, so that the deviance is not well
approximated by a chi-squared distribution and its expected value is not
the degrees of freedom. I suggest robust SEs or bootstrap estimates for SEs."

Mike Campbell
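
Following the suggestion above to write out the deviances, here is a small
sketch for the simplest case, an intercept-only model fitted to 0/1 data.
The sample size and event count are made up for illustration. In this case
both deviances reduce to functions of the observed event proportion p
alone: 2 * n * p * (-log p) for the Poisson model and
-2 * n * (p * log p + (1 - p) * log(1 - p)) for the binomial model.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), 0*log(0) = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2 * (log_term - (yi - mi))
    return total


def binomial_deviance(y, p):
    """Bernoulli deviance: -2 * sum(log of the fitted probability of y)."""
    return sum(-2 * math.log(pi if yi == 1 else 1 - pi)
               for yi, pi in zip(y, p))


# Illustrative data only: 50 events among 1000 subjects (5% risk).
n, events = 1000, 50
y = [1] * events + [0] * (n - events)
p_hat = events / n
fitted = [p_hat] * n          # intercept-only fit: everyone gets p_hat
df = n - 1

print(poisson_deviance(y, fitted) / df)    # roughly 0.30
print(binomial_deviance(y, fitted) / df)   # roughly 0.40
```

Both ratios come out well below 1 here purely because the event proportion
is low, not because of anything about model fit, which is consistent with
the replies describing the apparent underdispersion as an artifact of the
binary outcome.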

"I was interested in your query (although I feel a bit inexperienced to
offer my own opinions), which are a) I don't think so and b) if you have
underdispersion, try altering the scale parameter (if you're using SAS,
the dscale or pscale options)"

Dave Jackson

---------------------------------------------------------------------------

Mrs Susanna Dodd
Centre for Medical Statistics and Health Evaluation
Shelley's Cottage
Brownlow Street
University of Liverpool
Liverpool
L69 3GS
