ALLSTAT Archives

allstat@JISCMAIL.AC.UK


Subject: Summary - modelling overdispersed poisson data
From: russell ecob2 <[log in to unmask]>
Reply-To: russell ecob2 <[log in to unmask]>
Date: Mon, 14 Jun 2004 11:06:53 +0100
Content-Type: text/plain

Recently I posted a request regarding the modelling of an overdispersed
Poisson distribution (delinquency example). Here are my question and the
answers I received. I hope these may be useful to others. Many thanks to
John Hinde, Peter Flom, Bernd Genser, Michael Epelbaum (from s+ list) and
Peter Lane.


Dear allstat - I wish to model a variable which is the count of a number of
items (number of types of self-reported delinquency in adolescents in the
last 12 months). A further variable takes each item and multiplies it by
the number of times engaged in before summing. These variables look like
overdispersed Poisson variables (the variance/mean ratios are 3.4 and 28.0
in the two cases, and there is some evidence of an excess at zero, i.e. no
delinquency, especially in the second case). Previous analyses have grouped
the counts into an ordinal variable and modelled accordingly, but I am
reluctant to take this course as I feel I would be losing data. I am also
reluctant to consider transforming this towards a normal distribution. Can
anyone suggest a useful approach to analysis? I will summarise suggestions
to the list.
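
The two diagnostics mentioned in the question, the variance/mean ratio and
the excess of zeros, can be checked in a few lines. The sketch below is
illustrative only: the counts are made up (they are not the delinquency
data) and `dispersion_summary` is a hypothetical helper name. Under a
Poisson distribution the ratio would be close to 1 and the share of zeros
close to exp(-mean).

```python
import math
from statistics import mean, variance

def dispersion_summary(counts):
    """Return (variance/mean ratio, observed zero share, Poisson-expected zero share)."""
    m = mean(counts)
    ratio = variance(counts) / m              # sample variance, n - 1 denominator
    observed_zeros = counts.count(0) / len(counts)
    expected_zeros = math.exp(-m)             # P(Y = 0) for a Poisson with the same mean
    return ratio, observed_zeros, expected_zeros

# Made-up counts for illustration only.
example = [0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 12]
ratio, obs0, exp0 = dispersion_summary(example)
print(f"variance/mean = {ratio:.2f}, zeros observed = {obs0:.2f}, "
      f"zeros expected under Poisson = {exp0:.2f}")
```

A ratio well above 1 together with more zeros than exp(-mean) is the
pattern described in the question.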




Hi Russell

Sounds an interesting problem.

If you think about a complete list of all possible
types of self-reported delinquency (j), then what you would like
for each individual (i) is the number of times engaged in
each activity Xij.
Of course, this is not recorded. What you have is
 Ni = the number of non-zero Xij

and
 Ti = sum over j of Xij

Now if Xij were Poisson everything would be OK and
Ti would be Poisson.
However, this is not likely to be the case, Xij
will probably be zero-inflated and perhaps also
overdispersed - so perhaps zero-inflated negative
binomial. They may also be correlated although this
might be modelled by a suitable random effect structure.
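
The set-up above can be explored by simulation. The following is a minimal
sketch, assuming each Xij is zero-inflated negative binomial and that a
person-level gamma random effect induces the correlation between items.
All parameter values, the function names and the choice of a gamma random
effect are assumptions made for illustration, not part of the suggestion
itself.

```python
import math
import random
from statistics import mean, variance

def poisson_draw(rng, lam):
    """Draw from Poisson(lam) with Knuth's multiplication method (adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def zinb_draw(rng, p_zero, mu, size):
    """Zero-inflated negative binomial draw: a structural zero with probability p_zero,
    otherwise a Poisson count whose mean is gamma distributed (mean mu, shape size)."""
    if rng.random() < p_zero:
        return 0
    return poisson_draw(rng, rng.gammavariate(size, mu / size))

def simulate(rng, n_people, n_types, p_zero, mu, size, frailty_shape):
    """Return lists (N, T): N_i = number of non-zero X_ij, T_i = sum over j of X_ij."""
    N, T = [], []
    for _ in range(n_people):
        # Person-level random effect with mean 1; it correlates the X_ij of one person.
        u = rng.gammavariate(frailty_shape, 1.0 / frailty_shape)
        x = [zinb_draw(rng, p_zero, mu * u, size) for _ in range(n_types)]
        N.append(sum(1 for v in x if v > 0))
        T.append(sum(x))
    return N, T

rng = random.Random(2004)
# Hypothetical sizes and parameters, chosen only to produce visible overdispersion.
N, T = simulate(rng, n_people=2000, n_types=10, p_zero=0.6,
                mu=1.5, size=0.8, frailty_shape=0.5)
print("variance/mean of N:", round(variance(N) / mean(N), 2))
print("variance/mean of T:", round(variance(T) / mean(T), 2))
```

If the items were independent with a common probability of a non-zero
response, Ni would be binomial and its variance/mean ratio would fall
below 1, so some person-level heterogeneity of this kind is needed before
Ni can look overdispersed.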

This would lead to some (complex?) compound model,
but it may be possible to make progress using the EM
algorithm.

More simply, for a marginal analysis, one could set up
some appropriate mean-variance relationship for Ti to
reflect the above process.

Individual level covariates could then be incorporated in
the model for both Ni and Ti, with perhaps some joint model
to link some parameters.

Well just a few (perhaps not so simple) ideas.

John Hinde






I would suggest exploring Negative Binomial regression, and possibly
Zero Inflated Negative Binomial Regression.

These are available in SAS and R (and maybe other packages). If you
use SAS or R, let me know and I can help.
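
To make the two distributions concrete, here is a minimal sketch of the
negative binomial and zero-inflated negative binomial probabilities and an
intercept-only log-likelihood. It is not a regression fit: a regression
would let the mean depend on covariates, typically through a log link.
The function names, the (mean, size) parameterisation and the made-up
counts are assumptions for illustration.

```python
import math

def nb_pmf(y, mu, size):
    """Negative binomial probability of count y, with mean mu and variance mu + mu**2 / size."""
    log_p = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
             + size * math.log(size / (size + mu))
             + y * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, size, pi0):
    """Zero-inflated version: an extra point mass pi0 at zero."""
    p = (1.0 - pi0) * nb_pmf(y, mu, size)
    return pi0 + p if y == 0 else p

def loglik(counts, mu, size, pi0):
    """Log-likelihood of independent counts sharing one mean (no covariates)."""
    return sum(math.log(zinb_pmf(y, mu, size, pi0)) for y in counts)

counts = [0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 12]   # made-up data
mu_hat = sum(counts) / len(counts)
near_poisson = loglik(counts, mu_hat, size=1000.0, pi0=0.0)  # large size: close to Poisson
neg_binomial = loglik(counts, mu_hat, size=0.5, pi0=0.0)
print(f"log-likelihood, near-Poisson: {near_poisson:.2f}; "
      f"negative binomial: {neg_binomial:.2f}")
```

For counts like these the negative binomial log-likelihood is clearly
higher than the near-Poisson one; in practice size and pi0 would be
estimated rather than fixed by hand.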

For a non-technical account, see

Long, J. S. (1997). Regression models for categorical and limited
dependent variables. Sage.

For something more technical, see

Cameron, A. C., & Trivedi, P. K. (1998). Regression analysis of count
data. Cambridge University Press.

I can also recommend some specific articles, if you like.

HTH

Peter
 Articles

Greene, W. H. (1994). Accounting for excess zeros and sample selection in
negative binomial regression models. Working paper EC 94-10, Stern School
of Business, New York University.


King, G. (1989). Variance specification in event count models: From
restrictive assumptions to a generalized estimator. American Journal of
Political Science, 33, 762-784.

Lambert, D. (1992). Zero-inflated Poisson regression, with an application
to defects in manufacturing. Technometrics, 34, 1-14.


Panel on non-standard mixtures of distributions. (1989). Statistical models
and auditing. Statistical Science, 4, 2-23.

Ridout, M., Demétrio, C. G. B., & Hinde, J. (1998). Models for count data
with many zeros. Proceedings of the XIXth International Biometric
Conference, Cape Town.

van den Broek, J. (1995). A score test for zero inflation in a Poisson
distribution. Biometrics, 51, 738-743.

Zorn, C. J. W. (1998). An analytic and empirical examination of
zero-inflated and hurdle Poisson specifications. Sociological Methods and
Research, 26, 368-400.

R program (but spend some time learning R first)

Lindsey, J. K. (undated) Statistical libraries [Web Page]. URL
http://popgen0146uns50.unimaas.nl/~jlindsey/rcode.html.



Learning R itself is a bit tricky, but well worth the effort.


HTH

Peter

Dear Russell,
you should model overdispersed Poisson data using a robust variance
estimation approach (such as GEE) or a random-effects Poisson regression.
In Stata you can fit such models using the procedures xtgee (with family
Poisson) or xtpoisson. By the way, whenever you or your colleagues need
statistical help please contact BGStats.
Regards
Bernd
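
The effect of a robust variance estimate can be seen in the simplest
case. The sketch below assumes an intercept-only Poisson log-linear
model, for which the estimate of the log mean is log(ybar), and compares
the model-based standard error with the sandwich (robust) one. It is not
a description of what xtgee computes internally; the function name and
the made-up counts are illustrative assumptions.

```python
import math

def intercept_only_poisson_se(counts):
    """Return (log-mean estimate, model-based SE, sandwich SE) for an
    intercept-only Poisson log-linear model."""
    n = len(counts)
    ybar = sum(counts) / n
    # Model-based: the Fisher information for the intercept is n * ybar.
    model_se = math.sqrt(1.0 / (n * ybar))
    # Sandwich: sum of squared score contributions divided by information squared.
    robust_se = math.sqrt(sum((y - ybar) ** 2 for y in counts)) / (n * ybar)
    return math.log(ybar), model_se, robust_se

counts = [0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 12]   # made-up data
estimate, model_se, robust_se = intercept_only_poisson_se(counts)
print(f"log mean = {estimate:.3f}, model SE = {model_se:.3f}, robust SE = {robust_se:.3f}")
```

The squared ratio of the two standard errors is approximately the
variance/mean ratio of the counts, so with overdispersed data the
model-based standard error is too small.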


Russell:

I did not find the zero-inflated procedures in the MASS library of S-Plus as
useful as the ones in Stata. I have made ample use of the ones in Stata, but
have now moved beyond them as well. There are also zero-inflated procedures
in LIMDEP, but I did not find those as straightforward, simple, and useful
as the ones in Stata.

There is a paper by Land, McCall, and Nagin (Sociological Methods & Research
24(4), May 1996, 387-442) on such methods with applications to criminal
careers data that you may find useful.

Sincerely,
Michael.

Dear Russell

I suggest you try negative binomial regression. An alternative is
overdispersed Poisson regression, but I think the negative binomial model is
likely to give a better description, and be more satisfactory from a
statistical point of view (e.g. results of standard model checking). The
main difference is that the Poisson model describes the variation in terms
of the counts from individuals all having the same mean, given equality of
any covariates, whereas the negative binomial corresponds to assuming that
the count from each individual is from a Poisson distribution with a mean
specific to that individual, with the distribution of the means over the
population being gamma.
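
The mixture description above implies a specific mean-variance
relationship. As a short worked derivation, writing mu for the population
mean and k for the gamma shape parameter (both symbols are notation
introduced here, not in the original message):

```latex
\begin{aligned}
Y \mid \lambda &\sim \mathrm{Poisson}(\lambda), \qquad
\lambda \sim \mathrm{Gamma}(\text{shape } k,\ \text{mean } \mu),\\
\mathrm{E}[Y] &= \mathrm{E}[\lambda] = \mu,\\
\mathrm{Var}(Y) &= \mathrm{E}[\mathrm{Var}(Y \mid \lambda)] + \mathrm{Var}(\mathrm{E}[Y \mid \lambda])
 = \mu + \frac{\mu^{2}}{k}.
\end{aligned}
```

The variance therefore grows quadratically with the mean, whereas an
overdispersed Poisson model with a constant dispersion factor keeps the
variance proportional to the mean.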

You can fit these models easily in good stats packages. GenStat has a
procedure called RNEGATIVEBINOMIAL, SAS allows negbin in Proc GENMOD, and I
have heard that Stata provides it. Though I can't see negbin in function glm
in S-Plus 2000, there must be a function for it somewhere, and in R as well.

Peter Lane






 *************************************
Russell Ecob
Ecob Consulting
36 Prospecthill Road
Glasgow G42 9LE
Scotland, UK

Independent Statistical Consultant;
Honorary Research Fellow, Dept of Epidemiology and Public Health, University
College, London

***************************************

+44(0)141-649-9387
www.ecob-consulting.com
[log in to unmask]
mobile: 0779-1956934
*****************************************
