With respect to Wayne Thogmartin's recent question, I apologize for not
being more clear when I said " ... convergence does not happen readily for
linear coefficients associated with covariates. It works great for
producing smoothed response variables, but not for making inference about
these linear coefficients."
Let me try to clarify. The usual hierarchical Poisson regression model
used in disease mapping treats
the observed number of cases in each geographic unit, y_i, as distributed
Poisson(E_i theta_i) for population at risk E_i and relative risk theta_i,
where log(theta_i) = log(E_i) + alpha + Xbeta + u_i + s_i for common
intercept alpha, covariate vector X and covariate linear coefficient vector
beta, along with the random effects u_i for unstructured (exchangable)
variability and s_i for spatially structured variability modeled through
the CAR model.
Vague normal priors are usually assigned to the linear coefficients "alpha"
and "beta" and gamma hyperpriors are assigned to precision terms associated
with the random effects u_i and s_i.
Potentially, we can make "Bayesian" inference about any of the stochastic
nodes in the hierarchical model stated above - that is if we can obtain a
large sample of the stationary posterior distribution for the node of
interest, which we achieve through MCMC simulation (in this case, the Gibbs
sampler). We all know it is dangerous to run only one Markov chain,
because it is difficult to assure that we have run enough iterations to
achieve "burn-in" so that subsequent iterations will be from the stationary
distibution. Several independent chains should be ran and trace plots of
the posterior simulated values for all chains should mix together in the
same parameter space, revealing white noise variation with no trends.
Now, my experience with modeling disease rates for small areas (US postal
ZIP codes in NY state) shows convergence is achieved very rapidly for the
relative risk parameter "theta_i". I further observe that simulated
posterior distributions for relative risk are very robust (don't change)
with respect to choice of priors and hyperpriors. Therefore, I feel very
confident with my posterior distributions of relative risk, which can be
used for smoothing disease rates by choosing some measure of central
tendency (along with alot of other information that can be obtained from an
empirical distribution).
I do not, however, achieve convergence for any of the linear coefficients
in the model (neither for "alpha" or any "beta"), regardless of prior
specification (vague or fairly informative) or on the initial values chosen
to seed independent Markov Chains (I run three chains). Therefore, I do
not feel that I can defend any inference about these linear coefficients
based on a posterior distribution.
My observed rapid convergence of relative risk and apparent lack of
convergence for the linear coefficients of this model occurs whether I
include only unstuctured variance (u_i) or sructured variance (s_i) or both
(the convolution model).
Perhaps some of my problem is that I work with covariates that have a weak
association with the log of relative risk, but I would still hope to
achieve convergence of the Markov Chains and show a posterior distribution
that covers zero.
My observed empirical results are actually in line with what is expected
according to Eberly and Carlin (2000, Statistics in Medicine, 19:2279-2294)
For now, I use the fully Bayesian approach for obtaining large samples of
the posterior distributions of relative risk, but do not use it to make
inference about covariates in a spatial regression model. This work is
expected to be sumitted for publication within one month.
Regards,
Glen Johnson
________________________________________________
Glen D. Johnson, PhD, MS, MA
New York State Cancer Registry
Bureau of Chronic Disease Epidemiology and Surveillance
New York State Department of Health
Corning Tower, Room 710
Empire State Plaza, Albany NY 12237 USA
Phone: 518-474-2255 Fax: 518-473-6789
email: [log in to unmask]
Wayne E
Thogmartin To: [log in to unmask]
<wthogmartin@USGS cc:
.GOV> Subject: Re: Responses and discussion for covariates, priors and model
Sent by: "(The selection
BUGS software
mailing list)"
<[log in to unmask]
.UK>
09/24/2003 09:23
AM
Please respond to
Wayne E
Thogmartin
In Ayaz' most recent post to the listserv (now 2 weeks old), he quoted
correspondence from a contributor that made me wonder what exactly the
contributor meant:
'Also, I caution you, given that I've worked alot
with these hierarchical spatial Poisson models, convergence does not
happen readily for linear coefficients associated with covariates. It
works great for producing smoothed response variables, but not for
making inference about these linear coefficients.'
I too am working with hierarchical spatial Poisson models and agree
whole-heartedly that convergence doesn't occur quickly. I typically need
to allow my models to run for 18,000 iterations before convergence occurs.
I'm mystified, however, by the second sentence of this quote 'It works
great for producing smoothed response variables, but not for making
inference about these linear coefficients.' Can someone (the original
contributor) please elaborate? I think I'm not entirely understanding this
point.
Thanks,
Wayne Thogmartin
Wayne E. Thogmartin, PhD
Statistician (Biology)
USGS Upper Midwest Environmental Sciences Center
2630 Fanta Reed Road
La Crosse, WI 54603
608.781.6309
[log in to unmask]
www.umesc.usgs.gov/staff/bios/wet0.html
ayazh
<[log in to unmask] To:
[log in to unmask]
SU.CA> cc:
Sent by: "(The Subject: [BUGS] Responses
and discussion for covariates, priors and model
selection
BUGS software
mailing list)"
<[log in to unmask]
.UK>
09/06/2003 02:50
AM
Please respond to
0ah9
Hello
Below are responses to questions I posed about covariates, priors and
model selection. It seems there are few ways to choose the best model in
WinBugs.
The way I am approaching this problem is somewhat Bayesian-like in that
I look at how the univariate model behaves in terms of the density and
history plots for beta, clustering and heterogeneity components of the
model. Depending on or conditional on which univariate models have sig.
values for beta and good history and density plots for unknown
paramters, I use them to contruct the full model. In this way DIC was
reduced and the model behaved well in terms of exploring the paramter
space, mixing of markov chains, etc.
I have not found any reason to reject nor accept this method as a valid
one since no literature to my knowledge exists about it yet! So any
comments or criticisms are welcome.
Thanks
Ayaz
Thanks to Dr. Finn Krogstadt and Dr. Glen D. Johnson for responses
below.
Question I asked was how to prioritize selection of models either by
lower DIC or better history plots and other output from WinBugs of
parameter space.
WRT #2, your DICs are essentially equal, so choose the model with the
stable "time series" (I believe you mean the "trace plots" for
monitoring convergence and behavior of a Markov Chain wrt to your
coefficient beta). Also, if you are talking about traces, then you
should be monitoring at least 3 independednt Markov Chains and run the
iterations until all of the traces converge, overlapping each other and
varying like white noise with no trend. The Gelman-Rubin diagnostics
were designed to monitor aspects of these traces in order to assess
convergence, but just looking at the traces will pretty much tell you if
you have convergence. Also, I caution you, given that I've worked alot
with these hierarchical spatial Poisson models, convergence does not
happen readily for linear coefficients associated with covariates. It
works great for producing smoothed response variables, but not for
making inference about these linear coefficients.
> 3. IS THIS APPROACH VALID? -->
> I ran univariate models for 12 variables using 3 different
> priors which makes 36 models in total. Full model was chosen
> based on following
> criteria:
> -beta value for covariate is significant
> -tau.h and tau.c for non-spatial and spatial random effects,
> respectively, and beta value of covariate have a time-series
> plot which shows good mixing and low variation, i.e. not a
> very narrow band of values with large spikes in the estimate
> occuring randomly throughout the simulation.
Ayaz,
For model selection, there really is no substitute for the model
likelihood or posterior probability. You might want to take a look at
the PINES example (in some example sets but not others).
-------------------------------------------------------------------
This list is for discussion of modelling issues and the BUGS software.
For help with crashes and error messages, first mail [log in to unmask]
To mail the BUGS list, mail to [log in to unmask]
Before mailing, please check the archive at
www.jiscmail.ac.uk/lists/bugs.html
Please do not mail attachments to the list.
To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
If this fails, mail [log in to unmask], NOT the whole list
-------------------------------------------------------------------
This list is for discussion of modelling issues and the BUGS software.
For help with crashes and error messages, first mail [log in to unmask]
To mail the BUGS list, mail to [log in to unmask]
Before mailing, please check the archive at
www.jiscmail.ac.uk/lists/bugs.html
Please do not mail attachments to the list.
To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
If this fails, mail [log in to unmask], NOT the whole list
-------------------------------------------------------------------
This list is for discussion of modelling issues and the BUGS software.
For help with crashes and error messages, first mail [log in to unmask]
To mail the BUGS list, mail to [log in to unmask]
Before mailing, please check the archive at www.jiscmail.ac.uk/lists/bugs.html
Please do not mail attachments to the list.
To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
If this fails, mail [log in to unmask], NOT the whole list
|