Thank you all for sending me your thoughts on the use of informative
priors to improve convergence behaviour. Below follows a compilation of
ideas.
***
If you are asking about latent variables alone, I haven't had that
experience.
But, I often run into analyses that just will not converge without an
informative prior. However, I'm using the term loosely. Usually, I'll
use a
vague, but informative prior, i.e. it has most of its mass in a logical
range
of the parameter, but it's not attempting to actually identify the true
location. Sensitivity? I don't know since these runs don't converge
otherwise.
I suppose I could choose several different vague priors to see if the
results
are the same. I hadn't actually thought of that until reading your
post. I'll
give it a try next time. And, maybe that is the solution to your
problem as
well.
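As an illustration of what such a "vague but informative" prior can look like (the parameter type, the "logical range", and the scales below are hypothetical choices, not taken from the original analyses), one can check how much prior mass each candidate prior places in the plausible range, which also doubles as a simple sensitivity comparison:

```python
import math

def normal_cdf(x, mu=0.0, sd=1.0):
    """Normal CDF via the standard-library error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

# Suppose the parameter is a log-odds coefficient and values outside
# [-5, 5] are biologically implausible.  A prior centred at 0 with a
# generous scale concentrates mass in that range without attempting to
# identify the true location.  Range and scales are illustrative.
logical_range = (-5.0, 5.0)

for scale in (1.5, 2.5, 5.0):
    mass = (normal_cdf(logical_range[1], 0.0, scale)
            - normal_cdf(logical_range[0], 0.0, scale))
    print(f"Normal(0, sd={scale}): {mass:.3f} of prior mass in {logical_range}")
```

Trying several such priors and checking that the posterior is essentially unchanged is exactly the sensitivity experiment suggested above.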
***
In regard to Stefan Van Dongen's issue about non-convergence, I must
say that I have the same problem with the linear coefficients of fully
Bayesian generalized linear models. My application is somewhat
different from Stefan's, as I use WINBUGS for Poisson regression to
smooth disease rates in small-area maps and to assess the effect of
covariates. For smoothing, there is never a problem, as I get rapid
convergence of the predicted disease counts; however, for the linear
terms, I cannot obtain convergence and therefore cannot draw any
defensible inference about these terms. I usually run 3 independent
chains, where initial values for the linear terms are chosen based on
a preliminary maximum likelihood analysis (or pseudo-ML). Initial
values are set to the MLE and to the MLE +/- 4 standard errors, so
that they are over-dispersed, but not wildly so.
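A minimal sketch of this initial-value scheme, reduced to a single Poisson rate rather than the full regression (the counts below are made up for illustration): the rate MLE is the sample mean, its asymptotic standard error is sqrt(lambda_hat / n), and the three chains start at the MLE and at the MLE +/- 4 SE.

```python
import math

# Hypothetical small-area disease counts (not real data).
counts = [4, 7, 5, 6, 3, 8, 5, 6]
n = len(counts)

lam_hat = sum(counts) / n        # MLE of the Poisson rate: the sample mean
se = math.sqrt(lam_hat / n)      # asymptotic standard error of the MLE

# Three chains: the MLE itself and the MLE +/- 4 SE, giving
# over-dispersed, but not wildly over-dispersed, starting points.
inits = [lam_hat, lam_hat + 4.0 * se, lam_hat - 4.0 * se]
print(inits)
```

For a full regression the same recipe applies coefficient by coefficient, using the estimates and standard errors from the preliminary ML fit.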
As with Stefan, I have also experimented with more informative priors.
This
sometimes leads to "apparent" convergence, but it does not hold as one
continues
to run the chain.
I have discovered through conversation that others experience this same
problem.
Perhaps, however, this should not be a surprise since Eberly and Carlin
(Statistics in Medicine 2000; 19:2279-2294) point out that this may be
related
to problems with model identifiability. I use the convolution prior
discussed by Eberly and Carlin, where two variance components are
included: a spatially structured one and an unstructured one. However,
I continue to have problems with convergence of the linear
coefficients even if I proceed with only one variance component
(either structured or not).
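The identifiability problem Eberly and Carlin point to can be seen in miniature: under the convolution prior, only the sum of the spatially structured effect u and the unstructured effect v enters the Poisson mean, so the likelihood cannot apportion variability between them. A sketch with hypothetical counts (all numbers invented for illustration):

```python
import math

def poisson_loglik(y, expected, u, v):
    """Poisson log-likelihood with log-linear mean E_i * exp(u_i + v_i)."""
    total = 0.0
    for yi, ei, ui, vi in zip(y, expected, u, v):
        mu = ei * math.exp(ui + vi)
        total += yi * math.log(mu) - mu - math.lgamma(yi + 1)
    return total

# Hypothetical observed counts, expected counts, and random effects.
y = [3, 5, 2, 7]
E = [2.5, 4.0, 3.0, 6.0]
u = [0.1, -0.2, 0.05, 0.3]   # "spatially structured" component
v = [0.0, 0.1, -0.1, -0.2]   # unstructured component

a = poisson_loglik(y, E, u, v)

# Shift a constant from u to v: since only u_i + v_i enters the mean,
# the likelihood is exactly unchanged -- the split is not identified.
c = 0.5
b = poisson_loglik(y, E, [ui - c for ui in u], [vi + c for vi in v])
print(a, b)
```

Only the priors on u and v distinguish the two configurations, which is why convergence of the individual components (and parameters correlated with them) can be so slow.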
I have always felt that some of the best guidance we "users" need from
those more knowledgeable in full Bayes analysis is in the selection of
initial values and priors. However, this particular problem of
non-convergence of linear coefficients in GLMs may simply be
unsolvable. If interest lies with smoothing, these models work
beautifully; however, if interest lies with quantifying the strength
of covariate effects, then a non-Bayesian approach, such as
pseudo-MLE, may be necessary (or perhaps empirical Bayes). Perhaps
these considerations also apply to Stefan's problem.
***
Yes. I think the use of informative priors is often to be recommended in
such
cases. I have some experience with mixture models, hidden Markov models
and such
like as well as some other kinds of latent variable models. Often there
is not a
unique maximum in the likelihood and the prior has the function of
pushing the
posterior towards one of the maxima of the likelihood. An obvious
example is a
mixture model with two components. Suppose, for example, each component
is a
normal distribution with known and equal variances so that the
components only
differ in the unknown means M1, M2. We also have a mixing parameter P,
the
probability of belonging to component 1. If we have a "non-informative"
prior
in that we have a uniform prior for P and make M1 and M2 exchangeable,
then
there is no way to distinguish between, say, P=0.3, M1=10, M2=20 and,
P=0.7,
M1=20, M2=10. This is, of course, a very simple example but more or less
the
same thing happens in more complicated examples. We can impose
constraints but
this can sometimes be rather unnatural and it might be better to impose
a "soft"
constraint by using the prior.
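The label-switching symmetry in that example is easy to verify numerically; a sketch with made-up observations (unit variances, as in the example above):

```python
import math

def normal_pdf(x, mu, sd=1.0):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def mixture_loglik(data, p, m1, m2):
    """Two-component normal mixture with known, equal unit variances;
    p is the probability of belonging to component 1."""
    return sum(math.log(p * normal_pdf(x, m1) + (1.0 - p) * normal_pdf(x, m2))
               for x in data)

data = [9.8, 10.3, 19.5, 20.1, 10.1, 19.9]   # made-up observations

a = mixture_loglik(data, 0.3, 10.0, 20.0)    # P=0.3, M1=10, M2=20
b = mixture_loglik(data, 0.7, 20.0, 10.0)    # relabelled: P=0.7, M1=20, M2=10
print(a, b)
```

The two labellings give exactly the same likelihood, so with a uniform prior on P and exchangeable priors on M1 and M2 the posterior cannot distinguish them, and a sampler will wander between the two modes.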
You use the phrase "biologically meaningful." This suggests that you do,
in
fact, have genuine prior information which can be used in this way. The
trick, I
think, is to devise a prior which uses the information which you are
happy to
use without being "informative" about aspects where you do not want to
be
"informative." This is not always easy but I think often there is real
prior
information which can be used.
In the example above, we might think that we can use a constraint such
as M1<M2.
Well, what happens then if you have a set of data where really all of
the
observations belong to one of the components? We have no way to tell
whether it
is the first or the second. So, instead, we might try P>0.5 so that, if
there is
only one component, it will be the first. This might work better
(although we
might have problems if P is actually close to 0.5). However, in
practical, e.g.
biological, terms we might want to know which component it really is. I
guess
that this may often be the case with latent variable models. It might be
better
to give M1 and M2 different priors so that the model "knows" which is
the more
likely component for the data. Just what you are prepared to do in this
way
depends on the (biological) system you are modelling and what you know
about it.
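A sketch of this "soft constraint" through non-exchangeable priors: giving M1 and M2 different priors (the N(10, 2) and N(20, 2) below are purely illustrative numbers, not from any real system) makes one labelling far more plausible a priori than its swap:

```python
import math

def normal_logpdf(x, mu, sd):
    return (-0.5 * ((x - mu) / sd) ** 2
            - math.log(sd * math.sqrt(2.0 * math.pi)))

# Illustrative non-exchangeable priors: M1 ~ N(10, sd=2), M2 ~ N(20, sd=2).
# Compare the joint prior density of the "natural" labelling with its swap.
log_prior_a = normal_logpdf(10.0, 10.0, 2.0) + normal_logpdf(20.0, 20.0, 2.0)
log_prior_b = normal_logpdf(20.0, 10.0, 2.0) + normal_logpdf(10.0, 20.0, 2.0)
print(log_prior_a - log_prior_b)
```

The swapped labelling is heavily penalised by the prior rather than forbidden outright, so the model still "knows" which component is which while data far from the prior means can still pull the posterior away.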
***
As usual, I am sure that there are several of us who would like to
take a peek at your code, so as to better understand the nature of
your problem.
I have frequently been able to avoid this problem by making sure that
the
initial values are near their posterior mode.
***
--
Dr. Stefan Van Dongen
Group of Animal Ecology
Department of Biology
University of Antwerp
Universiteitsplein 1
B-2610 Wilrijk, Belgium
Tel: + 32 (0)3 820 22 61
Fax: + 32 (0)3 820 22 71
Email: [log in to unmask]
URL: http://bio-www.uia.ac.be/u/svdongen/index.html
-------------------------------------------------------------------
To mail the BUGS list, mail to [log in to unmask]
You can search old messages at www.jiscmail.ac.uk/lists/bugs.html
To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
If this fails, mail [log in to unmask], NOT the whole list
This list is for discussion of modelling issues and the BUGS
software. For help with crashes and error messages, first mail
[log in to unmask]