Andrew Gelman's post about inverse-gamma priors for variance parameters
(gamma distributions for precisions) had me puzzled for some time. I was
quite surprised to see how much less informative
sigma.theta~dunif(0,1000) was compared to the more common
tau.theta~dgamma(.0001,.0001). This uninformativeness expresses itself
not only in a diffuse (high variance) posterior distribution of
sigma.theta, but also in the posteriors of the theta values. I admit
that I was beginning to despair for the future of our beloved gamma
priors.
It was no real comfort that the uniform prior on sigma was a kluge that
required setting the range of the uniform to match the data, since if the true
standard deviation was in the thousands, we could just bump it to
dunif(0,1.0E6) or wider. We could in fact set this upper limit to the
maximum floating point value for the system to provide a less artificial
prior.
The problem with this uniform prior on sigma is that you can produce
even more uninformative priors by the same method. If a uniform prior
on sigma is a good prior, then why not a uniform prior on sigma^2? It
turns out that a uniform prior on the variance produces an even wider
(higher variance) posterior for sigma.theta, and higher variance in
the theta posteriors as well. If we take the
variance of sigma.theta, mu.theta, and theta posteriors to be a good
metric of uninformativeness, then a uniform variance prior is certainly
less informative than a uniform standard deviation prior.
Now if we think that a uniform distribution on sigma^2 is a good idea,
then you can imagine how good a uniform distribution on sigma^4 or
sigma^6 will be. Unfortunately, the simulations either crash or are
saved by bumping up against the upper limit on the uniform distribution
(making the prior intrinsically informative). But we can see from a
uniform distribution on sigma^2.5 that the 'uninformativeness' of the
sigma prior increases with the exponent.
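One way to check this ordering without running BUGS is to compute the
marginal posterior of sigma.theta on a grid. The sketch below is only an
illustration under assumptions that are not stated in this post: a normal
hierarchical model with known sampling standard errors, a flat prior on
mu.theta, the eight-schools estimates as a stand-in data set, and a
uniform prior on sigma.theta^k truncated so that sigma.theta stays below
1000. The function names are hypothetical.

```python
import math

# Eight-schools estimates and standard errors (assumed stand-in data).
Y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
S = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]


def log_marginal_lik(tau):
    """log p(y | tau) up to a constant, with mu integrated out (flat prior)."""
    prec = [1.0 / (s * s + tau * tau) for s in S]
    v_mu = 1.0 / sum(prec)
    mu_hat = v_mu * sum(p * y for p, y in zip(prec, Y))
    out = 0.5 * math.log(v_mu)
    for p, y in zip(prec, Y):
        out += 0.5 * math.log(p) - 0.5 * p * (y - mu_hat) ** 2
    return out


def posterior_summary(k, upper=1000.0, n=20000):
    """Posterior mean and sd of tau when tau**k ~ Uniform(0, upper**k)."""
    step = upper / n
    grid = [(i + 0.5) * step for i in range(n)]  # midpoint rule on (0, upper)
    # A uniform prior on tau**k is a prior proportional to tau**(k-1) on tau.
    logw = [(k - 1.0) * math.log(t) + log_marginal_lik(t) for t in grid]
    peak = max(logw)
    w = [math.exp(lw - peak) for lw in logw]
    total = sum(w)
    mean = sum(wi * t for wi, t in zip(w, grid)) / total
    var = sum(wi * (t - mean) ** 2 for wi, t in zip(w, grid)) / total
    return mean, math.sqrt(var)


for k in (1, 2, 2.5):
    mean, sd = posterior_summary(k)
    print("uniform on sigma^%s: posterior mean %.2f, sd %.2f" % (k, mean, sd))
```

Because the uniform prior on sigma.theta^k enters the posterior for
sigma.theta as a factor sigma.theta^(k-1), a larger k shifts posterior
mass toward larger values, so the printed posterior means should rise
with k.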
The problem with this uniform approach is that it creates an
artificially fat tail: a uniform prior on sigma^k implies a prior
density on sigma proportional to sigma^(k-1), which for k > 1 increases
away from zero. This artificially inflates the resulting sigma.theta
posterior means and variances, and produces a corresponding increase in
the variances of the resulting theta posteriors. Given its kluge
nature, the fact that we can arbitrarily increase the vagueness of
sigma by increasing the exponent of the uniform prior, and the lack of
an 'obvious' choice for the value of this exponent, we should be very
cautious about using chopped uniform (improper) priors for variance
parameters.
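The effect of the exponent on the prior itself can be seen with a short
simulation. This is a minimal sketch, assuming the uniform prior is
placed on sigma^k over (0, upper^k) with upper = 1000 to match
dunif(0,1000); the function names and the choice of exponents are
hypothetical. Under that assumption the change of variables gives a
prior mean for sigma of upper * k / (k + 1), which the simulated draws
should approximate.

```python
import random


def implied_prior_mean(k, upper):
    """E[sigma] when sigma**k ~ Uniform(0, upper**k).

    The implied density is k * sigma**(k-1) / upper**k on (0, upper),
    so the mean is upper * k / (k + 1).
    """
    return upper * k / (k + 1.0)


def sample_sigma(k, upper, n, rng):
    """Draw sigma**k uniformly, then transform each draw back to sigma."""
    return [rng.uniform(0.0, upper ** k) ** (1.0 / k) for _ in range(n)]


rng = random.Random(0)
upper = 1000.0  # assumed bound on sigma, as in dunif(0,1000)
for k in (1, 2, 2.5, 4, 6):
    draws = sample_sigma(k, upper, 100000, rng)
    sim_mean = sum(draws) / len(draws)
    frac_top_half = sum(d > upper / 2 for d in draws) / len(draws)
    print(k, round(sim_mean, 1), round(implied_prior_mean(k, upper), 1),
          round(frac_top_half, 3))
```

As k grows, the prior piles more and more of its mass against the upper
limit, which is why the choice of that limit stops being innocuous.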
In light of the above, my doubts about the vagueness of the
inverse-gamma prior were the result of a comparison with a prior that
was not in fact uninformative, but rather artificially promoted vague
posteriors. If we rule out chopped improper (uniform) distributions,
then it is hard to beat vague gammas (i.e. gammas with very small
parameter values) for vagueness.
- Finn Krogstad
P.S. This is not to say that chopped uniform priors do not have their
place. Their computational simplicity can be useful, especially when we
don't have conjugacy. But in general, the realization that we can
always make a more vague uniform prior should make us uncomfortable with
the vagueness of chopped uniform priors.