Thanks to Bob O'Hara, Mike Thompson and Simon Jackman for their
comments.
Bob pointed out using the mean of sigma[] in the likelihood for
each t[j] meant that the likelihood for each t[j] included a
contribution from _every_ sigma[i] (i=1,...,J). This complicates
the likelihood horrendously and slows compilation and updating.
He recommended two solutions. Firstly, use an estimate of the mean of
sigma[] and use that estimate as a constant to centre values of
sigma[j]. I tried that, running the model once (without any attempt to
centre sigma[]). This allowed me to estimate the posterior mean of
sigma[]. I then used this estimate as a constant which I subtracted
from each sigma[j] in the second run of the model. Centring in this
way is approximate, but it is good enough. It successfully reduced
the correlation between the coefficient of sigma[j] and the intercept.
Secondly, he pointed out that OpenBUGS can update several parameters
together (if it recognises that it can do so), in which case there
is no need for mean centring. I haven't tried that.
Simon described how this can cause issues with the identifiability
of estimates produced in latent variable analysis and suggested
concentrating on the output from WinBUGS rather than trying to
transform variables on the fly. The full text of his email is
included below.
Mike suggested reparameterising the Weibull distribution:
multiplying the linear predictor by 'r', which you then estimate.
The text of his email is included below, as is my original
post.
Thanks again to all,
Robert
Simon's email:
In my own research (solo and with various students and co-authors),
we've repeatedly run into this problem. Our solution is not to center
in WinBUGS, but to impose that kind of constraint afterwards, by
post-processing the WinBUGS output.
This is a standard problem in latent variable models. A unidimensional
latent variable is often only identified up to scale and a location
shift. E.g., consider a 2-parameter IRT-type setup, shown in the
following fragment of WinBUGS code:
for(i in 1:n){
  for(j in 1:m){
    logit(p[i,j]) <- x[i]*beta[j,1] + beta[j,2]
    y[i,j] ~ dbern(p[i,j])
  }
}
where x[i] is the latent ability of test-taker i, the beta parameters
are item-specific discrimination and difficulty parameters, and the
observed data is a matrix of binary responses (y_{ij} = 1 if test-taker
i gets item j correct, otherwise 0). Clearly the model isn't
identified (any linear transform of the x can be offset by transforms
of the betas). One solution is to impose the constraint that the x[i]
have mean 0 and standard deviation 1 across the n test-takers. Putting
this into the winbugs code produces a dramatic slowdown in compile time
and running time, of the sort you described. So we ignore the lack of
identification while WinBUGS runs, and then, for each iteration of the
WinBUGS output, center and scale the x[i] and apply the appropriate
offsetting transforms to the beta parameters. Convergence diagnostics
and plots may look lousy for the unidentified parameters, but will
sharpen up considerably for the identified parameters.
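Simon's per-iteration post-processing can be sketched in Python. This is a hypothetical example: the random numbers stand in for real WinBUGS/CODA draws, and all array names are invented. The key point is that the offsetting transform leaves every logit x[i]*beta[j,1] + beta[j,2] unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in MCMC output: n_iter draws of n latent abilities x and of
# m item-parameter pairs (beta[..., 0] = discrimination,
# beta[..., 1] = difficulty).  Real draws would come from CODA files.
n_iter, n, m = 1000, 50, 10
x = rng.normal(2.0, 3.0, size=(n_iter, n))   # unidentified location/scale
beta = rng.normal(size=(n_iter, m, 2))

# Per-iteration post-processing: centre and scale x, then apply the
# offsetting transforms to beta so the linear predictor is unchanged.
mu = x.mean(axis=1, keepdims=True)           # (n_iter, 1)
sd = x.std(axis=1, keepdims=True)
x_star = (x - mu) / sd
beta_star = np.empty_like(beta)
beta_star[:, :, 0] = beta[:, :, 0] * sd               # absorbs the scale
beta_star[:, :, 1] = beta[:, :, 1] + beta[:, :, 0] * mu  # absorbs the shift

# Invariance check: x*b1 + b2 == x_star*b1_star + b2_star, iteration by
# iteration, for every (test-taker, item) pair.
lp_old = x[:, :, None] * beta[:, None, :, 0] + beta[:, None, :, 1]
lp_new = x_star[:, :, None] * beta_star[:, None, :, 0] + beta_star[:, None, :, 1]
assert np.allclose(lp_old, lp_new)
```

After this step the x_star draws have mean 0 and standard deviation 1 within every iteration, so summaries and convergence diagnostics are computed on identified quantities.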
There are lots of examples of this general type of thing out there:
i.e., the Gibbs sampler producing apparently lousy performance wrt
unidentified parameters, but chugging away quite nicely wrt a
lower-dimensional set of identified parameters. My favorite example is
in the Carlin and Louis book: consider modeling a single data point,
y = 0, as y ~ dnorm(mu[1] + mu[2], 1). Clearly mu[1] and mu[2] aren't
identified, but theta = mu[1] + mu[2] is. Compare traceplots etc. of
mu[1], mu[2], and theta. See the example called "unidentified" on my
web pages, http://jackman.stanford.edu/mcmc
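That example is easy to reproduce with a hand-rolled Gibbs sampler; the sketch below assumes vague N(0, 100^2) priors on mu[1] and mu[2] (the prior choice is my assumption, not stated above). The lag-1 autocorrelation of mu[1] sits near 1 while theta mixes almost independently.

```python
import numpy as np

rng = np.random.default_rng(1)

# Carlin & Louis example: one observation y = 0 with
# y ~ N(mu1 + mu2, 1) and vague N(0, 100^2) priors on mu1, mu2.
# mu1 and mu2 are barely identified; theta = mu1 + mu2 is identified.
y = 0.0
prior_var = 100.0 ** 2
post_var = 1.0 / (1.0 + 1.0 / prior_var)  # full-conditional variance

n_iter = 20000
mu1 = mu2 = 0.0
draws = np.empty((n_iter, 2))
for i in range(n_iter):
    # Full conditionals: mu1 | mu2, y ~ N(post_var*(y - mu2), post_var)
    mu1 = rng.normal(post_var * (y - mu2), np.sqrt(post_var))
    mu2 = rng.normal(post_var * (y - mu1), np.sqrt(post_var))
    draws[i] = mu1, mu2

theta = draws.sum(axis=1)

def lag1_autocorr(z):
    z = z - z.mean()
    return float(z[:-1] @ z[1:] / (z @ z))

# mu1 behaves almost like a random walk; theta is nearly white noise.
print(lag1_autocorr(draws[:, 0]))  # close to 1
print(lag1_autocorr(theta))        # close to 0
```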
Mike's email:
The Weibull distribution may benefit from a reparameterization
similar to what Prof. William Q. Meeker of Iowa State uses. If
you look in the WinBUGS help manual, WinBUGS parameterizes the
Weibull using v and lambda as the shape and scale parameters,
resp.: f(t) = v*lambda*t^(v-1)*exp(-lambda*t^v), where lambda
corresponds to your variable hr[j]. Meeker parameterizes the
Weibull using beta and eta as the shape and scale parameters,
resp.: f(t) = (beta/eta)*(t/eta)^(beta-1)*exp(-(t/eta)^beta).
Obviously, v and beta are equivalent (lambda = eta^(-v)), but
making the regression model estimate log(eta) is more strongly
motivated than making it estimate log(lambda) or, in
your case, -log(hr[j]), which is -lp[j] in your model. (I.e.,
using a "scale accelerated failure time (SAFT)" model is
motivated.)
So I suggest using: hr[j] <- exp(r*lp[j]). I tried this on the
"Mice" WinBUGS example problem. As a result of the new
parameterization, the autocorrelation in the estimate of the
shape parameter (variable "r" in that model) was dramatically
reduced. The Monte Carlo error of the estimated mean for r also
decreased (for the same MC sample size). I haven't a clue if this
is causing issues in your problem, but it may help.
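The relation between the two parameterizations is easy to verify numerically. A short numpy check (the shape and scale values below are arbitrary) confirms that dweib(v, lambda), with survival function exp(-lambda*t^v), matches Meeker's scale form exp(-(t/eta)^v) when lambda = eta^(-v):

```python
import numpy as np

# WinBUGS: dweib(v, lambda) has survival S(t) = exp(-lambda * t^v).
# Meeker (shape beta, scale eta): S(t) = exp(-(t / eta)^beta).
# With beta = v they coincide when lambda = eta^(-v),
# i.e. log(lambda) = -v * log(eta).
v, lam = 1.7, 0.4                 # arbitrary shape and WinBUGS scale
eta = lam ** (-1.0 / v)           # implied Meeker scale

t = np.linspace(0.1, 10.0, 200)
S_winbugs = np.exp(-lam * t ** v)
S_meeker = np.exp(-((t / eta) ** v))
assert np.allclose(S_winbugs, S_meeker)

# On this reading, hr[j] <- exp(r*lp[j]) makes the linear predictor
# play the role of -log(eta): log(lambda) = r*lp[j] = -r*log(eta[j]).
```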
The original posting:
My dataset is a series of readings of blood pressure taken from
a cohort of ~2000 patients. I also have data on how long they survived
before they had a stroke.
I am trying to model the longitudinal blood pressure and stroke
survival processes jointly - giving each patient their own
intercept for blood pressure (a01) and within-individual error
for readings of blood pressure (precision=tau or standard
deviation=sigma). My model specifies that both the intercept,
a01, and standard deviation, sigma, predict stroke risk. That is
individuals j's survival time ~ Weibull(r, hr[j]) where
-log(hr[j]) = b0 + b1*a01[j] + b2*sigma[j].
The trouble is that sigma isn't centred on zero and consequently
the correlation between b0 and b2 is very high (>90%) and the
autocorrelation for both parameters is substantial (>90%) even
at large lags (e.g. 50).
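The link between an uncentred covariate and a high intercept-slope correlation can be seen directly from the least-squares covariance matrix, which is proportional to (X'X)^(-1). A small numpy sketch (with a made-up positive covariate standing in for sigma[]):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = rng.gamma(5.0, 2.0, size=2000)  # positive covariate, mean far from 0

def intercept_slope_corr(x):
    """Correlation between intercept and slope implied by (X'X)^(-1)."""
    X = np.column_stack([np.ones_like(x), x])
    C = np.linalg.inv(X.T @ X)          # proportional to cov of (b0, b2)
    return C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])

# Algebraically this equals -mean(x)/sqrt(mean(x^2)): large in magnitude
# when x sits far from zero, exactly zero once x is centred.
print(intercept_slope_corr(sigma))                 # strongly negative
print(intercept_slope_corr(sigma - sigma.mean()))  # essentially zero
```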
If I try to calculate the mean of sigma[]
(i.e. sigma.mean <- mean(sigma[])), there is no problem,
but if I then try to centre sigma (i.e. centred.sigma[j] <- sigma[j] -
sigma.mean) and use that in the survival model,
then the model suddenly takes at least half an hour to compile
and I haven't yet waited long enough for it to complete a
single update. Without the attempted centring, it takes maybe 15
seconds to compile, and updates fine.
Does anybody have any suggestions about why this could be happening?
model {
  for (j in 1:J) {  # For each individual
    # Specify the distribution of the random intercept
    # and within-individual error for the blood pressure
    # model
    a01[j] ~ dnorm(0, tau2)
    tau[j] ~ dgamma(g1, g2)
    # The survival model uses the standard deviation of
    # each individual's blood pressure, not the precision.
    sigma[j] <- 1/sqrt(tau[j])
    # Try to centre the standard deviation to reduce the
    # correlation between its survival coefficient (b2)
    # and the intercept (b0).
    # Adding this line causes the problem described
    # above.
    centred.sigma[j] <- sigma[j] - sigma.mean
    # Survival model
    lp[j] <- b0 + b1*a01[j] + b2*centred.sigma[j]
    hr[j] <- exp(-lp[j])
    t[j] ~ dweib(r, hr[j])I(t.cen[j],)
  }
  for (i in 1:N) {  # For each reading of blood pressure
    # Set up a random-effects model for the progression
    # of blood pressure with time (called "start" here).
    # The dataset is unbalanced, so I use the vector idno
    # to match each reading to the patient from whom it comes.
    lsbp[i] ~ dnorm(mu[i], tau[idno[i]])
    mu[i] <- a0 + a01[idno[i]] + a1*start[i]
  }
  # Calculate the mean of sigma for use in centring
  sigma.mean <- mean(sigma[])
  # Priors
  a0 ~ dnorm(4.5, 4)
  a1 ~ dnorm(-.1, 25)
  b0 ~ dnorm(3.5, .25)
  b1 ~ dnorm(0, .0001)
  b2 ~ dnorm(0, .0001)
  r ~ dgamma(1, .001)
  g1 ~ dexp(.3)
  g2 ~ dgamma(1, .1)
  tau2 ~ dgamma(1, .1)
  sigma2 <- 1/sqrt(tau2)
}
-------------------------------------------------------------------
This list is for discussion of modelling issues and the BUGS software.
For help with crashes and error messages, first mail [log in to unmask]
To mail the BUGS list, mail to [log in to unmask]
Before mailing, please check the archive at www.jiscmail.ac.uk/lists/bugs.html
Please do not mail attachments to the list.
To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
If this fails, mail [log in to unmask], NOT the whole list