Dear all,
Some time ago, I sent a message asking "how can one manage the case where
data are associated with a logical node?", proposing a way to introduce
such a situation into a WinBUGS model. I received advice from Andrew
Millard, David Spiegelhalter, Dale Lamps, Michael McCarthy and Etienne Le
Bihan. Many thanks to all of them; their reflections gave me insights for
continuing.
Some of the reactions were doubtful about the proposal, others were more
optimistic. Unfortunately, the former seem to be right, as I will try to
show.
I will use my toy example again, but with the calculated variate defined
by a sum instead of a product, so that all the probability distributions
involved are known exactly.
Here are the three contemplated models:
-<model 0>----
# model
model {
a0 ~ dnorm(10,1)   # dnorm(mean, precision): mean 10, variance 1
b0 ~ dnorm(10,1)
c0 <- a0 + b0      # logical (deterministic) node
}
# data
list(c0=75)
-<model 1>----
# model
model {
a0 ~ dnorm(10,1)
b0 ~ dnorm(10,1)
c0 <- a0 + b0
d0 ~ dnorm(c0,1000000)   # precision 1E6, i.e. sd = 0.001
}
# data
list(d0=75)
-<model 2>----
# model
model {
a1 ~ dnorm(0,2)      # precision 2, i.e. variance 0.5
c0 ~ dnorm(20,0.5)   # precision 0.5, i.e. variance 2
a0 <- 0.5*c0 + a1
b0 <- 0.5*c0 - a1
}
# data
list(c0=75)
--------------
<model 0> is the target model, but it fails to compile in WinBUGS with a
fatal error, because of the constraint that data cannot be attributed to
the logical node c0.
<model 1> is the proposal to avoid the difficulty. An additional random
variate, d0, is introduced, with expectation equal to c0 and almost no
variation around it. The data value is attributed to d0, which is
acceptable to WinBUGS.
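As a quick check on what <model 1> is supposed to do (my own arithmetic,
not part of the original proposal): BUGS parameterizes dnorm(mu,tau) by
the precision tau = 1/variance, so

$$ d_0 \mid c_0 \sim \mathcal{N}(c_0, \sigma^2), \qquad \sigma = 1/\sqrt{10^6} = 0.001, $$

and observing d0 = 75 should, in principle, be almost equivalent to
conditioning on c0 = 75 exactly.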
<model 2> is strictly equivalent to <model 0>. Since we are dealing with
linear combinations of Gaussian variates, it is possible to use c0 and
a1 = a0 - c0/2 = (a0 - b0)/2, which are uncorrelated, to define a0 and b0.
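Spelling out the reparameterization (my own derivation, using only the
priors above): with a0 and b0 independent N(10,1),

$$ c_0 = a_0 + b_0 \sim \mathcal{N}(20, 2) \ \ (\text{precision } 0.5), \qquad a_1 = \tfrac{1}{2}(a_0 - b_0) \sim \mathcal{N}(0, \tfrac{1}{2}) \ \ (\text{precision } 2), $$

$$ \mathrm{Cov}(c_0, a_1) = \tfrac{1}{2}\left(\mathrm{Var}(a_0) - \mathrm{Var}(b_0)\right) = 0, $$

and since (c0, a1) are jointly Gaussian, zero covariance means
independence. Inverting gives a0 = 0.5*c0 + a1 and b0 = 0.5*c0 - a1, as in
<model 2>. Conditioning on c0 = 75 therefore yields

$$ a_0 \mid c_0 = 75 \ \sim\ \mathcal{N}(37.5, \tfrac{1}{2}), \qquad \mathrm{sd} = \sqrt{1/2} \approx 0.707, $$

and symmetrically for b0, which is exactly what <model 2> reports below.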
When running <model 1> and <model 2> in WinBUGS, we obtain very different
results:
              mean     sd
<model 1> a0  64.9   1.686
<model 1> b0  10.1   1.686
<model 2> a0  37.5   0.706
<model 2> b0  37.5   0.706
Moreover, the Brooks-Gelman-Rubin convergence diagnostics are very
different, and in favor of <model 2>. (Presumably the near-deterministic
link in <model 1> leaves the sampler stuck on the ridge a0 + b0 = 75, far
from the true conditional means; note that the <model 1> means still sum
to 75.)
My conclusion is that if the proposal does not work in such a simple
situation, there is no hope that it will work in real-life situations,
where data sets are big and calculated variates much more complicated.
Perhaps we could follow up with two more questions:
(*) What is the reason for WinBUGS not tackling such cases? Are there
theoretical grounds?
(*) What can be done in such situations? I am studying a long chain of
events with poor data sets all along the chain. My conjecture is that if I
am able to gather all of them into one model, I will significantly improve
the knowledge of the whole chain.
Thanks for your help.
Jean-Baptiste DENIS
email: [log in to unmask]            unité de recherche BIA
tel: ##.33.(0)1.34.65.22.23          INRA
fax: ##.33.(0)1.34.65.22.17          F-78352 Jouy-en-Josas
------------< http://www.inra.fr/bia/J/ >--------------