There were many responses to my question, thank you!
Here is the question again:
My data are counts in two 4x4 tables. It seems reasonable to model the
counts as Multinomial(N[k],p[ijk]) k=1,2, i=1,...,4, j=1,...,4; with
Dirichlet prior on the probabilities p and Poisson prior on the total
number N. WinBUGS only allows fixed N in the multinomial specification; I
read this is because of simultaneous updating problems. Is there a slick
trick to get around this?
Here are the responses:
A. Use Poisson distributions instead!
If N_1 ~ Po(lam_1), N_2 ~ Po(lam_2), ..., N_n ~ Po(lam_n), and
T=sum(N_i), and S.lam=sum(lam_n), then
N_1 | T ~ Bin(T, S.lam)
and more generally the vector of N_i's is multinomially distributed.
This is one of the more useful pieces of distributional theory! It means
you can put priors on the counts directly, and transform to the
multinomial if you need to.
Also note that the Dirichlet distribution can be written as a function of
gamma distributions, in a similar way: the parameters of the Dirichlet are
the shape parameters of the gamma.
The sum of the lambdas is, of course, the expected value of the total
count, so you can easily convert your priors to the new parameterisation.
Sometimes I suspect this is all too elegant to be true, and the
probabalists are just laughing at us behind our backs.
B. You can always express a multinomial pmf as a product of conditional
binomial pmfs. That's what I do when using WinBUGS, and it seems to work
fine.
C. The same trick used to allow non-integer values of N should work. I
replace the multinomial distribution with a sequence of conditional
binomial distributions, for which BUGS now allows non-integer N. Here is
a simple script showing the multinomial model and how I replaced it. You
should then be able to put a Poisson prior on N.
#Multinomial distribution for number of failures - 4 trains
model {
n[1:4] ~ dmulti(alpha[1:4], N)
#Dirichlet prior for alpha
alpha[1:4] ~ ddirch(theta[1:4])
#The following trick allows a prior to be specified for the Dirichlet
param=
eters
for (j in 1:4) {
r[j] <- u[j]/sum(u[1:4])
u[j] ~ dgamma(theta[j], 1)
theta[j] ~ dgamma(1, 0.001)} # Diffuse prior for Dirichlet parameters
#Uniform prior on theta[]
#theta[1] <- 1
#theta[2] <- 1
#theta[3] <- 1
#theta[4] <- 1
N <- sum(n[1:4])
}
#Sequence of conditional binomial distributions - allows non-integer
values of n[] and N
model=09{
n[1] ~ dbin(alpha[1], N)
n[2] ~ dbin(p[2], N2)
N2 <- N - n[1]
p[2] <- alpha[2]/(alpha[2] + alpha[3] +alpha[4])
n[3] ~ dbin(p[3], N3)
N3 <- N - n[1] - n[2]
p[3] <- alpha[3]/(alpha[3] + alpha[4])
alpha[1:4] ~ ddirch(theta[1:4])
#The following trick allows a prior to be specified for the Dirichlet
param=
eters
for (j in 1:4)=09{
r[j] <- u[j]/sum(u[1:4])
u[j] ~ dgamma(theta[j], 1)
theta[j] ~ dgamma(1, 0.001)} # Diffuse prior for Dirichlet parameters
#theta[1] <- 1
#theta[2] <- 1
#theta[3] <- 1
#theta[4] <- 1
N <- sum(n[1:4])
}
#Data for 4 trains
data
list(n=3Dc(260,5,1.5, 1.3))
D. WBDev (new since 2002, see the WinBUGS site) will allow you to
specify the joint distribution of the N's and cell counts; with this model
specifying the (log) likelihood should be reasonably straightforward, and
the priors can be implemented in standard WinBUGS. It'll take some coding
though, and is hardly a "slick trick".
Failing that, you could try appending an extra cell to p; turning your 4x4
table into a multinomial of length 16, you'd get a 17th cell. Then use a
really big, fixed N for the 17-cell model, and specify cell probabilities
for the first 16 entries conditional on the observed value of the count in
the 17th. This is really only a lash-up solution for something better, it
may not work well (if at all!)
E. Assuming you have separate tables, and you want to generate future
tables...
1. I would use the Poisson notation, if in fact you have only counts n
[k,i] where (i designates a row, column of table k, e.g. i=13,k=2 might be
row 4 column 1 in table 2. ) One can choose to assume the counts
conditional on the rates, i.i.d Poisson random variables.
2. The prior for the rates would be a gamma prior with shape a[k,i], and
rate b[k]. 3. The use of the Dirichlet prior did not (in previous
versions) let one to sample the hyper parameters (say the alphas in the
distribution) where as the gamma parameters can be sampled.
4. If n[k,i] has a Poisson distribution with rate lambda[k,i] then
conditional on sum(n[k,]) n[k,i] has multinomial distribution with p[k,i]
=lambda[k,i]/sum(lambda[k,]). If lambda[k,i] has a gamma distribution with
parameters a[k,i] and b[k] then p[k,i] has a Dirichlet distribution with
parameters alpha[k,i]=a[k,i]/sum(a[k,]). Note that you now have another
set of parameters b[k] to provide priors for, but then again you would
have gotten these from your distribution for N[k].
5. If N has a Poisson distribution with rate lambda and n[k] | N is
multinomial with rate p[k] then n[k] is Poisson with rate lambda*p[k] (a
thinned Poisson process). Thus, If you want to use the multinomial sample
when N[k] is not known but you have generated samples of p[k,i], and
Poisson rates lambda[k], then use the Poisson samples to get the
predictive distribution (samples).
lambda1[k,i] <-lambda[k]*p[k,i] ###Sum lambda1[k,]=lambda[k]
npredictive[k,i]~dpois(lambda[k,i]) ###Future samples
Note here that the distribution of the sum(npredictive[k,]) given p[k,i]
and
lambda[k] is Poisson with rate lambda[k].
For a given set of p[k,i] and lambda[k] you can generate many sets of the
two tables using the following: This method gives you 1000 sets of two
tables for each set of parameter samples p[1:2,1:16] and lambda[1:2]:
E.G within a sampling loop
for(l in 1:1000)
{
for(k in 1:2)
{
for(i in 1:16)
{
npredictive[k,i,l]<-dpois(lambda[k,i]))
}
}
}
I tested the triple indexing scheme, but you must have the k index first,
so that say p[k,,] is not fragmented.
F. Doing it directly in R should be possible -- I don't think that
iterative solutions are needed -- you should be able to write
down the posterior distributions directly if you set it up
right. Then you can use simulation to check the fit of the
models: equal Poisson rates, equal proportions, independence of
row and columns.
-------------------------------------------------------------------
This list is for discussion of modelling issues and the BUGS software.
For help with crashes and error messages, first mail [log in to unmask]
To mail the BUGS list, mail to [log in to unmask]
Before mailing, please check the archive at www.jiscmail.ac.uk/lists/bugs.html
Please do not mail attachments to the list.
To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
If this fails, mail [log in to unmask], NOT the whole list
|