Email discussion lists for the UK Education and Research communities

## BUGS@JISCMAIL.AC.UK

#### View:

 Message: [ First | Previous | Next | Last ] By Topic: [ First | Previous | Next | Last ] By Author: [ First | Previous | Next | Last ] Font: Proportional Font
 LISTSERV Archives BUGS Home BUGS 2006

#### Options

Subject:

responses to: multinomial with priors on N and p

From:

Date:

Wed, 10 May 2006 03:33:41 +0100

Content-Type:

text/plain

Parts/Attachments:

 text/plain (177 lines)
 ```There were many responses to my question, thank you! Here is the question again: My data are counts in two 4x4 tables. It seems reasonable to model the counts as Multinomial(N[k],p[ijk]) k=1,2, i=1,...,4, j=1,...,4; with Dirichlet prior on the probabilities p and Poisson prior on the total number N. WinBUGS only allows fixed N in the multinomial specification; I read this is because of simultaneous updating problems. Is there a slick trick to get around this? Here are the responses: A. Use Poisson distributions instead!  If N_1 ~ Po(lam_1), N_2 ~ Po(lam_2), ..., N_n ~ Po(lam_n), and  T=sum(N_i), and S.lam=sum(lam_n), then  N_1 | T ~ Bin(T, S.lam) and more generally the vector of N_i's is multinomially distributed. This is one of the more useful pieces of distributional theory! It means you can put priors on the counts directly, and transform to the multinomial if you need to. Also note that the Dirichlet distribution can be written as a function of gamma distributions, in a similar way: the parameters of the Dirichlet are the shape parameters of the gamma. The sum of the lambdas is, of course, the expected value of the total count, so you can easily convert your priors to the new parameterisation. Sometimes I suspect this is all too elegant to be true, and the probabalists are just laughing at us behind our backs. B. You can always express a multinomial pmf as a product of conditional binomial pmfs. That's what I do when using WinBUGS, and it seems to work fine. C. The same trick used to allow non-integer values of N should work. I replace the multinomial distribution with a sequence of conditional binomial distributions, for which BUGS now allows non-integer N. Here is a simple script showing the multinomial model and how I replaced it. You should then be able to put a Poisson prior on N. #Multinomial distribution for number of failures - 4 trains model { n[1:4] ~ dmulti(alpha[1:4], N) #Dirichlet prior for alpha alpha[1:4] ~ ddirch(theta[1:4]) #The following trick allows a prior to be specified for the Dirichlet param= eters for (j in 1:4) { r[j] <- u[j]/sum(u[1:4]) u[j] ~ dgamma(theta[j], 1) theta[j] ~ dgamma(1, 0.001)} # Diffuse prior for Dirichlet parameters #Uniform prior on theta[] #theta[1] <- 1 #theta[2] <- 1 #theta[3] <- 1 #theta[4] <- 1 N <- sum(n[1:4]) } #Sequence of conditional binomial distributions - allows non-integer values of n[] and N model=09{ n[1] ~ dbin(alpha[1], N) n[2] ~ dbin(p[2], N2) N2 <- N - n[1] p[2] <- alpha[2]/(alpha[2] + alpha[3] +alpha[4]) n[3] ~ dbin(p[3], N3) N3 <- N - n[1] - n[2] p[3] <- alpha[3]/(alpha[3] + alpha[4]) alpha[1:4] ~ ddirch(theta[1:4]) #The following trick allows a prior to be specified for the Dirichlet param= eters for (j in 1:4)=09{ r[j] <- u[j]/sum(u[1:4]) u[j] ~ dgamma(theta[j], 1) theta[j] ~ dgamma(1, 0.001)} # Diffuse prior for Dirichlet parameters #theta[1] <- 1 #theta[2] <- 1 #theta[3] <- 1 #theta[4] <- 1 N <- sum(n[1:4]) } #Data for 4 trains data list(n=3Dc(260,5,1.5, 1.3)) D. WBDev (new since 2002, see the WinBUGS site) will allow you to specify the joint distribution of the N's and cell counts; with this model specifying the (log) likelihood should be reasonably straightforward, and the priors can be implemented in standard WinBUGS. It'll take some coding though, and is hardly a "slick trick". Failing that, you could try appending an extra cell to p; turning your 4x4 table into a multinomial of length 16, you'd get a 17th cell. Then use a really big, fixed N for the 17-cell model, and specify cell probabilities for the first 16 entries conditional on the observed value of the count in the 17th. This is really only a lash-up solution for something better, it may not work well (if at all!) E. Assuming you have separate tables, and you want to generate future tables... 1. I would use the Poisson notation, if in fact you have only counts n [k,i] where (i designates a row, column of table k, e.g. i=13,k=2 might be row 4 column 1 in table 2. ) One can choose to assume the counts conditional on the rates, i.i.d Poisson random variables. 2. The prior for the rates would be a gamma prior with shape a[k,i], and rate b[k]. 3. The use of the Dirichlet prior did not (in previous versions) let one to sample the hyper parameters (say the alphas in the distribution) where as the gamma parameters can be sampled. 4. If n[k,i] has a Poisson distribution with rate lambda[k,i] then conditional on sum(n[k,]) n[k,i] has multinomial distribution with p[k,i] =lambda[k,i]/sum(lambda[k,]). If lambda[k,i] has a gamma distribution with parameters a[k,i] and b[k] then p[k,i] has a Dirichlet distribution with parameters alpha[k,i]=a[k,i]/sum(a[k,]). Note that you now have another set of parameters b[k] to provide priors for, but then again you would have gotten these from your distribution for N[k]. 5. If N has a Poisson distribution with rate lambda and n[k] | N is multinomial with rate p[k] then n[k] is Poisson with rate lambda*p[k] (a thinned Poisson process). Thus, If you want to use the multinomial sample when N[k] is not known but you have generated samples of p[k,i], and Poisson rates lambda[k], then use the Poisson samples to get the predictive distribution (samples). lambda1[k,i] <-lambda[k]*p[k,i] ###Sum lambda1[k,]=lambda[k] npredictive[k,i]~dpois(lambda[k,i]) ###Future samples Note here that the distribution of the sum(npredictive[k,]) given p[k,i] and lambda[k] is Poisson with rate lambda[k]. For a given set of p[k,i] and lambda[k] you can generate many sets of the two tables using the following: This method gives you 1000 sets of two tables for each set of parameter samples p[1:2,1:16] and lambda[1:2]: E.G within a sampling loop for(l in 1:1000) { for(k in 1:2) { for(i in 1:16) { npredictive[k,i,l]<-dpois(lambda[k,i])) } } } I tested the triple indexing scheme, but you must have the k index first, so that say p[k,,] is not fragmented.   F. Doing it directly in R should be possible -- I don't think that  iterative solutions are needed -- you should be able to write  down the posterior distributions directly if you set it up  right. Then you can use simulation to check the fit of the  models: equal Poisson rates, equal proportions, independence of  row and columns. ------------------------------------------------------------------- This list is for discussion of modelling issues and the BUGS software. For help with crashes and error messages, first mail [log in to unmask] To mail the BUGS list, mail to [log in to unmask] Before mailing, please check the archive at www.jiscmail.ac.uk/lists/bugs.html Please do not mail attachments to the list. To leave the BUGS list, send LEAVE BUGS to [log in to unmask] If this fails, mail [log in to unmask], NOT the whole list ```