Dear All,
I would like to thank all of the people who responded to my question
raised on last Thursday. Here is the summary of the discussions that we
have had. Hope it helps to other people who might have the same problem.
1. There is space between "list" and "(", which caused the message
"expected variable name".
2. Beta distribution has the space of (0,1). However, in Crystal Ball, it
fits my data into a generalized beta distribution,where
z=(xmin)/(maxmin)so that x can be out of (0,1). So I have transformed my
new data to be in (0,1) and it works.
3. Normal assumptions for the prior distributions of alpha and beta are
not OK because both alpha and beta should be positive. Somebody points out
that I should have added "I(0,)" after "alpha ~dnorm(5.06,0.001)" and
"beta ~dnorm(18.59,0.001)" and it works . Actually I have no idea what the
prior distributions of alpha and beta are. The reason that I chose "5.06"
for alpha and "18.59" for beta is that I got a nice fitted
beta(alpha=5.06,beta=18.59) from the historical data, which forms the
basis of my prior.
I tested the uniform, normal, gamma prior distributions of alpha and beta
and found the results are somewhat different. Someone told me as long as I
got convergence the results should not be so dependent on the
priors.Instead, the new data should play a more important role if I use
"noninformative" priors. I hope I can get more suggestions on this issue.
4. My final goal is to update the distribution of the variable y with the
new data y and the historical data y. But usually people seem only care
about the updating of parameters. Dom Muston wrote as follows: "
You can't monitor y, but instead a statistic
derived from alpha and beta which can be interpreted as the mean of y
under the distributional assumptions you have made.
You should just add a line somewhere
meany < alpha/(alpha+beta)
and monitor the updating of this statistic" (Thanks, Dom)
Mr.LeBlond gave me a lot of precious suggestions which I really
appreciate. (See below for details.)We are looking forward to other
people's comments.
I think I need some time to digest all these info and would like to thank
again to everyone. This forum is so great.
Best regards,
Yongping
On Mon, July 17, 2006 2:30 pm, David LeBlond wrote:
> YongPing,
>
> One last note that might help.
>
> I was very curious to know if the sample produced for your y[14] was
> indeed that of the posterior predictive distribution. So I made a little
> test in WinBUGS using the simple univariate normal for which the posterior
> predictive distribution is known analytically (agrees with the frequentist
> result).
>
> For a normal sample of n values, the 95% confidence prediction interval
> (or in fact the 95% credible interval) for a single future value is
>
> x.bar +/ K*sd
>
> where K = t(0.975,n1)*sqrt(1+1/n).
>
> So we take a simple made up sample with n=3: y = (1,2,3),
>
> x.bar=2
> sd=1
> K=4.969
> and the interval is 2.969 to +6.969.
>
> Now if we try to reproduce this in WinBUGS using the following code.
> model
> {
> for (i in 1:N) {
> y[i] ~ dnorm(mu, tau)
> }
> mu ~ dnorm(0,0.0001)
> tau ~ dgamma(0.001,0.001)
> sigma< 1/sqrt(tau)
> }
>
> list(y=c(1,2,3,NA), N=4)#Data
>
> list(mu=2, tau=1) # intials (must use "gen inits" to get initial for y[4])
>
>
> we get the following posteriors:
> node mean sd MC error 2.5% median 97.5%
> start sample
> mu 2.004 1.985 0.01203 0.4789 1.995 4.496 1 38000
> sigma 1.776 3.148 0.03215 0.5184 1.203 6.315 1 38000
> tau 1.007 1.018 0.006009 0.02514 0.6904 3.723 1
> 38000
> y[4] 2.015 3.814 0.02379 3.025 1.999 6.923 1 38000
>
> and we note the agreement (within reason) of the credible interval (3.025
> to 6.923) with the theoretical result
>
> So I conclude that this "trick" of using a missing data value in WinBUGS
> does a good job of delivering a sample from the posterior predictive
> distribution.
>
> This is probably well known to many WinBUGS practitioners, but it is nice
> to see it confirmed by example. I can't find this described in any detail
> in the WinBUGS documentation or books that discuss WinBUGS applications,
> but it should be there since predictions are a key objective of many
> Bayesian analyses.
>
> Hope this helps,
>
> Dave LeBlond
> Sr. Statistician, NonClinical Statistics
> email: [log in to unmask]
> voice mail: 8479356031
> Dept R436, Bldg AP9A1
> Abbott Laboratories
> 100 Abbott Park Road
> Abbott Park, IL 600643500
>  Forwarded by David LeBlond/LAKE/PPRD/ABBOTT on 07/17/2006 02:10 PM
> 
>
> David LeBlond/LAKE/PPRD/ABBOTT
> 07/17/2006 10:22 AM
>
> To
> "Zhang, YongPing" <[log in to unmask]>
> cc
>
> Subject
> Re: help needed with basic bayesian updating
>
>
>
>
>
> Hi YongPing,
>
> Sure I would be very curious what input the group may have.
>
> Well I was able to get "something" to work. Below is the final code.
>
> model
> {
> for (i in 1:N) {
> y[i] ~ dbeta(alpha, beta)
> }
> alpha ~ dunif(1,100)
> beta ~ dunif(1,100)
> }
>
> list(y=c(0.001,0.0652,0.9778,0.3259,0.999,0.6519,0.3911,0.3911,0.3911,0.7823,0.9126,0.4237,0.9557,NA),
> N=14)#Data
>
> list(alpha=5, beta=18) # intials
>
> There were a number of changes I made. My attempts were not absolutely
> systematic, but at least it is something to work from. Another piece of
> the puzzle.
>
> First I transformed your data as you suggested y=(yrawmin)/(maxmin).
> Please verify my hand calculations. I was able to analyze the data without
> transforming first, but I could not obtain predicted y values without
> transforming first. It may be possible but I could not get it.
>
> In the data I made some substitutions because I think the beta does not
> actually support y=0.00 or y=1.00 (only in between) so I converted these
> values to 0.001 and 0.999 respectively.
>
> I substituted the uniform distribution as priors for alpha and beta. The
> reason is that the normal allows for beta<0 and alpha<0 and according to
> my text book, the parameters of the beta must be greater than zero (cannot
> be equal to zero).
>
> In order to obtain predictions for y, I included a missing observation
> (observation 14). I believe but am not positive that this is a way to
> obtain a posterior predictive distribution for new y values. It is a trick
> used in WinBUGS to model missing values by treating them basically as
> unknown parameters. I think this is what you are looking for, but it would
> be good to confirm this for yourself and perhaps a question to ask the
> WinBUGS group.
>
> Now since the missing obervation y[14] is actually a parameter it requires
> you supply an initial guess. I tried to do that by including "y[14]=0.5"
> in the initial guess list, but WinBUGS was not happy with that. I am sure
> that there is a way to specify an initial guess but I was not successful.
> So instead I just let WinBUGS supply the missing initial guess by clicking
> "get inits" in the model specification window after loading initial
> quesses for alpha and beta.
>
> I made a few iterations and below are the summary stats for alpha, beta
> and y[14]. Hopefully you can read the following table.
>
> node mean sd MC error 2.5% median 97.5% start
> sample
> alpha 1.213 0.1922 0.001586 1.007 1.168 1.671 1
> 17000
> beta 1.11 0.1033 8.212E4 1.003 1.081 1.38 1
> 17000
> y[14] 0.5188 0.2775 0.002079 0.0379 0.5231 0.9708 1
> 17000
>
> Notice that the alpha and beta are far from your original guesses (5 and
> 18 respectively). Perhaps your original guesses are based on your
> generalized beta.
>
> I am assuming (but you should confirm) that the y[14] is being sampled
> from the posterior predictive distribution for y.
>
> You might also ask the group if anyone knows how to implement the
> generalized beta for the likelihood as this would allow you to get
> predictions directly in your familiar units. Note that by transforming as
> I did, I am essentially treating some parameters of the gneralized beta as
> constant (ie the min and max) whereas in principal these are parameters
> that might have a distribution. I think recognizing this could lead to a
> more satisfying transformation than the one I used (ie substituting 0.001
> for 0.000 and 0.999 for 1.000 which is very unsettling thing to do). Of
> course too many parameters is also not a good thing so have to use
> judgement.
>
> The method I used here (using a missing data value to trick WinBUGS into
> providing a predictive posterior sample) seems pretty limiting. But I
> suppose that if you can get a posterior predictive distribution for the
> data, you could ask WinBUGS to give you a distribution for any function of
> the data too. This would allow you to back transform the y into your raw
> units (say by including a function like yraw<min + (minmax)*y[14] in
> your code) and get the predictive posterior for yraw. Not sure.
>
> Well please let me know how this goes for you. Best of Luck,
>
> Dave LeBlond
> Sr. Statistician, NonClinical Statistics
> email: [log in to unmask]
> voice mail: 8479356031
> Dept R436, Bldg AP9A1
> Abbott Laboratories
> 100 Abbott Park Road
> Abbott Park, IL 600643500
>
>
>
> "Zhang, YongPing" <[log in to unmask]>
> 07/14/2006 05:08 PM
>
> To
> "David LeBlond" <[log in to unmask]>
> cc
>
> Subject
> Re: help needed with basic bayesian updating
>
>
>
>
>
>
> Hi Mr.LeBlond,
>
> Thank you so much for your quick response. I got many replies from other
> people. Most of them pointed out my problem (blank between "list" and "(",
> improper beta distribution,and improper prior). However, no one else
> answered the questions in the application context. Can I submit our
> discussion to the forum so that other people can continue our discussion?
> You have a nice weekend.
>
> Best,
> Yongping
>
>
> On Fri, July 14, 2006 4:45 pm, David LeBlond wrote:
>> YongPing,
>>
>> Well I tried the following but I was not successful:
>>
>> model
>> {
>> for (i in 1:N) {
>> ytrans[i]<(y[i]1.0)/(8.671.0)
>> ytrans[i] ~ dbeta(alpha, beta)
>> }
>> alpha ~ dnorm(5.06,0.001)
>> beta ~ dnorm(18.59,0.001)
>> ytranspred~dbeta(alpha, beta)
>> ypred< 1.0+ytranspred*(8.671.0)
>> }
>>
>>
>>
> list(y=c(1.00,1.50,8.50,3.50,8.67,6.00,4.00,4.00,4.00,7.00,8.00,4.25,8.33),
>> N=13)#Data
>>
>> list(alpha=5, beta=18) # intials
>>
>> The ytrans[] variable is the transformed y[] which should follow the
> beta.
>> I think this is legal in WinBUGS to do as I did, but it may be that the
>> transformation has to be done to the values in the input data list
> itself
>> so that the model does not have to worry about this. Not sure.
>>
>> Then I tried to calculate the predicted value ypred by telling WinBUGS
>> that it would have a beta distribution. My hope was that it would use
> the
>> dbeta() request to produce random values (ie sampling  which is what
> it
>> does to create the likelihood anyway). However it may not be a legal
> thing
>> to do in WinBUGS. I would be surprised if this was illegal since
>> generation of the posterior prediction is an obvious thing to want to
> do.
>> There should be some facility to do this in WinBUGS. If I am wrong and
>> this is illegal and there is no other way to do it directly, you can
> still
>> get this by exporting your posterior for alpha and beta (via the CODA
>> button) and then sampling from the beta for each draw in R or Splus. But
> I
>> hope that there is some solution within WinBUGS itself. If so I would
> like
>> to hear about it.
>>
>> The above program will compile OK. The first sign of some problem comes
>> when reading in the initial guesses. There appears to be some additional
>> variables that WinBUGS needs initial guesses for besides alpha and beta
>> and I do not know what these variables are. Perhaps the ytrans[] or
>> ytranspred vairables need initial guesses. Anyway i just ask WinBUGS to
>> supply these initials whatever they are (generate initials) and this
> seems
>> to satisfy it.
>>
>> However there is an error that occurs when I try to generate samples.
>> Obviously there is some flaw in my logic or the functions are generating
>> some illegal values I am not sure which.
>>
>> I am sorry I cannot give you the final solution. I will take another
> look
>> next Monday and see if I can find some WinBUGS example where predicted
>> values were generated. I think this can be done if missing values are
>> included in the input data set. Then these become random nodes and
>> posteriors are generated. I think anyway. This might be one trick to get
>> it to work so you don't need the silly ypred variable.
>>
>> Have you gotten any other responses from the WINBUGS group? There are
>> likely many others more skilled than I. I know your situation. Likely
> the
>> Math/Stat dept at UIC has not yet gotten into the Bayesian viewpoint.
>> Scientists and engineers find the Bayesian view very obvious, but
>> statisticians are trained to think much differently  at least until
>> recently.
>>
>> Meanwhile happy computing. Don't give up. There has to be a way to make
>> this work!!! What you are doing seems very logical.
>>
>> Dave
>>
>>
>> Dave LeBlond
>> Sr. Statistician, NonClinical Statistics
>> email: [log in to unmask]
>> voice mail: 8479356031
>> Dept R436, Bldg AP9A1
>> Abbott Laboratories
>> 100 Abbott Park Road
>> Abbott Park, IL 600643500
>>
>>
>>
>> "Zhang, YongPing" <[log in to unmask]>
>> 07/14/2006 03:36 PM
>>
>> To
>> "David LeBlond" <[log in to unmask]>
>> cc
>>
>> Subject
>> Re: help needed with basic bayesian updating
>>
>>
>>
>>
>>
>>
>> Dear Mr.LeBlond,
>>
>> Thank you so much for your reply. I really appreciate your help.
>>
>> You are right. I had a blank between list and "(" and that's the
> problem.
>>
>> As for the improper use of beta distribution,I got it from Crystal
> Ball(a
>> risk analysis package). I double checked the definition in Crystal Ball
>> and found it is a generalized beta distribution. After transforming y by
>> (ymin)/(maxmin), all of the values of y will be in (0,1). I don't want
>> to easily give up the beta distribution, since it gives the best fit to
>> the data y. Is it OK to transform the new observations of y to be in
>> (0,1)?
>>
>> Yes, what I really want is the predictive posterior distribution of y.
> You
>> mentioned I can define a new node "ypred". I am not quite clear about
> it.
>> Where shall I put it? Can you please write it down in the model that I
>> listed(assuming that beta distribution is OK for y) ? Shall I add this
>> node together with the parameters in the "summary monitor"?
>>
>>
>> model
>> {
>> for (i in 1:N) {
>> y[i] ~ dbeta(alpha, beta)
>> }
>> alpha ~ dnorm(5.06,0.001)
>> beta ~ dnorm(18.59,0.001)
>> }
>>
>>
>> And I don't know what's the prior distributions of the
>> parameters(i.e.,alpha and beta here for beta distribtuion of y). All I
>> know is the historical data of y can also be fitted to a beta
> distribution
>> with parameters alpha=5,beta=18. So I assume
>> alpha~normal(5,0.001),beta~normal(18,0.001). Somebody also points out
> that
>> alpha and beta have to be positive and the normal assumptions are not
>> proper for them. But I don't what assumptions that I should make. Is
> there
>> much difference if I use different assumptions for alpha and beta? Are
>> there any guidelines of doing so?
>>
>> I am in Civil Engineering (Transportation Planning) program and very few
>> people in my field knows the details of Bayesian statistics. Although I
>> spent a lot of time into reading books and manuals, I am still in the
>> puzzle.
>>
>> Thank you very much.
>>
>> Regards,
>> Yongping
>>
>>
>>
>>
>>
>> On Fri, July 14, 2006 8:28 am, David LeBlond wrote:
>>> YongPing,
>>>
>>> In looking in more detail at your problem, it seems you may not want
>> just
>>> the posterior of alpha and beta, but you may want the predictive
>> posterior
>>> distribution for y. That should also be available from WinBUGS. You
> just
>>> have to define a new node (random variable, say "ypred") that has the
>>> distribution you expect for y (Is it poisson?).
>>>
>>> You still need to redefine the data model for y. It cannot be beta. It
>>> sounds like it should be poisson but that is discrete so I am not sure
>>> what the right distribution for y should be. For some reason it seems
> to
>>> me it should be exponential, but ???
>>>
>>> Dave LeBlond
>>> Sr. Statistician, NonClinical Statistics
>>> email: [log in to unmask]
>>> voice mail: 8479356031
>>> Dept R436, Bldg AP9A1
>>> Abbott Laboratories
>>> 100 Abbott Park Road
>>> Abbott Park, IL 600643500
>>>
>>>
>>>
>>> "Zhang, YongPing" <[log in to unmask]>
>>> Sent by: "(The BUGS software mailing list)" <[log in to unmask]>
>>> 07/14/2006 12:54 AM
>>> Please respond to
>>> "Zhang, YongPing" <[log in to unmask]>
>>>
>>>
>>> To
>>> [log in to unmask]
>>> cc
>>>
>>> Subject
>>> help needed with basic bayesian updating
>>>
>>>
>>>
>>>
>>>
>>>
>>> Dear All,
>>>
>>> I am wondering if WinBUGS is the one that I can use to do the very
> basic
>>> Bayesian updating? I mean something like "if the prior distribution of
> a
>>> parameter is gamma, and new data has poisson distribution, then the
>>> posterior distribution of the parameter is another gamma". However,in
>> all
>>> of the examples of WinBUGS, people are trying to model the relationship
>>> between a parameter theta(which is a parameter in the distribution of a
>>> variable y) and other variables x. In my case, I just want to update
>> the
>>> parameter without complex modelling. In the following model I have
>> passed
>>> the "check model" but when I load the data, it shows "expected variable
>>> name".
>>> Context: here y is a variable called "trip distance per person per
>>> day".Suppose I have old data yi, from which Crystal Ball gives a good
>>> fitting distribution of beta(alpha=5.06,beta=18.59).Based on it,I
> assume
>>> the prior distributions of alpha is normal(5.06,0.001) and
>>> beta~normal(18.59,0.001). Now I have some new data yi(new),which I
>> suppose
>>> comes from a population of another beta distribution (i.e.,the function
>> to
>>> build likelihood). I want to know what the updated distributions of
>> alpha
>>> and beta are. Can I use the means of updated alpha and beta to get the
>>> updated distribution of variable y.
>>>
>>> Your help to me is highly appreciated.
>>>
>>> Regards,
>>> Yongping Zhang
>>>
>>>
>>> model
>>> {
>>> for (i in 1:N) {
>>> y[i] ~ dbeta(alpha, beta)
>>> }
>>> alpha ~ dnorm(5.06,0.001)
>>> beta ~ dnorm(18.59,0.001)
>>> }
>>>
>>> list
>>> (y=c(1.00,1.50,8.50,3.50,8.67,6.00,4.00,4.00,4.00,7.00,8.00,4.25,8.33),
>>> N=13)#Data
>>>
>>> list (alpha=5, beta=18) # intials
>>>
>>> 
>>> This list is for discussion of modelling issues and the BUGS software.
>>> For help with crashes and error messages, first mail
>>> [log in to unmask]
>>> To mail the BUGS list, mail to [log in to unmask]
>>> Before mailing, please check the archive at
>>> www.jiscmail.ac.uk/lists/bugs.html
>>> Please do not mail attachments to the list.
>>> To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
>>> If this fails, mail [log in to unmask], NOT the whole list
>>>
>>>
>>
>>
>>
>
>
>

This list is for discussion of modelling issues and the BUGS software.
For help with crashes and error messages, first mail [log in to unmask]
To mail the BUGS list, mail to [log in to unmask]
Before mailing, please check the archive at www.jiscmail.ac.uk/lists/bugs.html
Please do not mail attachments to the list.
To leave the BUGS list, send LEAVE BUGS to [log in to unmask]
If this fails, mail [log in to unmask], NOT the whole list
