
Dear Klaas and Ian,

Although I’m not an expert in DCM, I just want to clarify the rationale of our procedure (in Seghier and Price 2009, Cerebral Cortex), because I feel Ian’s summary is a little bit misleading. First, this work, the first application of the random-effects BMS procedure using the negative free energy as a measure of model evidence, was carried out before the concept of model families emerged in the DCM community (i.e. adding another hierarchical level to the random-effects group BMS). As Klaas mentioned in his email, the BMS procedure is model-space dependent and its output values should be interpreted in a “relative” manner. However, (i) it is very rare for a single model to dominate all the remaining models when the model space is large, so an objective criterion for selecting the best models is not easy to define (e.g. a threshold on the exceedance probability (xp)?); and (ii) the model with the highest xp in a large model space does not necessarily beat every other model. Because of the “relative” nature of BMS, we cannot rule out that higher evidence for a given model is driven by the presence of other irrelevant and implausible models (e.g. if some models are badly defined or empirically implausible, their influence can be considerable). For instance, in another set of 128 models that I tested previously, the best model had an xp of 12% and the second-best model had an xp of 9%; yet when these two models were compared directly, the second model was by far the better one. This is again due to the different model spaces used in BMS (I suppose this is comparable to the influence of data point selection in multivariate analysis).
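
As a rough illustration of this model-space dependence (this is not the SPM code; the Dirichlet parameters below are invented purely for illustration), a few lines of Python show how the exceedance probability of the very same model shrinks once a few similarly supported models enter the comparison:

import numpy as np

def exceedance_prob(alpha, n_samples=100000, seed=0):
    # Monte-Carlo estimate of xp_k = P(r_k > r_j for all j != k),
    # with model frequencies r ~ Dirichlet(alpha), as in Stephan et al. (2009)
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(alpha, size=n_samples)
    winners = np.argmax(r, axis=1)
    return np.bincount(winners, minlength=len(alpha)) / n_samples

# two plausible models compared on their own (hypothetical posterior counts)
print(exceedance_prob([12.0, 8.0]))

# the same two models plus a few similarly supported alternatives:
# the xp of the first model drops although its own evidence is unchanged
print(exceedance_prob([12.0, 8.0, 11.0, 10.0, 9.0]))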
There are other “problems”, and this is why we preferred, in our paper, to perform the BMS in two steps (sketched in code below):
1- At the family level: compare each model across the different families (called configurations in our paper). This answers the question: given a particular model, which family/configuration is best (e.g. additional modulations or different driving regions)? For each model, we defined the best family/configuration as the one with xp > 90%. All implausible configurations were thus discarded for the next step.
2- At the model level: we selected our six models in their winning families and compared them with BMS. We then performed pairwise BMS comparisons of each model versus the others as a complementary analysis, to ensure that the best models still had higher evidence when compared within a limited model space (i.e. without the irrelevant models/families from step 1).
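
Schematically, and only as a rough sketch (the numbers are invented, and the fixed-effects softmax below is a crude stand-in for the full random-effects BMS we actually used), the bookkeeping of the two steps looks like this:

import numpy as np

def posterior_probs(log_ev):
    # fixed-effects posterior model probabilities from group log evidences
    # (flat prior); a simple stand-in for the random-effects xp
    z = np.asarray(log_ev, dtype=float)
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()

# hypothetical group log evidences: rows = candidate models,
# columns = families/configurations of each model
log_evidence = np.array([
    [-3050.0, -3041.0, -3062.0],
    [-2980.0, -2975.0, -2990.0],
    [-3110.0, -3098.0, -3120.0],
])

# Step 1 (family level): for each model, keep only its winning configuration,
# and only if that configuration wins decisively (analogue of xp > 90%)
selected = []
for i, row in enumerate(log_evidence):
    p = posterior_probs(row)
    best = int(np.argmax(p))
    if p[best] > 0.90:
        selected.append((i, best))

# Step 2 (model level): compare the surviving models within this reduced space
survivor_ev = np.array([log_evidence[i, f] for i, f in selected])
print(selected)
print(posterior_probs(survivor_ev))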

Two conceptual points are critical in this two-step BMS:
1- Is it valid, in the second step, to limit the model space to the most plausible models? We can ask the question differently: what is the point of keeping models that have very weak evidence (xp < 1%)?
2- Is this implication true: if a model has by far the highest evidence (e.g. xp > 90%) when compared to each of the other models in pairwise BMS comparisons, will the same model also have high evidence when BMS is applied to the whole model space (i.e. all models at once)?

I will be happy to have your opinion on these two points…


Having said that, and because this kind of question is becoming more common in DCM, I believe it is more appropriate to use the new DCM family comparison procedure "spm_compare_families" (as Klaas mentioned in his previous emails), which will hopefully be available soon in SPM8.

I hope this helps,

Mohamed





Klaas Enno Stephan wrote:
Dear Ian,

I have not yet read the paper you mention and thus cannot comment on the particular approach chosen.  Generally, though, I would recommend applying the random effects BMS method to all models considered in one go, rather than performing selective comparisons between model subsets that do not, as a union, cover the entire model space considered.  As the posterior model probabilities (and exceedance probabilities) are conditional on the model space chosen, such selective comparisons can yield contradictory results.

Your main question is how you can quantify the superiority of a model chosen by the random effects BMS procedure in relation to all other models considered.  In the fixed effects setting, this is simple because one can compute pairwise group Bayes factors.  In the random effects setting, the numerical values of individual posterior model probabilities (as well as exceedance probabilities) decrease with the number of models considered (because both have to sum to one).  When dealing with large numbers of models, the winning model may well have a "low" posterior model probability of 0.1 or less.  Some users feel uncomfortable with such numbers, which they subjectively perceive to be "too low" to be convincing.  Importantly, however, it is not the absolute probability that matters but the relative one (compared to all other models considered).  For example, a winning model may only have a posterior model probability of 0.1, but this may still represent an impressive superiority if the next best model has a posterior model probability that is 20 times smaller.  In analogy to Bayes factors, one could thus compute ratios of posterior model probabilities to quantify how much better a particular model is at the group level, compared to all others.  (You could do exactly the same with exceedance probabilities.  I suspect, however, that most readers will find posterior model probabilities easier to understand.)
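
As a toy illustration of such a ratio (the probabilities below are invented, and this is not SPM code), one could compute:

import numpy as np

# hypothetical posterior model probabilities from a random-effects BMS
# over a large model space (they must sum to one)
post_prob = np.array([0.10, 0.02] + [0.88 / 62] * 62)

order = np.argsort(post_prob)[::-1]
best, runner_up = order[0], order[1]
ratio = post_prob[best] / post_prob[runner_up]
print("winning model %d: p = %.3f, %.1f times the next best (model %d)"
      % (best, post_prob[best], ratio, runner_up))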

Does this help?

Best wishes,
Klaas




From: Ian Ballard <[log in to unmask]>
To: Klaas Enno Stephan <[log in to unmask]>
Sent: Tuesday, 21 July 2009, 19:57:53
Subject: Re: [SPM] DCM BMS results interpretation

Dr. Stephan,
I have a question about the new random effects BMS method.  How does one decide whether the first model is significantly better than the second best?  I compared 64 models, and since each 'used up' some of the exceedance probability, and because 4 of the models were quite similar and all had relatively high exceedance probabilities, I have an e.p. of only .1 for the best model.  I reran BMS with the first versus second best model, and got an e.p. of .85.  In "Reading Aloud Boosts Connectivity through the Putamen" by Seghier and Price (the only paper I've found that uses the new method), they presented a pairwise comparison of the 6 best models.  Is this correct?  As I understand it, it is not valid to compare 2 models with the new method and make inferences about the entire model space.  I also thought it may be valid to do model space partitioning based on the modulatory effect that varies between my best and second best models, and use that to infer which model is superior.  Any help would be greatly appreciated.
Thank you,
Ian





On Jul 10, 2009, at 11:39 AM, Klaas Enno Stephan wrote:

Dear Matthias,

Model space partitioning is an attractive approach because it allows you to selectively examine the importance of a particular model component by "integrating out" any other aspect of model structure.  One limitation of the present random effects BMS method is, however, that one cannot compare model families (subsets) that contain different numbers of models.  This means that you cannot apply model space partitioning to your particular question, as the families you wish to compare are of unequal size (16 versus 48 models).  However, Will Penny has recently developed an extension to our random effects BMS approach which does allow for such comparisons.  Once it is fully tested and validated, this extension will be available in one of the future updates to SPM.
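
Conceptually (this is only a sketch, not the implementation mentioned above, and the numbers are invented), family-level inference in the random effects setting amounts to summing the sampled model frequencies within each family before asking which family is the more frequent:

import numpy as np

def family_exceedance(alpha, families, n_samples=100000, seed=0):
    # P(family k has the highest summed frequency), with r ~ Dirichlet(alpha)
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(alpha, size=n_samples)
    fam_r = np.stack([r[:, idx].sum(axis=1) for idx in families], axis=1)
    winners = np.argmax(fam_r, axis=1)
    return np.bincount(winners, minlength=len(families)) / n_samples

alpha = np.full(32, 1.5)          # hypothetical Dirichlet posterior, 32 models
alpha[:16] += 1.0                 # the first 16 models fit somewhat better
families = [list(range(0, 16)), list(range(16, 32))]   # equal-sized partition
print(family_exceedance(alpha, families))

# with unequal family sizes, a flat prior over models gives the larger family
# more prior mass, which is one reason the basic procedure needs equal-sized
# families (the extension mentioned above adjusts the prior to correct this)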

Best wishes,
Klaas




From: Matthias Schurz <[log in to unmask]>
To: [log in to unmask]
Sent: Friday, 10 July 2009, 08:55:56
Subject: [SPM] DCM BMS results interpretation

Dear DCM experts,

I have some interesting DCM BMS results from the new RFX method (SPM8) - see
the attached PDF file.
I would be happy about any suggestions/comments on how best to interpret them!

It looks like a number of models (16 models) explain the data much
better than the rest (48 models).
Interestingly, these 16 best models share two particular model attributes.
Do you think it is appropriate to do a model space partitioning as
described in Stephan et al. (2009)?
Or would you rather report the model with the highest exceedance probability?

Thank you very much,
Matthias