Hi Helmut,
Thanks for your reply. Just a couple of follow-up questions:
> to start with, a post-hoc test should be based on voxels/clusters that showed up significantly in the interaction.
Isn't it a case of double dipping to limit the search to the area(s) resulting from the F-test?
> If you didn't find anything for that contrast, then there is no justification for post-hoc tests, because there's nothing to explain. Of course it is possible to conduct pairwise comparisons nonetheless, but this would correspond to planned comparisons (a priori). It would then depend on your hypotheses how to conduct these planned comparisons (did you expect some effects for a particular region as such = ROI analysis, some effect within a particular region = SVC, .... between group A1 and A2, ...).
What would be the best approach in imaging?
- ANOVA followed by whole-brain post-hoc t-tests?
- Or just whole-brain "planned t-tests"?
What corrections need to be applied for each step?
Is there anything wrong with whole-brain "post-hoc" tests?
Let's say I use an initial threshold of p=0.005 uncorrected for my 2x2 ANOVA, to which I apply FWE correction at the cluster level. That would mean I need to use an initial uncorrected threshold of p≈0.00083 (0.005/6) for any subsequent (whole-brain) post-hoc comparison, in addition to correction for multiple comparisons at the cluster level? That seems extremely conservative, doesn't it? At least much stricter than whole-brain "planned t-tests" (which would be estimated at p=0.005 uncorrected + FWE corrected at the cluster level for each comparison), even though they are basically the same tests!!
[This is if we consider estimating separate two-sample t-tests for each comparison, outside the ANOVA model. By the way, is that a correct approach?]
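For concreteness, here is a minimal sketch of the Bonferroni arithmetic above, assuming all pairwise comparisons among the four cells of the 2x2 design are tested (the cell labels are illustrative, not part of any SPM procedure):

```python
# Sketch: Bonferroni-adjusted initial voxel threshold for the pairwise
# post-hoc comparisons in a 2x2 design (assumption: all cell pairs tested).
from math import comb

p_initial = 0.005                # uncorrected voxel threshold used for the F-test
n_cells = 4                      # cells of a 2x2 ANOVA, e.g. A1B1, A1B2, A2B1, A2B2
n_pairwise = comb(n_cells, 2)    # number of pairwise comparisons: C(4, 2) = 6

p_adjusted = p_initial / n_pairwise
print(n_pairwise)                # 6
print(round(p_adjusted, 6))      # 0.000833
```

If only a subset of the pairwise comparisons is planned a priori, `n_pairwise` would shrink accordingly and the adjusted threshold would be less strict.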
Is there a standard procedure for F-tests followed by t-tests that I missed somewhere?
Thanks again for your helpful advice, and sorry if my message seems confusing.
Cheers,
Yann
On 22 Jul 2014, at 9:12 pm, Helmut Nebl <[log in to unmask]> wrote:
> Dear Yann,
>
> to start with, a post-hoc test should be based on voxels/clusters that showed up significantly in the interaction. If you didn't find anything for that contrast, then there is no justification for post-hoc tests, because there's nothing to explain. Of course it is possible to conduct pairwise comparisons nonetheless, but this would correspond to planned comparisons (a priori). It would then depend on your hypotheses how to conduct these planned comparisons (did you expect some effects for a particular region as such = ROI analysis, some effect within a particular region = SVC, .... between group A1 and A2, ...).
>
> Concerning the thresholds and multiple testing, in general people do not correct for the number of conducted tests ("specified contrasts") when it comes to fMRI. For example, it is common to run two one-sided t-tests A > B and B > A with an initial voxel threshold of .001 uncorrected, instead of thresholding these two tests at .001/2 each. One might argue that due to the rather conservative .001 (conservative at least compared to .05 uncorrected) it is not that important to correct for the number of tests, but this is a weak argument IMO. Another frequent statement is that pairwise comparisons are more sensitive than a single F contrast. Well, it doesn't really have anything to do with sensitivity: if you adjust for the number of pairwise comparisons (which one should), then it doesn't matter. If you do not adjust, then of course one can expect more significant findings.
>
> Now, if you want to run post-hoc tests taking into account the number of conducted tests, you would have to adjust the initial voxel threshold, e.g. to .001 / n, because this threshold defines whether a voxel shows up in the interaction or not. The cluster threshold is irrelevant in that context, but as stated, you should run post-hoc tests for significant clusters only.
>
> Finally, you have to think about how to implement these post-hoc tests. Based on the beta estimates of the peak voxels? Based on the cluster-averaged beta estimates? Based on the eigenvariate? In any case you have to make sure that you still rely on the same initial voxel threshold, e.g. .001. Post-hoc tests are often said to reveal a significant difference with p values falling between .001 and .05. In these cases the post-hoc comparison is actually NOT significant, as you should not switch to a more liberal threshold just for post-hoc tests. If you still want to do so, you should explicitly state that you use a more liberal threshold for post-hoc tests. It might be better to go with something like "trend", but this is another critical issue; see http://mchankins.wordpress.com/2013/04/21/still-not-significant-2/ for a humorous perspective.
>
> Best,
>
> Helmut
>
>
>