IAIN T JOHNSTONE wrote:
> > Just to be a little more concrete, if I see a dozen low-powered
> > studies with similar task contrasts, the thresholded maps make it easy
> > for me to make generalizations such as "motor planning almost always
> > activates the SMA, but never visual cortex."
>
> I'm afraid I disagree strongly with this line of reasoning. Thresholded
> maps don't allow any conclusions to be drawn about areas that were not
> significantly activated, other than they failed to reach significance.
> To conclude that an area is never activated because it doesn't show up
> on thresholded maps is incorrect, and I would say is exactly the problem
> with thresholded maps - they tempt people to affirm the NULL without
> proper justification. Without at least a power analysis, based upon
> reasonable interval predictions (x % signal change, although deciding on
> x for fMRI studies is quite tricky), one is really in no position to
> conclude anything at all about brain regions that show no significant
> activation.

I didn't mean to imply that we could affirm the null hypothesis by
looking at maps.  But there is a big difference between taking a null
finding to affirm the null and considering a null finding informative,
even in the absence of a power analysis.

Let me try to argue this a different way.  If you observe that across
50 studies of the difference between two task conditions, analyzed in
similar ways, some particular area A always exceeds some threshold and
some particular other area B never does, then you can generalize about
differences between activation in these locations based on this
knowledge.  Of course you wouldn't be justified in claiming that there
is no true difference between conditions in area B.  But you would be
justified in predicting that in the next similar dataset, area A would
again be more reliably active than area B.  The degree of
justification for this prediction would depend on how many studies you
observed and how consistent the pattern was.  You will of course miss
many true differences between regions, and it could be argued that
thresholding to control the FWE is a particularly bad choice if this
is how you want to proceed.  But thresholding does restrict the number
of hypotheses you're willing to consider in a useful way (bearing in
mind that the studies in question were not designed to measure
differences between regions).  Of course, best would be if the authors
of some of the studies shared your interests and could have carried
out the relevant comparisons directly.
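One way to make the "degree of justification" point concrete is a beta-binomial sketch: treat each study as a supra-threshold/sub-threshold observation for a region, and compute the posterior predictive probability that the next study exceeds threshold. This is only a toy illustration (the counts, the uniform prior, and the independence assumption are mine, not anything established in the thread):

```python
def predictive_prob(k, n, alpha=1.0, beta=1.0):
    """Posterior predictive probability that the next study is
    supra-threshold, given k supra-threshold results in n studies,
    under a Beta(alpha, beta) prior on the per-study probability.
    Default is a uniform Beta(1, 1) prior (Laplace's rule)."""
    return (k + alpha) / (n + alpha + beta)

# Hypothetical eyeball meta-analysis: area A supra-threshold in all
# 50 studies, area B in none of them.
p_next_A = predictive_prob(50, 50)  # 51/52, about 0.98
p_next_B = predictive_prob(0, 50)   #  1/52, about 0.02
print(p_next_A, p_next_B)
```

The prediction about the next dataset gets stronger as n grows, which is the sense in which the justification "depends on how many studies you observed" — and note that p_next_B being small is a claim about supra-threshold status, not about the true effect in B being zero.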

Note that this doesn't involve any commitment to the null hypothesis
being true in B.  Even so, knowing that B does not exceed threshold in
any of the studies is still informative.  If instead of saying that B
failed to exceed threshold I told you that we didn't collect data in
B, you wouldn't be justified in drawing the same conclusions.  So in
that sense, I consider the data informative.  We still need to
consider reasons why A may have been "active" (i.e., supra-threshold)
more often, including the possibility that our sensitivity in B is
much lower for trivial reasons.
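To illustrate the "trivial reasons" caveat, here is a toy simulation (hypothetical numbers and a deliberately crude model of a study-level z-statistic, not any standard fMRI pipeline): two regions with real effects, but B measured with lower effective sensitivity, so it rarely clears the threshold even though its true effect is not zero.

```python
import random

random.seed(0)

def supra_threshold_count(true_z, noise_sd, n_studies, thresh=3.0):
    """Count how many of n_studies yield a study-level z-statistic
    above thresh, with the statistic drawn as Gaussian noise around
    a true standardized effect (a crude toy model)."""
    return sum(random.gauss(true_z, noise_sd) > thresh
               for _ in range(n_studies))

# Both regions have a genuine effect; B's is simply measured with
# half the effective signal-to-noise (e.g., worse coil sensitivity
# or more susceptibility dropout).
hits_A = supra_threshold_count(true_z=4.0, noise_sd=1.0, n_studies=50)
hits_B = supra_threshold_count(true_z=2.0, noise_sd=1.0, n_studies=50)
print(hits_A, hits_B)
```

A consistently beats B across studies here purely because of sensitivity, which is why the supra-threshold pattern is informative about what the next dataset will show, but not by itself about the underlying physiology.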

As an aside, this kind of eyeball meta-analysis, which we all do to
some extent, doesn't even depend on having a formal definition for the
regions, perfect co-registration, etc.  It's obviously possible to
argue about whether it's best done with thresholded or unthresholded
maps.  I lean a bit towards thresholded, because I'm reluctant to
encourage presenting data that I know is basically all noise,
especially when large smoothing kernels make it look deceptively
structured.

dan