I admit that the case where one defines the region of interest using
the same contrast one is using for the inference is so perverse I did
not think of it. Maybe one can find a way in which this test
makes sense, as Tom does, but definitely it's nothing of practical
I would argue that conditionality is the appropriate concept here,
since it requires one to make assumptions explicit. It makes no sense
to quarrel about the usage of the word 'bias', but I urge Stephen to
consider the idea that specifying the assumptions and the condition
under which an inference holds is considerably more precise and may
well include what he means by this word.
I believe that Doug Burman raises an important point regarding
orthogonality of contrasts. Assuming a normal distribution of the
errors, it is indeed the case that the two inferences are
independent, since the underlying test statistics are independent. It
is not an unusual case: the individual differences example I gave in
my original post is another example of such a pair of orthogonal
contrasts.
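A minimal simulation sketch may make the independence point concrete. It borrows the faces and dots design from Tom's message quoted below and assumes a global null, three conditions with equal numbers of observations, normal errors with known unit variance (so z rather than t statistics), and a one-sided threshold of z > 1.645. The function name `simulate` and all its parameters are mine, chosen only for illustration. Null "voxels" are selected on the face-minus-dots contrast, and then either the orthogonal angry-minus-happy contrast or the selection contrast itself is tested.

```python
import math
import random


def simulate(n_voxels=200_000, n_per_cond=20, z_crit=1.645, seed=0):
    """Return (orthogonal_rate, circular_rate) among selected null voxels."""
    rng = random.Random(seed)
    # Standard deviation of a condition mean under unit error variance.
    sd_mean = 1.0 / math.sqrt(n_per_cond)
    selected = 0
    hits_orth = 0
    hits_circ = 0
    for _ in range(n_voxels):
        # Condition means under the global null (true effects are all zero).
        angry = rng.gauss(0.0, sd_mean)
        happy = rng.gauss(0.0, sd_mean)
        dots = rng.gauss(0.0, sd_mean)
        # Selection contrast, weights (0.5, 0.5, -1): faces minus dots.
        z_face_dot = (0.5 * angry + 0.5 * happy - dots) / (sd_mean * math.sqrt(1.5))
        # Test contrast, weights (1, -1, 0): orthogonal to the selection contrast.
        z_angry_happy = (angry - happy) / (sd_mean * math.sqrt(2.0))
        if z_face_dot > z_crit:
            selected += 1
            hits_orth += z_angry_happy > z_crit
            # Circular case: test the very contrast used for selection.
            hits_circ += z_face_dot > z_crit
    return hits_orth / selected, hits_circ / selected


rate_orth, rate_circ = simulate()
print(f"orthogonal contrast, rejection rate in mask: {rate_orth:.3f}")
print(f"same contrast, rejection rate in mask:       {rate_circ:.3f}")
```

Under these assumptions the first rate should stay near the nominal 0.05, whereas the second is 1 by construction, which is the perverse case of a contrast masking itself.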
The distinction between dependent and independent variables (the point
raised by Stephen in the latest post) is not so clear-cut when we
are dealing with testing families, but this is a rather involved issue.
Voxel-level corrected significance values, the gold standard of
inference in neuroimaging, are defined conditionally on the subset of
voxels where the null hypothesis holds. Thus, they are conditional on
a quality of the dependent variable (what Stephen finds
objectionable). Having said this, the SPM strategy of using random
fields to derive the rejection region always assumes the worst case
that the null holds over the whole volume, so that the conditionality
is no longer present. However, the whole literature on repeated
testing and strong control (such as the relevant parts of Hochberg and
Tamhane's book) thrives on the opportunity offered by this type of
conditionality on the dependent variable (through multi-step testing),
which shows that many statisticians find it ok. In the functional ROI
case, we do the same thing: prune the testing family on the basis of
properties of the dependent variable.
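Holm's step-down procedure is one standard example of a multi-step test in which the testing family is pruned on the basis of the data. The sketch below is only an illustration of that idea; the function name `holm_reject` and the example p values are made up, and nothing here is specific to SPM or random field theory.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down: return a list of booleans, True where H0 is rejected."""
    m = len(p_values)
    # Visit hypotheses from the smallest to the largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        # After `step` rejections only m - step hypotheses remain in the
        # family, so the Bonferroni divisor shrinks accordingly.
        if p_values[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break  # stop at the first non-rejection
    return reject


example_p = [0.001, 0.04, 0.03, 0.015]  # hypothetical p values
holm_result = holm_reject(example_p)
bonferroni_result = [p <= 0.05 / len(example_p) for p in example_p]
print(holm_result)
print(bonferroni_result)
```

In this made-up example the single-step Bonferroni threshold (0.0125) rejects only the first hypothesis, while Holm also rejects the fourth, because after the first rejection the remaining family has three members.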
It seems that Stephen needs a test in which the conditionality does not
include the selection of the subjects on which the second test is
carried out. That's fine, it's quite possible that this may be needed
for some specific problem. I suspect that the formal definition of the
inferential differences that he lists in his latest post would be a
challenging task. Be that as it may, I find the functional ROI
definition for pairs of orthogonal contrasts logically clear and
unobjectionable in the appropriate context.
Stepwise regression (which isn't the same as multi-step tests) has the
purpose of modelling the data, not generating valid p values, so it is
misused in the example of Stephen's post. (It makes a difference,
though, if you are selecting on the nuisance covariates only.) To
obtain valid p values here, you need to include all considered models
in a testing family, and derive the appropriate correction.
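A rough sketch of why the testing family matters, under strong simplifying assumptions: each of `n_models` candidate models contributes one independent standard normal test statistic under the null, the best one is reported, and Bonferroni is used as one possible (conservative) correction over the family. This is not stepwise regression proper, and the names `two_sided_p` and `selection_error_rates` are mine, for illustration only.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def selection_error_rates(n_sims=20_000, n_models=10, alpha=0.05, seed=1):
    """Return (naive_rate, corrected_rate) of false positives under the null."""
    rng = random.Random(seed)
    naive = 0
    corrected = 0
    for _ in range(n_sims):
        # Try every candidate model and keep only the smallest p value.
        best_p = min(two_sided_p(rng.gauss(0.0, 1.0)) for _ in range(n_models))
        # Naive: report the selected p value as if no selection had occurred.
        naive += best_p <= alpha
        # Corrected: Bonferroni over the whole family of considered models.
        corrected += min(1.0, n_models * best_p) <= alpha
    return naive / n_sims, corrected / n_sims


naive_rate, corrected_rate = selection_error_rates()
print(f"naive false positive rate:     {naive_rate:.3f}")
print(f"corrected false positive rate: {corrected_rate:.3f}")
```

With ten independent candidates the naive rate should come out near 1 - 0.95^10, roughly 0.40, while the corrected rate should stay at or below about 0.05.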
Quoting "Fromm, Stephen (NIH/NIMH) [C]" <[log in to unmask]>:
> " 'In regions showing greater activation for angry and happy faces than
> for dots, we found angry faces to produce greater activation than happy
> faces'. This would be fine I think."
> I don't think this example gets at the full story, because the statement
> could refer to inferences using two independent sets of functional data
> to define regions; using the same set of data; or using a priori
> anatomical regions. Of course, one could claim it's clear from context
> in a given publication, but readers will typically unconsciously extend
> the inference from the case at hand (dependent data) to the "stronger"
> case (independent data). Furthermore, putting aside cases where e.g.
> orthogonality obtains (as raised by Doug Burman), it's not at all
> obvious what the implications of the inference from dependent data are
> for the truly unbiased inference from independent data. And the latter
> is really what we're after.
> Note that I'm by no means claiming that your suggestion (or more
> generally the use of functionally defined regions) is at odds with
> standard accepted practice in the neuroimaging community.
> -----Original Message-----
> From: Tom Johnstone [mailto:[log in to unmask]]
> Sent: Thursday, October 02, 2008 9:25 AM
> To: Fromm, Stephen (NIH/NIMH) [C]
> Cc: [log in to unmask]
> Subject: Re: [SPM] Multi-masking for Multiple Comparison Correction
> I'm actually with Roberto on this one. In all these cases, we're using
> inferential statistics. The validity of the *inference* that we make
> based upon the statistics is the important thing in this case. If we
> clearly state that the inference is limited by the apriori
> assumption/condition that we made, then we shouldn't have a problem.
> Take the trivial case of a functionally defined mask based on a
> contrast A that is used to mask the same contrast, as mentioned by
> Stephen. Obviously the inference "in this region of the brain, A was
> significant" would be non-valid. But if instead we made the inference
> "in regions of the brain where A was significant we were able to show
> that A was significant" we would be absolutely fine, statistically and
> inferentially speaking, though a reviewer would question our sanity in
> finding it worthwhile to report.
> A more realistic example: I perform a study in which I show people
> angry and happy faces and black dots. I define a contrast face-dot and
> find regions of the brain showing this effect. I use those regions as
> a functionally-defined ROI and test the angry-happy contrast. What
> inferences can I make? "In regions showing greater activation for
> angry and happy faces than for dots, we found angry faces to produce
> greater activation than happy faces". This would be fine I think. But
> it would be wrong to drop the first part of that statement.
> On Thu, Oct 2, 2008 at 1:56 PM, Fromm, Stephen (NIH/NIMH) [C]
> <[log in to unmask]> wrote:
>> Sorry if this isn't addressing your points; the original poster's
>> question was a little unclear to me, because I wasn't 100% sure what
>> was meant by "multi-masking."
>> All I mean is that if you use a mask defined by the same functional
>> that you're applying the mask to, the significance will possibly be
>> As for definitions, an example which is somewhat conceptually related
>> is stepwise regression. The paper at
>> states, "When model modifications are selected using post-hoc
>> information (e.g., in stepwise regression) standard estimates of
>> p-values become biased." Ultimately, I think my use of "bias" here is
>> correct, based on definitions given at Wikipedia.
>> So, what I'm saying here is that the use of a functionally defined
>> mask can lead to corrected p-values which are likely to be too small. The
>> simplest example is using a contrast to mask itself. The uncorrected
>> p-values are obviously unaffected by this procedure. And the
>> corrected p-values are obviously decreased.
>> My comment "the mathematics dictates that there is no bias": I mean
>> that I assume that there are situations where there's enough
>> independence (e.g, perhaps between the masking contrast and the
>> contrast you're masking) that the bias either doesn't exist or is probably
>> negligible, but I haven't had time to think up rigorous examples.
>> Best regards,
>> -----Original Message-----
>> From: [log in to unmask] [mailto:[log in to unmask]]
>> Sent: Thursday, October 02, 2008 8:16 AM
>> To: Fromm, Stephen (NIH/NIMH) [C]
>> Cc: [log in to unmask]
>> Subject: Re: Multi-masking for Multiple Comparison Correction
>> Could you be more specific? I can't see what you mean by "the
>> mathematics dictates that there is no bias". It's important to avoid
>> misunderstandings about the terminology: bias is a technical term,
>> defined on the power function of the test, and does not mean just
>> wrong in some way. You should be sure that when you mention bias you
>> do not mean "conditional on the functional data", as I mentioned in my
>>> Except in certain circumstances, where you could show that the
>>> mathematics dictates that there's no bias, defining regions based on the
>> functional data
>>> itself can definitely bias results, regardless of whether the
>>> contrast is defined
>>> a priori.
>>> Perhaps one can argue that the bias is slight; and it's certainly
>>> practice in the neuroimaging community. But, again, procedures that
>> look to
>>> the data can lead to bias.
>>> Of course, if one uses separately acquired data to create the
>>> functionally defined ROI, that's a different matter.
>>>> In some specific instance, using the mask approach follows a clear
>>>> substantive logic. For example, if you are investigating individual
>>>> differences in cognitive capacity, you may be justified in carrying
>>>> out a contrast first, and then look at how individual differences
>>>> modulate the activation say, in prefrontal and parietal areas.
>>>> You do have to pay for the increased power (if the procedure is
>>>> a priori), the price being that you potentially miss an effect in
>>>> voxels outside the mask.
>>>> I do not see any simple way in which the concept of bias relates to
>>>> this specific situation; I'd rather say that these tests are
>>>> conditional on the a priori criterion. If the criterion is not a
>>>> priori, they have wrong significance values (too small), with
>>>> type I errors.
>>>> When you use a cluster approach, you also have to specify a priori a
>>>> cluster definition threshold. Your p values are conditional on this
>>>> threshold. If you try several thresholds, your test will have wrong
>>>> All the best,
>>>> Roberto Viviani
>>>> University of Ulm, Germany
>>>> Quoting Amy Clements <[log in to unmask]>:
>>>>> Dear Experts,
>>>>> I am pretty far away from having statistical expertise, which is
>>>>> why I am posing my question to the group. Recently, I have seen a
>>>>> multitude of papers that are using a multi-masking approach to deal
>>>>> with corrections for multiple comparisons (using main effect or
>>>>> other effects of interest contrasts masks). While on the surface
>>>>> this appears to seem like an optimal approach because you are
>>>>> restricting the number of voxels included in the multiple
>>>>> comparison, it seems like an opportunity for biasing the data and
>>>>> obtained results--especially if you are not masking the data based
>>>>> from a priori hypotheses (e.g., using a previously defined
>>>>> functional ROI mask because you're interested in face processing).
>>>>> I'm not sure that I've articulated this in the best way. It seems,
>>>>> like I mentioned previously, to have the potential to bias results,
>>>>> but would greatly appreciate feedback. The questions typically
>>>>> asked from the lab that I've worked in have been better suited to
>>>>> utilizing a cluster-based approach; however, could also be served
>>>>> Amy Stephens
> School of Psychology and CLS
> University of Reading
> 3 Earley Gate, Whiteknights
> Reading RG6 6AL, UK
> Ph. +44 (0)118 378 7530