"I admit that the case where one defines the region of interest using
the same contrast one is using for the inference is so perverse I did
not think of it."
There's nothing perverse about it; such reasoning is standard in
mathematics. It shows that, generically, such procedures lead to
artificially small corrected p-values. Clearly, based on continuity
considerations, if you perturb away from this case, you'll still have
two contrasts which are not the same, for which the invalidity of the
corrected p-values still holds.
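A minimal simulation sketch of the self-masking case, under stated assumptions: voxels are independent standard-normal z-statistics with no true effect anywhere, the mask keeps voxels passing an uncorrected threshold, and the correction inside the mask is plain Bonferroni. The function name and all parameters are hypothetical, chosen only for illustration. This is not SPM's random-field procedure.

```python
import random
from statistics import NormalDist


def familywise_error_rate(n_sims=1000, n_vox=1000, alpha=0.05,
                          mask_p=0.05, same_contrast=True, seed=0):
    """Estimate how often at least one null voxel survives correction.

    Assumptions (illustrative only): independent N(0, 1) voxel statistics,
    a mask defined by an uncorrected one-sided threshold `mask_p`, and
    Bonferroni correction over the voxels inside the mask.
    """
    rng = random.Random(seed)
    nd = NormalDist()
    z_mask = nd.inv_cdf(1 - mask_p)  # mask-defining threshold
    false_positives = 0
    for _ in range(n_sims):
        # Masking contrast: pure noise, the null holds everywhere.
        z_a = [rng.gauss(0, 1) for _ in range(n_vox)]
        # Tested contrast: either the same statistic or an independent one.
        z_b = z_a if same_contrast else [rng.gauss(0, 1) for _ in range(n_vox)]
        mask = [i for i, z in enumerate(z_a) if z > z_mask]
        if not mask:
            continue
        # Bonferroni threshold computed over the masked voxels only.
        z_crit = nd.inv_cdf(1 - alpha / len(mask))
        if any(z_b[i] > z_crit for i in mask):
            false_positives += 1
    return false_positives / n_sims


if __name__ == "__main__":
    print("same contrast:       ", familywise_error_rate(same_contrast=True))
    print("independent contrast:", familywise_error_rate(same_contrast=False))
```

Under these assumptions the error rate for the self-masked contrast should come out far above the nominal alpha, while the independent contrast should stay near it. Intermediate degrees of dependence between the two contrasts are not modelled here.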
"I would argue that conditionality is the appropriate concept here,
since it requires one to make assumptions explicit."
I'm arguing that generically invoking conditionality, without care (such
as attending to conditions like orthogonality), is an inappropriate
attempt to finesse the problem.
"Assuming a normal distribution of the errors, then it is indeed the
case that the two inferences are independent, since the underlying test
statistics are independent."
If I recall correctly, even this isn't strictly true. Rather, I thought
it shows that orthogonality of contrasts implies the tests are nearly
independent, but not exactly so. But I might be recalling this
incorrectly and don't have time to research the issue right now. (I
would agree that nearly independent is good enough here, however. My
point is that the math behind all this is far from obvious.)
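A small simulation sketch of why "nearly independent" is the right phrase, assuming a balanced one-way layout with three groups, Gaussian errors, and the orthogonal contrasts (1, -1, 0) and (1, 1, -2). The contrast estimates are independent, but both t-statistics divide by the same pooled standard deviation, so with few degrees of freedom their joint exceedance rate sits above the product of the marginal rates. The function name, layout, and cutoff are hypothetical choices for illustration.

```python
import random
from math import sqrt


def joint_exceedance(n_per_group=3, n_sims=20000, cutoff=2.0, seed=0):
    """Return (P(|t1|>c), P(|t2|>c), P(both)) estimated under the null.

    Assumptions (illustrative only): three balanced groups of N(0, 1)
    data, and `cutoff` is an arbitrary threshold, not a tabulated
    critical value.
    """
    rng = random.Random(seed)
    n = n_per_group
    df = 3 * (n - 1)  # residual degrees of freedom
    hits1 = hits2 = both = 0
    for _ in range(n_sims):
        groups = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]
        means = [sum(g) / n for g in groups]
        sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
        s = sqrt(sse / df)  # pooled SD, shared by both t-statistics
        # Contrast (1, -1, 0): variance of the estimate is 2 * sigma^2 / n.
        t1 = (means[0] - means[1]) / (s * sqrt(2 / n))
        # Contrast (1, 1, -2): variance of the estimate is 6 * sigma^2 / n.
        t2 = (means[0] + means[1] - 2 * means[2]) / (s * sqrt(6 / n))
        e1, e2 = abs(t1) > cutoff, abs(t2) > cutoff
        hits1 += e1
        hits2 += e2
        both += e1 and e2
    return hits1 / n_sims, hits2 / n_sims, both / n_sims


if __name__ == "__main__":
    for n in (3, 30):
        p1, p2, joint = joint_exceedance(n_per_group=n)
        print(f"n={n}: joint={joint:.4f}, product of marginals={p1 * p2:.4f}")
```

With many residual degrees of freedom the shared denominator is almost constant, so the dependence should shrink towards zero, which is the sense in which the tests are nearly independent.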
"However, the whole literature on repeated testing and strong control
(such as the relevant parts of Hochberg and Tamhane's book) thrives on
the opportunity offered by this type of conditionality on the dependent
variable (through multi-step testing), which shows that many
statisticians find it ok."
No. That shows that they find it OK when it's done with rigor, which
isn't the case generally. I don't have access to that text, but
presumably the content is nontrivial, precisely because the appropriate
considerations involve calculations which aren't entirely trivial.
"In the functional ROI case, we do the same thing: prune the testing
family on the basis of properties of the dependent variable."
Which can lead to misleading inferences if not done with care.
"Be it as it may, I find the functional ROI definition for pairs of
orthogonal contrasts logically clear and unobjectionable in the
appropriate context."
I agree with that, but I'd wager that if you look at the neuroimaging
literature, not all such pairs of contrasts are orthogonal or even
nearly orthogonal. I'd wager with certainty that many investigators
aren't aware of the issue.
"Stepwise regression (which isn't the same as multi-step tests) has the
purpose of modelling the data, not generating valid p values, so is
misused in the example of Stephen's post."
You misunderstood me. Of course it's true that stepwise regression
isn't inappropriate if used for modeling the data. (Similarly, my
entire critique fails if functionally defined regions are used for
exploring the data and not making claims which are truly inferential.)
The point is that, historically, like many statistical tools, it's been
abused, and people have used it to generate what they thought were valid
p-values. I don't have examples of such misuse, but knowledgeable
statisticians have definitely discussed stepwise regression in that
context.
Best,
S
-----Original Message-----
From: [log in to unmask] [mailto:[log in to unmask]]
Sent: Thursday, October 02, 2008 12:01 PM
To: Fromm, Stephen (NIH/NIMH) [C]
Cc: [log in to unmask]
Subject: Re: Multi-masking for Multiple Comparison Correction
I admit that the case where one defines the region of interest using
the same contrast one is using for the inference is so perverse I did
not think of it. Maybe one can find a way in which this test
makes sense, as Tom does, but it's definitely nothing of practical
value.
I would argue that conditionality is the appropriate concept here,
since it requires one to make assumptions explicit. It makes no sense
to quarrel about the usage of the word 'bias', but I urge Stephen to
consider the idea that specifying the assumptions and the condition
under which an inference holds is considerably more precise and may
well include what he means by this word.
I believe that Doug Burman raises an important point regarding
orthogonality of contrasts. Assuming a normal distribution of the
errors, then it is indeed the case that the two inferences are
independent, since the underlying test statistics are independent. It
is not an unusual case: the individual differences example I made in
my original post is another example of such a pair of orthogonal
contrasts.
The distinction between dependent and independent variables (the point
raised by Stephen in the latest post) is not so clear-cut when we
are dealing with testing families, but this is a rather involved issue.
Voxel-level corrected significance values, the gold standard of
inference in neuroimaging, are defined conditionally on the subset of
voxels where the null hypothesis holds. Thus, they are conditional on
a quality of the dependent variable (what Stephen finds
objectionable). Having said this, the SPM strategy of using random
fields to derive the rejection region always assumes the worst case
that the null holds over the whole volume, so that the conditionality
is no longer present. However, the whole literature on repeated
testing and strong control (such as the relevant parts of Hochberg and
Tamhane's book) thrives on the opportunity offered by this type of
conditionality on the dependent variable (through multi-step testing),
which shows that many statisticians find it ok. In the functional ROI
case, we do the same thing: prune the testing family on the basis of
properties of the dependent variable.
It seems that Stephen needs a test in which the conditionality does not
include the selection of the subjects on which the second test is
carried out. That's fine, it's quite possible that this may be needed
for some specific problem. I suspect that the formal definition of the
inferential differences that he lists in his latest post would be a
challenging task. Be that as it may, I find the functional ROI
definition for pairs of orthogonal contrasts logically clear and
unobjectionable in the appropriate context.
Stepwise regression (which isn't the same as multi-step tests) has the
purpose of modelling the data, not generating valid p values, so is
misused in the example of Stephen's post. (It makes a difference,
though, if you are selecting on the nuisance covariates only). To
obtain valid p values here, you need to include all considered models
in a testing family, and derive the appropriate correction.
Cheers,
Roberto
Quoting "Fromm, Stephen (NIH/NIMH) [C]" <[log in to unmask]>:
> " 'In regions showing greater activation for angry and happy faces than
> for dots, we found angry faces to produce greater activation than happy
> faces'. This would be fine I think."
>
> I don't think this example gets at the full story, because the statement
> could refer to inferences using two independent sets of functional data
> to define regions; using the same set of data; or using a priori
> anatomical regions. Of course, one could claim it's clear from context
> in a given publication, but readers will typically unconsciously extend
> the inference from the case at hand (dependent data) to the "stronger"
> case (independent data). Furthermore, putting aside cases where e.g.
> orthogonality obtains (as raised by Doug Burman), it's not at all
> obvious what the implications of the inference from dependent data are
> for the truly unbiased inference from independent data. And the latter
> is really what we're after.
>
> Note that I'm by no means claiming that your suggestion (or more
> generally the use of functionally defined regions) is at odds with
> standard accepted practice in the neuroimaging community.
>
> Cheers
>
> -----Original Message-----
> From: Tom Johnstone [mailto:[log in to unmask]]
> Sent: Thursday, October 02, 2008 9:25 AM
> To: Fromm, Stephen (NIH/NIMH) [C]
> Cc: [log in to unmask]
> Subject: Re: [SPM] Multi-masking for Multiple Comparison Correction
>
> I'm actually with Roberto on this one. In all these cases, we're using
> inferential statistics. The validity of the *inference* that we make
> based upon the statistics is the important thing in this case. If we
> clearly state that the inference is limited by the a priori
> assumption/condition that we made, then we shouldn't have a problem.
>
> Take the trivial case of a functionally defined mask based on a
> contrast A that is used to mask the same contrast, as mentioned by
> Stephen. Obviously the inference "in this region of the brain, A was
> significant" would be non-valid. But if instead we made the inference
> "in regions of the brain where A was significant we were able to show
> that A was significant" we would be absolutely fine, statistically and
> inferentially speaking, though a reviewer would question our sanity in
> finding it worthwhile to report.
>
> A more realistic example: I perform a study in which I show people
> angry and happy faces and black dots. I define a contrast face-dot and
> find regions of the brain showing this effect. I use those regions as
> a functionally-defined ROI and test the angry-happy contrast. What
> inferences can I make? "In regions showing greater activation for
> angry and happy faces than for dots, we found angry faces to produce
> greater activation than happy faces". This would be fine I think. But
> it would be wrong to drop the first part of that statement.
>
> -Tom
>
> On Thu, Oct 2, 2008 at 1:56 PM, Fromm, Stephen (NIH/NIMH) [C]
> <[log in to unmask]> wrote:
>> Roberto,
>>
>> Sorry if this isn't addressing your points; the original poster's
>> question was a little unclear to me, because I wasn't 100% sure what she
>> meant by "multi-masking."
>>
>> All I mean is that if you use a mask defined by the same functional data
>> that you're applying the mask to, the significance will possibly be
>> inflated.
>>
>> As for definitions, an example which is somewhat conceptually related is
>> stepwise regression. The paper at
>> http://publish.uwo.ca/~harshman/ssc2006a.pdf
>> states, "When model modifications are selected using post-hoc
>> information (e.g., in stepwise regression) standard estimates of
>> p-values become biased." Ultimately, I think my use of "bias" here is
>> correct, based on definitions given at Wikipedia.
>>
>> So, what I'm saying here is that the use of a functionally defined mask
>> can lead to corrected p-values which are likely to be too small. The
>> simplest example is using a contrast to mask itself. The uncorrected
>> p-values are obviously unaffected by this procedure. And the corrected
>> p-values are obviously decreased.
>>
>> My comment "the mathematics dictates that there is no bias": I mean
>> that I assume that there are situations where there's enough
>> independence (e.g., perhaps between the masking contrast and the contrast
>> you're masking) that the bias either doesn't exist or is probably
>> negligible, but I haven't had time to think up rigorous examples.
>>
>> Best regards,
>>
>> S
>>
>> -----Original Message-----
>> From: [log in to unmask] [mailto:[log in to unmask]]
>> Sent: Thursday, October 02, 2008 8:16 AM
>> To: Fromm, Stephen (NIH/NIMH) [C]
>> Cc: [log in to unmask]
>> Subject: Re: Multi-masking for Multiple Comparison Correction
>>
>> Could you be more specific? I can't see what you mean by "the
>> mathematics dictates that there is no bias". It's important to avoid
>> misunderstandings about the terminology: bias is a technical term,
>> defined on the power function of the test, and does not mean just
>> wrong in some way. You should be sure that when you mention bias you
>> do not mean "conditional on the functional data", as I mentioned in my
>> mail.
>>
>> R.V.
>>
>> <snip>
>>> Except in certain circumstances, where you could show that the
>>> mathematics dictates that there's no bias, defining regions based on
>>> the functional data itself can definitely bias results, regardless of
>>> whether the contrast is defined a priori.
>>>
>>> Perhaps one can argue that the bias is slight; and it's certainly
>>> common practice in the neuroimaging community. But, again, procedures
>>> that look to the data can lead to bias.
>>>
>>> Of course, if one uses separately acquired data to create the
>>> contrast-defined ROI, that's a different matter.
>>>
>>>> In some specific instance, using the mask approach follows a clear
>>>> substantive logic. For example, if you are investigating individual
>>>> differences in cognitive capacity, you may be justified in carrying
>>>> out a contrast first, and then look at how individual differences
>>>> modulate the activation say, in prefrontal and parietal areas.
>>>>
>>>> You do have to pay for the increased power (if the procedure is really
>>>> a priori), the price being that you potentially miss an effect in the
>>>> voxels outside the mask.
>>>>
>>>> I do not see any simple way in which the concept of bias relates to
>>>> this specific situation; I'd rather say that these tests are
>>>> conditional on the a priori criterion. If the criterion is not a
>>>> priori, they have wrong significance values (too small), with inflated
>>>> type I errors.
>>>>
>>>> When you use a cluster approach, you also have to specify a priori a
>>>> cluster definition threshold. Your p values are conditional on this
>>>> threshold. If you try several thresholds, your test will have wrong p
>>>> values.
>>>>
>>>> All the best,
>>>> Roberto Viviani
>>>> University of Ulm, Germany
>>>>
>>>> Quoting Amy Clements <[log in to unmask]>:
>>>>
>>>>> Dear Experts,
>>>>>
>>>>> I am pretty far away from having statistical expertise, which is why
>>>>> I am posing my question to the group. Recently, I have seen a
>>>>> multitude of papers that are using a multi-masking approach to deal
>>>>> with corrections for multiple comparisons (using masks from main-effect
>>>>> or other effects-of-interest contrasts). While on the surface
>>>>> this appears to be an optimal approach because you are
>>>>> restricting the number of voxels included in the multiple
>>>>> comparison, it seems like an opportunity for biasing the data and
>>>>> obtained results--especially if you are not masking the data based
>>>>> on a priori hypotheses (e.g., using a previously defined
>>>>> functional ROI mask because you're interested in face processing).
>>>>>
>>>>> I'm not sure that I've articulated this in the best way. It seems,
>>>>> as I mentioned previously, to have the potential to bias results,
>>>>> but I would greatly appreciate feedback. The questions typically
>>>>> asked in the lab that I've worked in have been better suited to
>>>>> a cluster-based approach; however, they could also be served by
>>>>> multi-masking.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>> Amy Stephens
>>>
>>>
>>>
>>
>
>
>
> --
> School of Psychology and CLS
> University of Reading
> 3 Earley Gate, Whiteknights
> Reading RG6 6AL, UK
> Ph. +44 (0)118 378 7530
> [log in to unmask]
> http://www.personal.reading.ac.uk/~sxs07itj/index.html
> http://beclab.org.uk/
>