Hi, Roberto.
I find your arguments interesting and would like to explore them
further.
As I understand it, FDR correction is based upon the p-values among all
voxels tested; those falling below a line with a user-defined slope have met
the statistical threshold. (This is a simplified explanation of the
Genovese et al., 2002 description.) As you seem to suggest, false positives
should be spatially distributed, and not localized to a specific region.
Wouldn't an FDR correction in combination with a spatial extent threshold
(e.g., 10-15 voxels) virtually eliminate false positives? Inspection of
many activation maps suggests this is indeed the case, and it certainly
eliminated many "stray" voxels in my analysis. I have seen Monte Carlo
simulations that support this idea after evaluating the interaction of
cluster size and t-values -- but reviewers do not consider this relevant
unless the simulations are applied to the dataset under review.
This addition of a spatial extent threshold is similar to your idea of
looking at cluster-level statistics -- although a cluster-level threshold
is often harder to justify on theoretical grounds.
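For concreteness, the thresholding rule described above (the Benjamini-Hochberg step-up procedure underlying the Genovese et al., 2002 approach) can be sketched in a few lines of Python. The function name and toy p-values below are illustrative only, not taken from SPM:

```python
import numpy as np

def fdr_threshold(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure: sort the p-values, find the
    largest one lying at or below the line with slope q/m, and reject all
    hypotheses with p-values at or below that cutoff."""
    p = np.sort(np.asarray(pvals, dtype=float))
    m = p.size
    line = q * np.arange(1, m + 1) / m      # the line with user-defined slope
    below = np.nonzero(p <= line)[0]
    if below.size == 0:
        return 0.0                          # nothing survives correction
    return p[below[-1]]                     # data-dependent p threshold

# toy example: a few strongly "active" voxels among many null voxels
rng = np.random.default_rng(0)
pvals = np.concatenate([rng.uniform(0, 0.001, 5), rng.uniform(0, 1, 95)])
thresh = fdr_threshold(pvals, q=0.05)
n_rejected = int(np.sum(pvals <= thresh))
```

Note that the cutoff adapts to the data: the stronger the overall signal, the more lenient the resulting voxelwise threshold.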
More of a reach -- would you consider / allow using FDR correction to
identify likely areas of activation that can be used as the center of a
region of interest? (Not for looking for significant activation of voxels
within the ROI, but for determining whether the average activation of the
ROI is significant. If a couple of voxels are false positives, the mean
activity across a larger area that includes these voxels is unlikely to
show significant activation.)
In any case, FDR is certainly more sensitive for showing the extent (as well
as location) of activation than other methods of multiple-comparisons
correction, and I can't see throwing out the baby with the bathwater just
because a few voxels out of many might be false positives.
_______________
Doug Burman, Ph.D.
----- Original Message -----
From: "Roberto Viviani" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Wednesday, October 10, 2007 8:23 AM
Subject: Re: [SPM] FDR correction
Hello Doug,
I have two arguments against FDR corrections. The first is that there
is no guarantee of where the 'false discoveries' will occur. Thus,
the presence of a large activation in one place presumably allows the
emergence of false significance elsewhere. Presumably, voxels with a
larger effect are those where the discovery is least likely to be
false. But then, precisely those voxels where the effect is less
large, and are brought out by the FDR inference, are those were the
inferential status becomes less clear. This is the effect: large
activation in the occipital cortex, unclear result for small
activation elsewhere.
The second problem I have is that the p values typically become the
same over large ranges of voxels. I have a problem here, because I
rely on p values as an index of the confidence of the inference. This
information appears in part to be lost in many, if not most, FDR analyses.
There is little that can be done about the second problem other than
reporting the Z/t values, but one can certainly discuss the distribution
of the activation with regard to the first problem in an attempt to
buttress one's conclusions.
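The loss of graded p-values is easy to demonstrate. Below is a minimal numpy sketch (function name and toy values are my own, for illustration) of BH-adjusted p-values: because of the running-minimum step, distinct raw p-values can collapse to identical adjusted values, which is exactly the tying over ranges of voxels described above:

```python
import numpy as np

def bh_adjusted(pvals):
    """BH-adjusted p-values: p_(k) * m / k for the k-th smallest p-value,
    then a running minimum from the largest p-value downward to enforce
    monotonicity. The running minimum is what produces ties."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]  # running minimum
    adj = np.empty(m)
    adj[order] = ranked                                 # undo the sort
    return adj

# distinct raw p-values map to identical adjusted values
raw = np.array([0.010, 0.012, 0.014, 0.020, 0.300])
adj = bh_adjusted(raw)
```

Here the three smallest raw p-values, though different, all receive the same adjusted value, so the ordering of confidence among them is no longer visible in the corrected map.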
Having said this, I also think that there are strong arguments against
enforcing voxel-level FWE rates in neuroimaging statistical inference,
and specifically those arising from RF theory. This inference allows
me to say for each individual voxel that it is significantly active at
the nominal p FW level irrespective of the status of any other voxel.
Here, one contemplates the counterfactual situation in which any voxel
were active in isolation. Is this level of strength what we need in
neuroimaging? I'd say it is not. Typically, we are not in the business
of inferring anything about any individual voxel. Rather, what we
usually need is something like cluster-level inference, where each
individual cluster is significantly active, but we care little about
the individual voxels within the cluster.
In summary: cluster level inference is weaker than voxel level
inference, but stronger than FDR in some relevant sense; requiring
voxel level inference at all costs is unreasonable -- at least if
distributional assumptions hold.
It follows that the 'large clusters' of your post should be significant
at the cluster level, if they are really large.
Incidentally, a Z peak value of 5 or more is likely to be significant
with a permutation approach, which gives a type of voxel level
inference, but is less conservative than RF theory.
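The permutation idea can be sketched as follows for a one-sample design using sign flipping; all names and data sizes here are invented for illustration. The 1-alpha quantile of the permutation null distribution of the maximum t statistic gives a voxel-level FWE threshold that adapts to the data, unlike the RF-theory threshold:

```python
import numpy as np

def maxstat_threshold(data, n_perm=1000, alpha=0.05, seed=0):
    """Voxel-level FWE threshold from the permutation distribution of the
    maximum t statistic, using sign flipping for a one-sample test.
    `data` is a subjects-by-voxels array."""
    rng = np.random.default_rng(seed)
    n_subj = data.shape[0]
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n_subj, 1))
        flipped = data * signs
        t = flipped.mean(0) / (flipped.std(0, ddof=1) / np.sqrt(n_subj))
        null_max[i] = t.max()          # max statistic under the null
    return float(np.quantile(null_max, 1.0 - alpha))

# pure-noise example: any observed t above this threshold is significant
# at the familywise level across all voxels
rng = np.random.default_rng(1)
noise = rng.standard_normal((12, 200))
thresh = maxstat_threshold(noise, n_perm=500)
```

Because the null distribution of the maximum is estimated from the data themselves, smoothness in the images lowers the threshold, which is why this approach tends to be less conservative than RF theory.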
I realize this is not likely to be what you would like to hear -- apologies!
All the best
Roberto Viviani
Dept. Psychiatry
Univ. of Ulm, Germany
Quoting Doug Burman <[log in to unmask]>:
> I have an article that has been reviewed where the only major complaint
> was that the reviewer would not accept the results as valid because we
> used a FDR correction (p=0.05) -- even though our cluster sizes were
> fairly large, we also used an extent threshold of 25, and our Z-scores
> were generally greater than 5.0. The editor is backing him up, and
> refuses to publish our findings unless we satisfy him that our "result is
> not a chance finding".
>
> Many of our primary findings would survive a FWE correction if we applied
> a mask. I find it disturbing, however, that a FDR correction is not
> considered an acceptable method for multiple-comparisons correction by
> this reviewer / editor, and some highly-informative brain / behavior
> correlations in our study require this correction. Any suggestions on
> articles and explanations on the validity of the FDR approach?
>
> (I know this has been discussed on the list before, but I suspect the
> listserv discussion will not in itself satisfy the editor.)
>
> Doug Burman
>