Dear List,
I would like to ask the list for help in finding literature on the
reporting of results in one of my papers. The specific problem is the
use of uncorrected p-value thresholds to define clusters and illustrate results.
The reviewer claims that, having defined clusters in this way, it is
no longer possible to claim significance using FDR correction.
Furthermore, the reviewer claims that it is misleading to include
non-significant results in tables (text below).
My questions are:
- is anyone aware of a paper on the use of uncorrected thresholds to
define clusters?
- is anyone aware of a paper justifying reporting uncorrected peaks in
tables (for example to facilitate meta-analyses)?
- is anyone aware of a paper justifying thresholding images at
uncorrected levels for illustration purposes?
For those who feel like giving advice on responding, the text is below.
Thank you in advance,
Roberto Viviani
University of Ulm, Germany.
REVIEWER'S TEXT
What is implied to the reader by stating that "Correction for multiple
comparisons was obtained through the false discovery rate (FDR)
approach" is that ALL VOXELS within regions (clusters) listed were
above the threshold for multiple-comparisons, not just that there was
at least one (or more) peak voxels within the cluster that exhibited
such an effect size. In other words, we are concerned with the
significance threshold for the blobs, not the peaks. Readers are
rarely interested in the effect size of a particular voxel. Based on
the authors' response, I'm concerned a false impression is being made
(not necessarily by intention, but in interpretation). In sum, the
threshold for statistical significance for outputted results in the
appended SPM tables are rather clearly NOT CORRECTED FOR MULTIPLE
COMPARISONS. I appreciated the authors attaching this output to make
the point crystal clear.
The authors also state on page 6 of their paper "Cluster-level tests
were conducted on clusters defined by the threshold of p = 0.005,
uncorrected". Instead what should have been stated, by the SPM output
given, was that the threshold for statistical significance was p <
.005 UNCORRECTED for multiple comparisons, with a cluster size
threshold of 50 voxels. PERIOD. THERE WAS NO FURTHER CORRECTION FOR
MULTIPLE COMPARISONS WHATSOEVER. But instead, by the language given,
we should expect the cluster sizes to represent voxels all with
p-values < .05 FDR-corrected - this is extremely unlikely.
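
The distinction the reviewer draws between peaks and whole clusters can be shown with a toy calculation. This is only a sketch with invented numbers, not the data from the paper and not SPM's implementation: it assumes a search volume of 1000 voxels, one cluster of 60 voxels defined at p < 0.005 uncorrected, and a plain Benjamini-Hochberg procedure at q = 0.05 (the function name `bh_threshold` and all p-values are hypothetical).

```python
def bh_threshold(pvals, q=0.05):
    """Benjamini-Hochberg: return the largest sorted p-value p_(k)
    satisfying p_(k) <= k * q / m, or None if no p-value qualifies."""
    m = len(pvals)
    threshold = None
    for k, p in enumerate(sorted(pvals), start=1):
        if p <= k * q / m:
            threshold = p
    return threshold


# Invented p-values for one cluster defined at p < 0.005 uncorrected:
# 5 strong peak voxels and 55 voxels just under the defining threshold.
cluster = [1e-6] * 5 + [0.004] * 55

# Invented p-values for the remaining 940 voxels, spread over [0.01, 1].
background = [0.01 + 0.99 * i / 939 for i in range(940)]

all_pvals = cluster + background

# Voxel-wise FDR threshold computed over the whole search volume.
threshold = bh_threshold(all_pvals, q=0.05)

# Cluster voxels that individually survive the FDR threshold.
survivors = [p for p in cluster if p <= threshold]

print(f"FDR threshold (p): {threshold}")
print(f"{len(survivors)} of {len(cluster)} cluster voxels survive FDR")
# prints "5 of 60 cluster voxels survive FDR"
```

In this made-up case the cluster contains peaks that survive voxel-wise FDR, yet its reported extent of 60 voxels reflects only the uncorrected defining threshold. That is the gap between "the peak is FDR-significant" and "all voxels in the blob are FDR-significant" that the reviewer describes.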