Hi Paul,
Indeed, as Mike said, the analysis you are proposing sounds very fishy.
I agree, and I think it's unnecessarily convoluted when you can have
the result that matters directly (and in fact better, without
unnecessary random fluctuations in the p-values) using randomise.
If I understood correctly, you have 50 controls, and from time to time
you'll have a patient, the 51st subject, whom you'll want to compare
with the other 50. You want to assess the difference using p-values, and
for this you need a reference distribution, which you plan to construct
using resampling. For this you pick 25 controls, plus another control to
be the "test case" (the 26th subject, also taken from the pool of 50),
compute the difference between the 26th and the average of the other 25,
convert it to a z-score, and repeat this many, many times. If you have
enough computational resources to do it all exhaustively, there are
50!/25!/25! ways to select the control group, times 25 for the 26th
case, times 26 for the number of possible shufflings. So far, your
distribution can have 50!/25!/25!*25*26 values, which, by the way, is an
enormous number. Looks good, right? Let's call this big number X and
continue.
Once you have the distribution, you won't be able to compare the
original 51st case with the other 50, because that would come from a
different distribution. Instead, you will have to randomly pick 25 out
of the 50, and compare the patient with these 25. But which do you pick?
As before, there are 50!/25!/25! ways to select the control group. Once
selected, however, you add the true test case, and for each combination
you'll have only 26 permutations, so the number of possible statistics
you can compute with the test case is 50!/25!/25!*26, which is also an
enormous number. Let's call this Y. But now things don't look good
anymore, because the distribution that you are going to use contains
only X/Y = 25 times more values than there are possible statistics.
Where do these 25 come from? They are the 25 possible choices for the
26th subject that was used in the original resampling. This is all you
have, despite the immense numbers that appeared to be available.
Among the gazillions of permutations in X and Y, there are only very few
non-repeated values of the statistic: for X, only 25*26, and for Y, only
26. All the others are redundant and don't need to be computed at all.
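The counting above can be checked with a few lines of Python (a quick
sketch; math.comb computes the binomial coefficient 50!/25!/25!):

```python
from math import comb

# Ways to choose the 25-subject control group out of 50:
n_groups = comb(50, 25)   # = 50!/25!/25!

# X: pick the control group, then one of the remaining 25 controls as
# the pseudo test case, then one of the 26 shufflings of the labels:
X = n_groups * 25 * 26

# Y: pick the control group, add the true test case, shuffle (26 ways):
Y = n_groups * 26

print(f"X = {X:e}, Y = {Y:e}, X/Y = {X // Y}")
```

Both X and Y come out in the order of 10^16, yet their ratio is just 25.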
There is another way to see how this is problematic. Suppose now that
instead of a pool of 50 controls (or 25) you had just 3, called "a", "b"
and "c". Then the statistics for the possible resamplings would be (I'll
omit the conversion to z-score for simplicity):
a-(b+c)/2
b-(a+c)/2
c-(a+b)/2
The mean of this distribution is zero, i.e.,
((a-(b+c)/2)+(b-(a+c)/2)+(c-(a+b)/2))/3 = 0. When you come with a 4th
test case (the true patient), called "d", you'll compare it with either
(a+b)/2, (b+c)/2 or (a+c)/2, but not with (a+b+c)/3, because the latter
would come from a different distribution whose shape you don't know.
There is no principled way to decide which of (a+b)/2, (b+c)/2 or
(a+c)/2 should be taken, so you'll then have to compute the average
difference; but the average ((d-(a+b)/2)+(d-(b+c)/2)+(d-(a+c)/2))/3
works out to d-(a+b+c)/3, i.e., exactly the comparison with the overall
mean that you were trying to avoid, so all this process of resampling
was in vain, and you ended up with the same value.
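With arbitrary numbers this is easy to verify (a minimal sketch; the
z-score conversion is omitted as above):

```python
a, b, c, d = 1.2, -0.7, 0.4, 2.5   # arbitrary values

# The three resampled statistics:
stats = [a - (b + c) / 2, b - (a + c) / 2, c - (a + b) / 2]
assert abs(sum(stats) / 3) < 1e-12   # the mean of the null is zero

# Averaging the three possible comparisons for the test case d gives
# back the comparison with the overall mean (a+b+c)/3:
avg = ((d - (a + b) / 2) + (d - (b + c) / 2) + (d - (a + c) / 2)) / 3
assert abs(avg - (d - (a + b + c) / 3)) < 1e-12
```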
However, if you shuffle the subjects a, b, c and d, then the values
become meaningful and can be used. But you don't need any coding for
this: it's all in randomise, without the convolutions, ready to use.
With this said, let's look at the questions, please see below:
Am 27.01.14 22:58, schrieb Paul Chou:
> Dear all FSL experts
>
> I have a statistical question about how to calculate corrected p value (voxel-wised and cluster-wised corrected p value) using FSL command line tool.
If you have a single image and want its cluster sizes, use the command
cluster. To get a p-value, you'll need the full distribution, so you'll
need many images produced, e.g., by permutation. You can write a shell
script that runs cluster, captures (e.g., with awk) the size of the
largest cluster, and stores it to construct the null distribution in its
fullness. Or compare the size of each cluster in the original
(non-permuted) image with the largest cluster size for each permutation,
and increment a counter whenever the largest permuted cluster is larger
than or equal to the non-permuted one. At the end, divide the counter by
the number of permutations, and that will be the FWE-corrected
cluster-level p-value.
I'd advise against this, though, because the command cluster wasn't made
for this purpose. At each iteration of the for-loop, cluster will
uncompress the image, allocate memory, load it, etc. Then, to increment
the counter, you'll need fslmaths, which will also uncompress, load,
etc., and this will all be rather slow. Instead, if you are going to
implement things yourself, go with a full-fledged programming language.
In Matlab there is a command that does labelling, called bwconncomp,
which can be your starting point.
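Just to illustrate the looping logic described above, here is a sketch
in Python that needs no FSL at all: simulated 2D noise maps stand in for
the permuted images, a small breadth-first search plays the role of the
labelling done by cluster or bwconncomp, and the threshold and map sizes
are arbitrary choices for the sketch:

```python
from collections import deque
from random import gauss, seed

def max_cluster_size(img, thr):
    """Size of the largest suprathreshold cluster in a 2D map,
    using 4-connectivity (a stand-in for cluster/bwconncomp)."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if img[r][c] > thr and not seen[r][c]:
                size, queue = 0, deque([(r, c)])
                seen[r][c] = True
                while queue:
                    i, j = queue.popleft()
                    size += 1
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols and
                                img[ni][nj] > thr and not seen[ni][nj]):
                            seen[ni][nj] = True
                            queue.append((ni, nj))
                best = max(best, size)
    return best

seed(0)
thr = 1.64                       # cluster-forming threshold
obs = [[gauss(0, 1) for _ in range(20)] for _ in range(20)]
obs_max = max_cluster_size(obs, thr)

# Null distribution of the maximum cluster size over "permutations"
# (here just fresh noise maps, for illustration only):
n_perm, count = 500, 0
for _ in range(n_perm):
    perm = [[gauss(0, 1) for _ in range(20)] for _ in range(20)]
    if max_cluster_size(perm, thr) >= obs_max:
        count += 1
p_fwe = count / n_perm           # FWE-corrected cluster-level p-value
```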
> The following thing is what I want to do and wish some of you could help me to figure out this problem.
>
> Rationale : I want to calculate the type I error rate of “one vs many statistical test (one patient vs many healthy controls)” of gray matter volume. I use Z-score to achieve this goal.
>
> Step 1: My dataset consist of 50 healthy subjects. I random select 26 subjects from this dataset and perform 100 times resample.
It's not possible to resample a set of 25+1 uniquely 100 times. As the
order of the 25 controls doesn't matter, the number of unique
resamplings is just 26.
> For each resample, the one normal control were treated as patient and the other 25 subjects were served as normal controls. I calculate the Z score for every resample using the following formula [(patient GMV - healthy control mean GMV) / standard deviation of healthy controls GMV].
>
> Step 2: After Z-score calculation, I need to set the significant level (p value) of this test to identify which brain area have significant difference between the patient and the normal controls (something like P < 0.05 = Z > 1.73). The area with significant statistical difference was related to the type I error rate. I think I need to consider the multiple comparison correction issue since the origin of this analysis is voxel-wised whole brain analysis , but I don’t know how to calculate the voxel-wised and cluster-wised FWE corrected P value using the FSL command line tool. After quick searching the FSL mail list, I think “fsl_glm” (which calculate the residual images) , “smoothest” (which calculate the data smoothness) may be needed for my demand, is it correct ? If yes, how to use these tools to calculate the voxel / cluster wised FWE corrected P value?
To do this, all you need is a single command: randomise. It will give
voxel- and cluster-level corrected results. Put all the 50+1 subjects in
a 4D file, make a design matrix, and just run. There are only 51
possible permutations, and these are the only useful ones. All the
others you were considering are no more informative than these 51. The
smallest possible p-value will be 1/51 = 0.0196. It isn't as small as
you would certainly like, but with just 50 controls, that's all you can
get. The other strategy won't give stochastically smaller p-values, and
if it appears to, it's because some coding went wrong.
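For a single voxel, what this permutation scheme amounts to can be
sketched with simulated numbers (the data and the z formula here are
just stand-ins; the point is that only 51 relabelings exist):

```python
from random import gauss, seed
from statistics import mean, stdev

seed(1)
controls = [gauss(0, 1) for _ in range(50)]   # simulated controls
subjects = controls + [3.0]                   # true patient last

def zstat(values, idx):
    """Z of subject idx against the mean/SD of the remaining 50."""
    rest = values[:idx] + values[idx + 1:]
    return (values[idx] - mean(rest)) / stdev(rest)

obs = zstat(subjects, 50)                     # the real comparison

# The only useful permutations: each of the 51 subjects in turn
# takes the role of the patient.
null = [zstat(subjects, i) for i in range(51)]
p = sum(z >= obs for z in null) / 51          # can never go below 1/51
```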
All the best,
Anderson
> Sorry for this long question, I try to describe it more clearly and hope to hear feedbacks from you
>
> Thanks
>
> Best
>
> Paul