Paul Macey wrote:
> Now intuitively it seems to me that VBM with one tidy scan is a more
> valid way to go than using multiple noisier scans in an n*k anova model.
Okay, in the usual "I'm not a statistician, but..." manner, in the
following unrealistic example "a" is supposed to represent three
repeat scans (down columns) on four "controls" (across rows), b is the
same for "patients" (who have an extra 1 unit of some signal or
other). We're interested in asking if patients and controls differ.
>> randn('state',0); a = randn(3,4); b = 1+randn(3,4);
>> [h p ci st] = ttest2(b(:),a(:)); t = st.tstat, p
t =
3.1795
p =
0.0043
>> [h p ci st] = ttest2(mean(b),mean(a)); t = st.tstat, p
t =
4.1128
p =
0.0063
The first t-test treats the repeat scans as extra data (in a
statistically dodgy way, see later) while the second averages over the
repeats. The second test has a higher t-value because the averaging
has increased SNR, but the first has a more significant p-value due to
the increased degrees of freedom. But...
> I'm not a stats wiz but do extra T1 scans really count as extra
> measurements?
...well, I think they do *but* they will of course be extremely
strongly correlated (i.e. there will be a complete breakdown of the
sphericity assumption), so the above analysis isn't really right.
However, you could do a proper repeated-measures analysis that
estimates and accounts for the correlation, I believe by reducing the
effective degrees of freedom. Whether this proper analysis would show
the same results, I'm not entirely confident, but I think there should
still be a slight DoF advantage for anova since the correlation won't
be 100%.
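One way to sketch that repeated-measures idea is a mixed-effects model
with a random intercept per subject, so the within-subject correlation
between repeat scans is modelled rather than ignored. This assumes the
Statistics Toolbox's fitlme is available; note that in the toy data
below the repeats are actually independent draws, so this only
demonstrates the mechanics, not the DoF adjustment on real data:

```matlab
% Mixed-effects sketch (assumes Statistics Toolbox fitlme):
% random intercept per subject absorbs between-subject variability,
% leaving the group effect tested against within-subject noise.
randn('state',0); a = randn(3,4); b = 1+randn(3,4);
y    = [a(:); b(:)];                                      % 24 observations
subj = [kron((1:4)',ones(3,1)); kron((5:8)',ones(3,1))];  % subject labels
grp  = [zeros(12,1); ones(12,1)];                         % 0=control, 1=patient
tbl  = table(y, categorical(subj), grp, ...
             'VariableNames', {'y','subj','grp'});
lme  = fitlme(tbl, 'y ~ grp + (1|subj)');
lme.Coefficients   % fixed-effect estimate and p-value for grp
```

The random intercept is what a proper analysis would use to "soak up"
the scan-to-scan correlation, rather than pretending all 24 values are
independent as the first ttest2 call did.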
In general, averaging represents a loss of information, and there is a
school of thought that says throwing info away must be worse than
properly accounting for all of it in a *suitable* model.
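As a toy illustration of that information argument (using the same
simulated data as above): the scan-to-scan variance within each subject
can be estimated from the raw repeats, but once the repeats are
averaged only the between-subject spread remains.

```matlab
% What averaging throws away: the within-subject variability.
randn('state',0); a = randn(3,4);  % 3 repeat scans x 4 subjects
within = mean(var(a))              % average scan-to-scan variance
am     = mean(a);                  % per-subject averages: 4 numbers,
                                   % nothing left to say about repeats
```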
For fewer "I think"s and more info on what a suitable model might be,
you'll have to wait for a proper statistician to reply ;-)
Also, there's another question here, slightly aside from the stats:
will the nicer-looking averaged scan segment better than the
individual ones? (Probably.) And will that be better or worse than
averaging the individual segmentations (not so clear cut to me...),
and/or than the effective "averaging" of the segmentations that would
occur in the anova? I think I favour the third option, again because
it keeps as much information as possible, including some information
about the variability in the segmentation performance, which would be
lost with either form of averaging.
Hopefully some statisticians will join what I think is quite an
interesting debate. Anyway, I hope this was of interest,
Ged.