Hi Paul,

[log in to unmask] said:

> Shouldn't this modify our consideration of how to balance between the
> two error types? If so, is there any better way of introducing the a
> priori hypotheses that make us more confident in findings that are
> subtle but predicted and (more importantly) replicable. It would be a
> shame if such embryonic findings were scythed down by the multiple
> comparisons adjustment before they could grow and blossom across a
> series of experiments.
> I would very much like to hear further thoughts on this from Mathew
> and others

It's a bit odd rehearsing our discussion again, here on the list; like seeing your family on TV. Anyhow. To rehearse:

Of course, as in any other discipline using hypothesis-testing statistics, we have a problem with false negatives. Perhaps we will migrate in due course to an estimation approach - see e.g. http://www.cu.mrc.ac.uk/~fet/multhip/matstat.html and http://www.cu.mrc.ac.uk/~fet/wavestatfield/wavestatfield.html.

In the meantime, what to do? This is of course a classic power problem, and there's a simple solution to this classic problem: more subjects. Another solution is to have a valid and highly specific area to restrict your testing to, but you would still need to use the corrected statistics for that small region (http://www.mrc-cbu.cam.ac.uk/Imaging/vol_corr.html).

If, for some reason, neither of these is possible, then I think you are looking for some way of allowing an increased false positive rate, in order to reduce the false negative rate. As you know, my own view is that we have too many false positives in neuroimaging already. To pursue your horticultural analogy, we run the risk of the beautiful garden of brain imaging research being overgrown by weeds.

Anyway, regrouping: even if we do go down the route of allowing an increased false positive rate, I think using uncorrected p values (when your area of interest is greater than one voxel) is a bad idea. This is for two reasons.
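A quick back-of-the-envelope sketch of the "more subjects" point, if it helps. The numbers here are purely illustrative (a one-sided z-test with known variance and an assumed effect size of 0.5 SD, not the t statistics we would actually use), but the shape of the curve is the message: for a fixed effect, power climbs rapidly with n.

```python
# Sketch: power of a one-sided z-test as a function of sample size.
# The effect size (d = 0.5, in units of the known SD) and alpha = 0.05
# are illustrative assumptions, not figures from any real experiment.
from math import sqrt
from statistics import NormalDist

def power(n, d=0.5, alpha=0.05):
    """Probability of detecting effect d with n subjects (one-sided z-test)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is Normal(d * sqrt(n), 1),
    # so power is the chance it lands above the critical value.
    return 1 - NormalDist().cdf(z_crit - d * sqrt(n))

for n in (10, 20, 40, 80):
    print(n, round(power(n), 2))
```

Doubling the sample size twice takes you from a coin flip to near-certain detection of this (hypothetical) effect, which is why "more subjects" is the first answer to a false negative problem.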
First, the p value relates to the null hypothesis for one voxel only, and really has no meaning for an area larger than one voxel, as is almost invariably the case in a functional imaging experiment.

Second, and relatedly, the uncorrected p value can be dangerously misleading for authors and readers of functional imaging papers. The fact that it is 'a p value' gives the result a spurious weight, even though the probability is not for the correct null hypothesis. Thus, when we see 'p<0.001 uncorrected', we tend to think that this _must_ be significant, despite knowing that we have a huge multiple comparison problem. The tiny p value leads to the implicit feeling that the multiple comparison correction must somehow be too severe. But in fact, as you can demonstrate to yourself by playing with volumes of random numbers, the correction is very accurate, giving nearly exactly the required false positive rate (see http://www.mrc-cbu.cam.ac.uk/Imaging/randomfields.html and the .m file script therein).

So, yes, I agree we have a problem with false negatives, but I don't think uncorrected p values are a good solution.

And that's the rehearsal.

See you,

Matthew
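For anyone who wants to try the random-volumes demonstration without Matlab, here is a rough Python sketch (numpy assumed; the voxel count is illustrative, and a simple Bonferroni correction stands in for the random field correction used by the .m script - for smooth images the two differ, but for independent noise Bonferroni is close to exact). On pure noise, 'p<0.001 uncorrected' flags dozens of voxels in every single volume, while the corrected threshold keeps the family-wise error rate near the nominal 5%.

```python
# Sketch: false positives in volumes of pure noise, with and without
# correction. Every voxel is null, so anything above threshold is a
# false positive. Volume size and trial count are illustrative.
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n_voxels = 50_000      # a (hypothetical) brain volume
n_volumes = 500        # repeated null "experiments"

z_unc = NormalDist().inv_cdf(1 - 0.001)            # 'p < 0.001 uncorrected'
z_bon = NormalDist().inv_cdf(1 - 0.05 / n_voxels)  # Bonferroni at 0.05

fp_per_volume = []
fwer_hits = 0
for _ in range(n_volumes):
    vol = rng.standard_normal(n_voxels)            # null true everywhere
    fp_per_volume.append(int((vol > z_unc).sum()))
    fwer_hits += int(vol.max() > z_bon)

avg_fp = sum(fp_per_volume) / n_volumes        # ~ 0.001 * 50,000 = 50
every_volume_has_fp = all(n > 0 for n in fp_per_volume)
fwer = fwer_hits / n_volumes                   # close to the nominal 0.05

print(avg_fp, every_volume_has_fp, fwer)
```

The uncorrected threshold produces around fifty "activated" voxels per volume of noise, and not one volume comes up clean; the corrected threshold gives a false positive volume about one time in twenty, which is exactly the rate you asked for.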