Matthew Brett wrote:
[a considered reply, with a few points I'd like to follow up on]
In reply to my comment that an alert reviewer should catch this kind
of error:
> Is that true? How often have you seen the issue raised?
I really take this to be a specific example of a more general problem,
namely conclusions that rest on non-statistical comparisons. I
can't say how often the issue gets raised in general, but certainly
it's the single most common problem I see in studies I review (and I
even let one slip through in a study I was middle author on last
year). So I would say the prevalence is very high, and given that I
see it more often in articles I review than in published articles, I'd
guess many reviewers do catch it. It depends on your tolerance,
though. When someone tells a story in the discussion section about the
network of regions they see activated, that doesn't bother me as much
as it bothers some people. If they put it in the results section, it
bothers me.
> The argument would be that you cannot make any strong statement of
> localization of function with the thresholded map; how many papers
> make this clear? - "area A was significantly activated, but of
> course that isn't to say the whole brain wasn't activated about the
> same amount, who knows?".
I guess I don't think it's fair to expect articles to explicitly
describe what inferences can't be made from the data. I'm happy with
just, "area A was significantly more active during A than B." If the
worry is under-educated readers, though, certainly I think some
boilerplate text like that would be better than hoping readers will
get the point from additional figures.
> To get round this, you would have to compare activation in brain
> areas directly, and this is extremely rare - don't you think?
Comparing activity between regions is indeed unusual, but that's not
always the right remedial step. Sometimes it's enough to reword a few
things to omit the unsupported inferences. If specific differences
between regions are important to the study, then I'd rather see the
direct comparison than the continuous map.
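To make that concrete, here's a minimal sketch of the kind of direct
comparison I have in mind (Python with numpy/scipy; the per-subject
contrast values are invented purely for illustration): rather than
noting that region L passes threshold while region R doesn't, test the
L-versus-R difference itself.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_subjects = 16

    # Hypothetical per-subject contrast estimates (condition A minus B)
    # extracted from two regions of interest.
    contrast_L = rng.normal(loc=0.6, scale=1.0, size=n_subjects)
    contrast_R = rng.normal(loc=0.2, scale=1.0, size=n_subjects)

    # Each region against zero: L may pass threshold while R does not ...
    t_L, p_L = stats.ttest_1samp(contrast_L, 0.0)
    t_R, p_R = stats.ttest_1samp(contrast_R, 0.0)

    # ... but the claim "L responds more than R" needs the paired test.
    t_LR, p_LR = stats.ttest_rel(contrast_L, contrast_R)

    print(f"L vs 0: p = {p_L:.3f};  R vs 0: p = {p_R:.3f};  "
          f"L vs R: p = {p_LR:.3f}")

The difference in significance levels is not itself a significant
difference; the paired test is what licenses the between-region claim.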
Obviously, when you have a big fishing expedition, it's hard not to
make something of the pattern of activated regions. You can't
generally do pairwise comparisons of regions, since by definition a
fishing expedition involves a large number of regions (perhaps the
number of resels). In that case, you mostly want a list of the
regions for which you have evidence of some relationship to your
design. But I can see the value of unthresholded maps in supporting
the eyeball meta-analysis. If you have a lot of under-powered studies
of similar phenomena, you can potentially learn a lot from the
sub-threshold commonalities. The argument could be made that the
standard criterion in this case is counter-productive (anyone want to
run some simulations?).
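Since I raised the simulation question, here's roughly what I'd have in
mind as a toy version (Python/numpy/scipy, every parameter invented):
several under-powered studies of the same weak effect, each thresholded
on its own, against a crude fixed-effects combination of their
unthresholded maps.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_studies, n_subjects, n_voxels = 8, 16, 1000

    true_effect = np.zeros(n_voxels)
    true_effect[:50] = 0.5        # weak signal confined to 50 voxels

    z_maps = []
    for _ in range(n_studies):
        data = rng.normal(true_effect, 1.0, size=(n_subjects, n_voxels))
        t, _ = stats.ttest_1samp(data, 0.0, axis=0)
        z_maps.append(t)          # treat t as an approximate z here
    z_maps = np.array(z_maps)

    # Bonferroni-style threshold within each study: the signal rarely
    # survives in any single under-powered study.
    thresh = stats.norm.isf(0.05 / n_voxels)
    hits_per_study = (z_maps[:, :50] > thresh).mean(axis=1)

    # A crude fixed-effects (Stouffer-style) combination of the
    # unthresholded maps makes the shared sub-threshold signal obvious.
    combined = z_maps.sum(axis=0) / np.sqrt(n_studies)
    hits_combined = (combined[:50] > thresh).mean()

    print("signal voxels above threshold, per study:",
          np.round(hits_per_study, 2))
    print("signal voxels above threshold, combined:",
          round(float(hits_combined), 2))

Nothing about that sketch is specific to fMRI, of course; the point is
only that a hard per-study threshold throws away exactly the
sub-threshold agreement the eyeball meta-analysis would exploit.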
> It's true that the continuous map on its own does not provide you
> with such a test, but at least it allows you a preliminary
> comparison, and makes the problem much clearer - and to me this
> seems such a fundamental problem that it deserves this attention.
I'm still not sure this is true. Sticking with an L>R example, the
most helpful case I can think of is when the authors mistakenly claim
L>R and it looks dubious on the map. But if you're going to require
authors to provide continuous maps to support localizing claims, why
not go the whole distance and require them (quite reasonably) to
give statistical support for everything they consider important?
I guess the other case when I'd really like to see it is when the
thresholded map fails to support some useful generalization about the
pattern of activity. The whole brain being more active for A than B
is one good example -- in an under-powered study you could get a
misleading picture of distributed activity. You can imagine subtler
cases.
One thing we haven't talked about is the kinds of invalid inferences
encouraged by unthresholded maps. If you have maps from under-powered
studies of two tasks (B-A and C-A), side-by-side comparison is liable
to suggest some obvious but false differences and/or similarities.
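As a quick illustration of that last point, here's another toy
simulation (Python/numpy, invented numbers): give two under-powered
maps exactly the same true pattern, and side by side they can still
look different enough to invite a story about task differences.

    import numpy as np

    rng = np.random.default_rng(2)
    n_voxels = 1000
    true_pattern = rng.normal(0.0, 1.0, n_voxels)   # shared true effect

    noise_sd = 2.0                                  # low power: noisy maps
    map_BA = true_pattern + rng.normal(0.0, noise_sd, n_voxels)
    map_CA = true_pattern + rng.normal(0.0, noise_sd, n_voxels)

    # Eyeballing where each map "peaks" suggests differences that are
    # pure noise, since the underlying pattern is identical.
    top_BA = set(np.argsort(map_BA)[-50:])
    top_CA = set(np.argsort(map_CA)[-50:])
    print("correlation between maps:",
          round(float(np.corrcoef(map_BA, map_CA)[0, 1]), 2))
    print("overlap of top-50 voxels:", len(top_BA & top_CA))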
dan