You are right; the reviewer is wrong. As you say yourself, the chance that these six p-values are all drawn from a uniform distribution on [0, 1] is very small. Actually, the "correction" should go the other way: even if all of them were between (say) 0.05 and 0.15, the overall conclusion might well be to reject the null hypothesis. What you might do is to add the squares of the six z-statistics. Assuming each z-statistic is N(0,1) and they are independent, the result would be chi-square distributed with 6 degrees of freedom.

By the way, it doesn't matter whether the degrees of freedom are the same for all six tests; p-values are p-values. But I would prefer to combine the underlying test statistics rather than combining the p-values. If the p-values come from normally distributed test statistics (which is asymptotically true for many standard tests), then the chi-square test I suggested is valid, and it gives a p-value of 2.6E-11 in this case. Which should be significant enough even for a physics journal!
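
For concreteness, here is a minimal Python sketch of that combination, assuming the reported p-values are two-sided and come from (approximately) standard normal z-statistics; under a one-sided reading the recovered z-values, and hence the combined p-value, would differ slightly.

    from scipy import stats

    p_values = [3e-9, 0.04, 0.05, 0.03, 0.02, 0.005]

    # Recover |z| from each two-sided p-value (assumption: the tests are two-sided).
    z = [stats.norm.isf(p / 2) for p in p_values]

    # The sum of squares of independent N(0,1) variables is chi-square with df = 6.
    chi2_stat = sum(zi ** 2 for zi in z)
    combined_p = stats.chi2.sf(chi2_stat, df=len(p_values))

    print(f"chi-square statistic: {chi2_stat:.1f}")   # roughly 61
    print(f"combined p-value:     {combined_p:.1e}")  # roughly 2.6e-11, as quoted above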


On Wed, Aug 29, 2012 at 10:25 AM, Andy Cooper <[log in to unmask]> wrote:
Dear All,

I have the following question, which I am hoping statisticians can help me address (I am a physicist by training). To give some background: a manuscript recently submitted to a Physics journal was rejected because one of the three reviewers claims that a result presented in the manuscript is wrong. I would therefore be very grateful to hear the opinion of statisticians.

The issue in question is as follows. Suppose we have six statistics (e.g. z-statistics), each derived from an independent data set (i.e. six independent data sets in total). We can assume that the number of degrees of freedom is the same in each data set, so that the corresponding P-values are comparable. We can further assume that each independent data set is a sample from an underlying population. Under the null hypothesis (z = 0), the P-values would be distributed uniformly between 0 and 1. Now, the observed P-values are in fact (3e-9, 0.04, 0.05, 0.03, 0.02, 0.005), i.e. they are all less than 0.06. It is clear, at least to me, that the chance that these P-values are drawn from a uniform distribution is pretty small (<1e-8). Yet the reviewer in question claims that there is no overall significance. His/her argument is based on the Bonferroni correction: using a threshold of 0.05/6 ≈ 0.008, only two P-values pass, which he/she then claims is not meaningful enough.
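
For reference, a short Python sketch of the two calculations above: the chance under the null that all six p-values fall below 0.06, and the reviewer's Bonferroni count.

    p_values = [3e-9, 0.04, 0.05, 0.03, 0.02, 0.005]

    # Chance that six independent Uniform(0,1) p-values all fall below 0.06.
    # (About 4.7e-8 for this event alone; values as extreme as those actually
    # observed, e.g. the 3e-9, are far less likely still.)
    prob_all_below = 0.06 ** len(p_values)
    print(f"P(all six <= 0.06 under the null) = {prob_all_below:.1e}")

    # The reviewer's Bonferroni argument: compare each p-value with alpha / m.
    bonferroni_threshold = 0.05 / len(p_values)   # about 0.0083
    n_passing = sum(p < bonferroni_threshold for p in p_values)
    print(f"Bonferroni threshold: {bonferroni_threshold:.4f}")
    print(f"p-values below it:    {n_passing} of {len(p_values)}")   # 2 of 6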

My response to the reviewer's comment is that using a Bonferroni correction to establish the overall significance of the six P-values is wrong. The Bonferroni correction is ill-suited to this particular application since it is overly conservative, leading to a large fraction of false negatives. Remarkably, the editor of the Physics journal in question finds the reviewer's argument (i.e. using the Bonferroni correction) "persuasive".
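
To make the conservativeness concrete, a small hypothetical sketch: six invented p-values of 0.04 each, none of which survives the Bonferroni threshold, while Fisher's method (used here as one standard combination test; under the null, -2 times the sum of the log p-values is chi-square with 2k degrees of freedom) rejects decisively.

    from scipy import stats

    # Hypothetical example: six independent tests, each with p = 0.04.
    hypothetical_p = [0.04] * 6

    # None of them survives the Bonferroni threshold of 0.05 / 6 ≈ 0.0083 ...
    bonferroni_hits = sum(p < 0.05 / 6 for p in hypothetical_p)   # 0

    # ... yet Fisher's combined test rejects clearly.
    stat, combined_p = stats.combine_pvalues(hypothetical_p, method="fisher")

    print(f"Individually significant after Bonferroni: {bonferroni_hits}")
    print(f"Fisher combined p-value:                   {combined_p:.1e}")  # about 1e-4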

I would be most grateful for your comments.

A