Dear Howard
Our stats group has just discussed this over lunch. Points made are
somewhat similar to those already provided.
We disagree with the "comment in another forum". Usually a confidence
interval is computed making the same assumptions about distributional forms
as significance tests use in computing a P-value. We are used to seeing
95% confidence intervals computed as a value plus-or-minus 1.96 times the
standard error - this is using the same large sample approximations that
are used to compute a z-value by dividing a value by its standard error,
from which a P-value is computed using the Normal distribution.
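A minimal sketch of this point in Python, using made-up estimates and a standard error of 1. The function name `wald_summary` is ours, purely for illustration. Both the z-based P-value and the plus-or-minus 1.96 interval are built from the same estimate and standard error, so the interval excludes zero exactly when |z| exceeds 1.96.

```python
import math

def wald_summary(estimate, se, z_crit=1.96):
    """Return (z, two-sided P-value, 95% CI) from an estimate and its standard error."""
    z = estimate / se
    # Two-sided Normal tail area: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci = (estimate - z_crit * se, estimate + z_crit * se)
    return z, p_value, ci

# Hypothetical estimates, both with standard error 1.0
for est in (1.9, 2.0):
    z, p, (lo, hi) = wald_summary(est, 1.0)
    print(f"estimate={est}: z={z:.2f}, P={p:.4f}, 95% CI=({lo:.2f}, {hi:.2f})")
```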
In some circumstances P-values are computed using a slightly different
large sample approximation than that used in the confidence interval
computation. For the risk difference situation you probably use Fisher's
test or a chi-squared test for your P-value, and the normal approximation
to compute a standard error for computation of the confidence
interval. Some might argue that the approximations used in computing the
P-value are the lesser evil here, and I would prefer to trust the
P-value over the confidence interval, although the difference must be
negligible, and of importance only to those who live their lives concerned
that P=0.049999 is really very different from P=0.0500001. Most of us
think there are more important things in life to worry about.
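To illustrate how the two approximations can part company, here is a rough sketch in Python with invented counts (15/50 events versus 7/50). It assumes a Pearson chi-squared test without continuity correction for the P-value and the usual unpooled Normal-approximation standard error for the interval. Fisher's test is not shown. The function names are hypothetical.

```python
import math

def chi_squared_p(events1, n1, events2, n2):
    """P-value from the Pearson chi-squared test (1 df, no continuity correction)."""
    a, b = events1, n1 - events1
    c, d = events2, n2 - events2
    n = n1 + n2
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-squared with 1 df: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def risk_difference_ci(events1, n1, events2, n2, z_crit=1.96):
    """Risk difference and its 95% CI using the unpooled standard error."""
    p1, p2 = events1 / n1, events2 / n2
    rd = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return rd, (rd - z_crit * se, rd + z_crit * se)

# Made-up data: 15/50 events in one group, 7/50 in the other
p = chi_squared_p(15, 50, 7, 50)
rd, (lo, hi) = risk_difference_ci(15, 50, 7, 50)
print(f"chi-squared P = {p:.4f}")
print(f"risk difference = {rd:.2f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```

With these particular counts the chi-squared P-value falls just above 0.05 while the interval just excludes zero, which is exactly the kind of negligible discrepancy described above.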
Confidence intervals help us estimate effects, and are very important in
telling us the range of possible results which we should bear in
mind. If you use P-values simply to dichotomise the world into
significant or not significant, then we agree that there does not appear to
be a reason to include them. But P-values also allow us to directly
measure something about the strength of evidence: consider, for example, the
difference between a P-value of 0.049 and a P-value of 0.00001. This is
especially true when you are looking at monitoring clinical trials, where
stopping rules are formulated on the strength of evidence. There isn't a
one-to-one relationship between the width of the confidence interval and the
P-value, as the size of the effect also has a role.
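A small sketch of that last point, again in Python with made-up numbers. Two hypothetical results share the same standard error, and hence the same interval width, but differ in effect size, so their P-values are very different. The Normal approximation is assumed throughout and the helper name is ours.

```python
import math

def p_value_and_width(estimate, se, z_crit=1.96):
    """Two-sided Normal P-value and 95% CI width for an estimate and its standard error."""
    p_value = math.erfc(abs(estimate / se) / math.sqrt(2))
    width = 2 * z_crit * se
    return p_value, width

# Same standard error (same CI width), different effect sizes
p_small, width_small = p_value_and_width(2.0, 1.0)
p_large, width_large = p_value_and_width(4.5, 1.0)
print(f"effect 2.0: P = {p_small:.4f}, CI width = {width_small:.2f}")
print(f"effect 4.5: P = {p_large:.2e}, CI width = {width_large:.2f}")
```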
What we do think is outrageous are journals which still round our P-values
to P<0.05, rather than let us give the exact values. The same goes for
journals that waste space giving the chi-squared or F-value, its degrees of
freedom, and the P-value, especially when they tell us that there is no
space left for the confidence interval.
Jon Deeks
Centre for Statistics in Medicine
Oxford