On 07-Aug-06 Ivar Andreas Aursnes wrote:
> On 04.08.2006 04:05, Rakesh Biswas wrote:
>> An interesting anecdote on the P value:
>> http://www.medscape.com/viewarticle/524206
>> Rakesh
>>
> Interesting, and correct, except for one possibly misleading
> word: "hypothesis".
> What he is referring to is obviously the statistical hypothesis
> (also called the null hypothesis). This is, as we all know,
> different from our scientific hypothesis which very seldomly
> is null. It can be expressed as a "prior" under the Bayesian paradigm.
>
> --
> Ivar Andreas Aursnes
I haven't registered with medscape, so have not read the anecdote.
But I must comment on the above remarks about "hypothesis".
Statistics has its own technical language, in which, as in all
technical languages, "ordinary" words get specialised and precisely
defined meanings.
So it is with the part of statistics called Hypothesis Testing (HT).
When HT first evolved, the hypotheses being tested were all what
we would now call Null Hypotheses -- for example Karl Pearson's
chi-squared test for goodness of fit, addressing questions of the
kind "Do the data adhere to a Gaussian distribution?" So the
Null Hypothesis here is "The data adhere to a Gaussian distribution",
and Pearson's chi-squared is a measure of the discrepancy between
the histogram frequencies observed, compared with the frequencies
to be expected if the data were drawn from the best-fitting
Gaussian distribution.
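
To make the goodness-of-fit idea concrete, here is a minimal sketch in Python (standard library only). It is an illustration, not anything from the original discussion: the function name, the bin boundaries and the simulated samples are all assumptions chosen for the example, and the "best-fitting" Gaussian is simply taken to be the one with the sample mean and standard deviation.

```python
import random
from statistics import NormalDist, mean, stdev

def pearson_chi_squared(data, z_edges=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Pearson's chi-squared discrepancy from the best-fitting Gaussian.

    z_edges are bin boundaries in units of the fitted standard deviation
    (an illustrative choice, not a recommendation).
    """
    mu, sd = mean(data), stdev(data)
    fitted = NormalDist(mu, sd)
    n = len(data)
    # Open-ended outer bins, so the expected frequencies add up to n.
    cuts = [float("-inf")] + [mu + z * sd for z in z_edges] + [float("inf")]
    chi2 = 0.0
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        observed = sum(1 for x in data if lo <= x < hi)
        expected = n * (fitted.cdf(hi) - fitted.cdf(lo))
        chi2 += (observed - expected) ** 2 / expected
    return chi2

rng = random.Random(0)  # fixed seed so the sketch is repeatable
gaussian_sample = [rng.gauss(10.0, 2.0) for _ in range(500)]  # hypothetical data
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]    # hypothetical data

print("Gaussian data:", round(pearson_chi_squared(gaussian_sample), 2))
print("Skewed data:  ", round(pearson_chi_squared(skewed_sample), 2))
```

A large value means the histogram is far from what the fitted Gaussian would lead one to expect. Conventionally the value is referred to a chi-squared distribution with (number of bins - 1 - number of fitted parameters) degrees of freedom, which here would be 6 - 1 - 2 = 3.
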
It was only quite a while later, with the work of Neyman and
E.S. Pearson (KP's son) in the 1920s, that the importance of being
aware of an Alternative Hypothesis emerged -- stating a specific
alternative allows you to devise a test which has maximum power,
in testing the Null, to detect the Alternative. Conversely, any test
whatever of a Null implies certain kinds of Alternative, namely those
for which that particular test has high power (relative to Alternatives
for which it has low power).
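
The point that a test carries its own implied Alternatives can be seen in a small calculation. The following is only a sketch under stated assumptions: a one-sample z-test of the Null "mean = 0" with known standard deviation, rejecting for large positive values of the sample mean. The function name and the numbers are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_upper_tailed_z(delta, sigma, n, alpha=0.05):
    """Power of the upper-tailed z-test of mean = 0 when the true mean is delta."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)   # rejection threshold under the Null
    shift = delta * n ** 0.5 / sigma         # standardised size of the Alternative
    return 1 - STD_NORMAL.cdf(z_crit - shift)

# Same test, same Null, three different true states of affairs.
for delta in (0.5, 0.0, -0.5):
    print(delta, round(power_upper_tailed_z(delta, sigma=1.0, n=25), 3))
```

Against a positive shift the test has high power; when the Null is true, the rejection probability is the significance level; against a negative shift the power falls below the significance level. In that sense choosing this test amounts to choosing "mean > 0" as the Alternative, whether or not it was stated.
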
The vocabulary of Null Hypothesis and Alternative Hypothesis has
been firmly established in statistics ever since, and anyone who
uses Hypothesis Testing methodology has to be aware of what these
terms precisely mean (or else they do not know what they are
talking about).
Therefore, in statistical Hypothesis Testing, you test the Null
Hypothesis, versus an (explicit or implicit) Alternative Hypothesis.
However, one increasingly finds language on the lines of
"We designed our trial so as to test our research hypothesis
that ... " (I think "research hypothesis" is more common than
Ivar's phrase "scientific hypothesis" above). This is "the wrong
way round" from the statistician's point of view: you do not
test the Alternative Hypothesis, you test the Null.
It is also the wrong way round from the standpoint of Karl Popper's paradigm of
the progress of scientific knowledge, namely that you cannot
prove that a theory is correct -- you can only demonstrate
that a theory is wrong, by showing that there are facts which
are not compatible with the theory. You then have the task of
devising an alternative theory which is compatible with these
awkward facts. So science advances by, in the first instance,
destroying theories.
Whatever one may think about this philosophically, that is the
way it is in statistical Hypothesis Testing.
Indeed, to go back to Ivar's comment, the "possibly misleading
word" is not "hypothesis". The misleading word is "test".
So when people write things like "We designed our trial so as
to test our research hypothesis that ... ", they need to replace
"test" with something else.
There is similar confusion around another word which is part
and parcel of Hypothesis Testing, namely "significant".
The degradation of language is evident in statements like
"The two groups did not differ significantly".
What this refers to is an outcome of the investigation which,
more accurately described, would be "the statistic used to
compare the two groups did not go beyond the threshold
for statistical significance at our
chosen significance level, in testing the Null Hypothesis that
there was no difference between the groups."
In other words, "significant" has the meaning implied by its
etymology -- sign-making: the data are raising a flag which
indicates that something is happening -- there is a degree
of incompatibility between the data and the Null Hypothesis,
so up goes the flag. In other words, "significance" really
refers to weight of evidence. A "significant difference"
(loose language if ever there was) is not a difference which is
significant in the ordinary sense of a difference "which
matters" or "which is important"; it is a difference "which
was very unlikely to arise if the Null Hypothesis were true".
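
The gap between "significant" and "important" is easy to show numerically. This is a hedged sketch only: it assumes a two-sample z-test with a known common standard deviation and equal group sizes, and the effect sizes and sample sizes are invented for the illustration.

```python
from statistics import NormalDist

def two_sided_p_value(mean_diff, sigma, n_per_group):
    """Two-sided p-value for a two-sample z-test of the Null 'no difference'."""
    standard_error = sigma * (2.0 / n_per_group) ** 0.5
    z = mean_diff / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# A trivially small observed difference in a very large study ...
tiny_effect_huge_trial = two_sided_p_value(0.02, sigma=1.0, n_per_group=100_000)
# ... versus a substantial observed difference in a very small one.
large_effect_small_trial = two_sided_p_value(0.5, sigma=1.0, n_per_group=10)

print("tiny difference, huge trial: p =", tiny_effect_huge_trial)
print("large difference, small trial: p =", large_effect_small_trial)
```

The first difference raises the flag although it may not matter at all; the second does not raise it although, if real, it might matter a great deal. The p-value measures weight of evidence against the Null, not the size or importance of the difference.
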
Finally, re Ivar's comment that "It can be expressed as a
"prior" under the Bayesian paradigm." That is somewhat beside
the point. A Bayesian approach is of course possible when
the available information supports the assignment of prior
probabilities to the possible hypotheses being contemplated,
but it is not at all necessary in this discussion.
Best wishes to all,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <[log in to unmask]>
Fax-to-email: +44 (0)870 094 0861
Date: 07-Aug-06 Time: 11:23:15
------------------------------ XFMail ------------------------------