I am puzzled by something which seems to crop up consistently in biology
textbooks for A-level students, and therefore in the teaching of my biology
colleagues.
When using fairly large samples (e.g. about 50 in each) they always seem to
use a version of the two-sample t-test which I don't recognise.
The formula which they use does not involve the pooled estimate of the
assumed (or tested by an F-test) common variance. Instead, it uses the sum
of the squares of the estimates of the separate standard errors. It seems to
be assumed that Normal distributions are the appropriate models for the two
populations. The number of degrees of freedom is what I expect: the sum of
the two sample sizes minus two.
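
To make the comparison concrete, here is a minimal Python sketch of the two statistics as I understand them. The function names `pooled_t` and `unpooled_t` are my own, and putting the sum of the squared standard errors under a square root in the denominator is my reading of the textbook formula, not a quotation from any particular book.

```python
import math
import statistics


def pooled_t(x, y):
    """Two-sample t statistic using the pooled estimate of a common variance."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # sample variances (n - 1 divisor)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / std_err


def unpooled_t(x, y):
    """Statistic whose denominator is the root of the sum of squared standard errors.

    Assumption: this is the form the biology textbooks use.
    """
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    std_err = math.sqrt(v1 / n1 + v2 / n2)  # (s1^2 / n1) + (s2^2 / n2)
    return (statistics.mean(x) - statistics.mean(y)) / std_err


if __name__ == "__main__":
    # Made-up illustrative data, not from any textbook.
    a = [1, 2, 3, 4, 5]
    b = [2, 4, 6, 8, 10]
    print(pooled_t(a, b), unpooled_t(a, b))
    print(pooled_t(a, b[:4]), unpooled_t(a, b[:4]))
```

When the two samples are the same size, the two denominators are algebraically identical, so the statistics only differ when the sample sizes differ.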
I would expect them to use a z-test, based on the Central Limit Theorem ...
or a Mann-Whitney test, or the Normal approximation to it.
Having read a book last Easter which said that the t-test copes well with
deviations from a Normal model, I am less opposed to t-tests than I was
previously! The same book mentioned the problems in testing for a common
variance, and therefore dismissed the two-sample t-test with which I am
familiar as being almost useless.
I think of a t-distribution as a N(0,1) divided by the root of (a chi-squared
distribution divided by its degrees of freedom) ... so I don't see how the
formula used by the biologists results in a t-distribution.
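
Written out, the definition I have in mind is roughly the following. The symbols Z, V and \nu are my own labels, and I am assuming the numerator and denominator are independent.

```latex
T = \frac{Z}{\sqrt{V / \nu}},
\qquad Z \sim N(0,1),
\quad V \sim \chi^2_{\nu},
\quad Z \text{ and } V \text{ independent},
\qquad\text{so that}\qquad T \sim t_{\nu}.
```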
Would someone please explain, in simple terms, what is going on here?
Thanks in advance.
Bill