For more on the case against heteroskedasticity tests, and on the case
for preferring the Satterthwaite t-test most of the time, refer to the
seminal papers by Moser and Stevens (1992) and Moser, Stevens and Watts
(1989), which present a numerical integration study on the subject.
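As a quick illustration of the Welch/Satterthwaite approach, here is a sketch using scipy's two-sample t routine; the data are invented for the example and are not from the papers cited:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Two samples with clearly unequal variances (illustrative data only)
x = rng.normal(loc=0.0, scale=1.0, size=30)
y = rng.normal(loc=0.0, scale=3.0, size=20)

# Student's t assumes equal variances; Welch/Satterthwaite does not
pooled = stats.ttest_ind(x, y, equal_var=True)
welch = stats.ttest_ind(x, y, equal_var=False)
print(pooled.pvalue, welch.pvalue)
```

The point of the Moser-Stevens results is that the unequal-variances (equal_var=False) form loses very little even when the variances happen to be equal, which is why it is a sensible default.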
Hope this helps.
Roger
References
Moser, B. K., and G. R. Stevens. 1992. Homogeneity of variance in the
two-sample means test. The American Statistician 46(1): 19-21.
Moser, B. K., G. R. Stevens, and C. L. Watts. 1989. The two-sample
t-test versus Satterthwaite's approximate F-test. Communications in
Statistics - Theory and Methods 18(11): 3963-3975.
Roger Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [log in to unmask]
www.imperial.ac.uk/nhli/r.newson/
Opinions expressed are those of the author, not of the institution.
-----Original Message-----
From: A UK-based worldwide e-mail broadcast system mailing list
[mailto:[log in to unmask]] On Behalf Of Robert Newcombe
Sent: 11 June 2007 11:38
To: [log in to unmask]
Subject: Re: Tails on homoscedasticity test
The important thing to recognise about the old-fashioned tables of
critical values for the F ratio is that the F distribution is used for
more than one purpose. The main issue here is the distinction between
using the F-ratio as a hypothesis test in an ANOVA table arising from
any kind of linear model, and using a ratio of empirical variances as a
direct test of H0: var1 = var2 vs. H1: var1 not = var2. (There is a third
use, constructing Clopper-Pearson 'exact' confidence limits for a
proportion, but that needn't concern us here - and it is more readily done
using the beta distribution facilities in software; even Excel is fine
for this.)
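For what it's worth, here is a sketch of those Clopper-Pearson limits via beta quantiles (scipy-based; the function name and the 8-out-of-20 example are mine, purely for illustration):

```python
from scipy import stats

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence limits for a binomial
    proportion, taken from quantiles of the beta distribution."""
    lo = 0.0 if k == 0 else stats.beta.ppf(alpha / 2, k, n - k + 1)
    hi = 1.0 if k == n else stats.beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lo, hi

# 95% limits for 8 successes out of 20 trials
print(clopper_pearson(8, 20))
```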
As far as I can tell (experts on the history of statistics may correct
me on this) Fisher et al had the F distribution tabulated specifically
with the F test in mind. Consider for simplicity F with 1 and 60 df.
The critical value for a default 5% alpha level is 4.00. This is the
square of the critical value of t with 60 df for the usual 2-tailed
test, viz. 2.00. (It's worth remembering that this is 1.96+2.4/df, to an
excellent approximation, for all but very small df.) An F test with 1
and 60 df is essentially the square of a t test statistic (which could
be either unpaired or paired, i.e. a one-sample test based on paired differences).
We run the t-test 2-sided by default, and the single tail F probability
corresponds, because F will be >4 if t>2 or t<-2. F is a squared
measure, t is an unsquared one, so a two-tailed t-test generalises into
a 1-sided interpretation of F. (Sometimes the resulting F will be <1 -
i.e. if |t| < 1 - in this case H0 is simply not rejected.) In this
situation the numerator df is small, one less than the number of groups
being compared.
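The t/F correspondence, and the 1.96 + 2.4/df rule of thumb, are easy to check numerically; a quick sketch using scipy:

```python
from scipy import stats

df = 60
t_crit = stats.t.ppf(0.975, df)    # two-sided 5% critical value for t(60)
f_crit = stats.f.ppf(0.95, 1, df)  # one-tailed 5% critical value for F(1, 60)
print(t_crit, f_crit)              # f_crit is the square of t_crit

# The rule-of-thumb approximation mentioned above
approx = 1.96 + 2.4 / df
print(approx)
```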
Comparing two empirical variance estimates is a totally separate issue.
Here, both df1 and df2 are usually large. It is usual to calculate F =
max/min, then refer to F tables with the appropriate df, and this is
then a ONE sided p-value for comparing them. When n1 and n2 are unequal,
the df in the numerator and denominator will depend on which sample
variance is the larger. I think it's normal to double the 1-sided
p-value, for consistency, but a case could be made for adding an
alternate-tail probability relating to 1/F.
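A sketch of the max/min variance-ratio test as just described, with the one-sided p-value doubled (scipy-based; the helper name is mine):

```python
import numpy as np
from scipy import stats

def var_ratio_test(x, y):
    """Two-sided F test of equal variances: F = larger s^2 / smaller s^2,
    one-tailed p from the F distribution with the matching df, then doubled."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    v1, v2 = x.var(ddof=1), y.var(ddof=1)
    if v1 >= v2:
        f, dfn, dfd = v1 / v2, len(x) - 1, len(y) - 1
    else:
        f, dfn, dfd = v2 / v1, len(y) - 1, len(x) - 1
    p = min(1.0, 2 * stats.f.sf(f, dfn, dfd))
    return f, p
```

Note how the numerator and denominator df follow whichever sample variance turned out larger, exactly as described above.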
HOWEVER, I wouldn't recommend this test. The snag is that it is highly
non-robust, extremely sensitive to departures from the tacitly assumed
Gaussian distributional form. In fact, it works just as effectively as a
test for normality as for heteroscedasticity (the two tend to co-exist
anyway). If you really want to compare the spread of two samples, only
(i.e. disregarding location), what is needed is something much more
robust. One possibility is the ancillary Levene test that SPSS uses to
help choose between the equal- and unequal-variances t-tests (the
classical t-test and the Welch test). Pretend you're going to compare the two samples
for location using a t-test, but disregard all the output apart from the
first 2 columns of the pivot table that give the ancillary test. (When
using the SPSS unpaired t routine I always disregard this test as such,
as I prefer to use the more robust unequal-variances form of the test -
unless we have to generalise into an ANOVA model. Like ancillary tests
in general, it is more likely to signal cause for concern by p<0.05 when
sample sizes are large, but that is precisely when there is less concern
- so such tests are arguably unhelpful to the issue of comparing means.)
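If you do want the Levene test without going through a t-test routine, scipy offers it directly; a minimal sketch (the data are illustrative; the median-centred variant, sometimes called Brown-Forsythe, is generally the more robust choice, though I believe SPSS centres on the mean by default):

```python
from scipy import stats

# Illustrative samples with similar locations but different spreads
x = [12.1, 14.3, 11.8, 13.0, 15.2, 12.7]
y = [10.5, 18.9, 9.2, 16.4, 21.1, 8.8]

# center='median' gives the Brown-Forsythe (robust) variant
stat, p = stats.levene(x, y, center='median')
print(stat, p)
```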
Hope this helps.
Robert G. Newcombe PhD CStat FFPH
Professor of Medical Statistics
Department of Primary Care and Public Health
Centre for Health Sciences Research
Cardiff University
4th floor, Neuadd Meirionnydd
Heath Park, Cardiff CF14 4YS
Tel: 029 2068 7247
Fax: 029 2068 7236
Home page
http://www.cardiff.ac.uk/medicine/epidemiology_statistics/research/statistics/newcombe
For location see
http://www.cardiff.ac.uk/locations/maps/heathpark/index.html
>>> Jay Warner <[log in to unmask]> 10/06/07 06:23:16 >>>
I believe the convention is to always use the F value larger than 1
(i.e., select var-1 and var-2 so that the F ratio is > 1). Adding
this information forces the F-test to be one tailed.
I believe it came from the days when the tables didn't always have
enough precision in the smaller values of F, below 1. But what do I
know -- I wasn't there.
This is neither more nor less conservative.
Of course, if you are going to set up CI's for the variance, then you
need both.
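A sketch of such a two-tailed confidence interval for a ratio of variances, which does use both tails of F (scipy-based; the helper name and inputs are mine, for illustration):

```python
from scipy import stats

def var_ratio_ci(s2_1, n1, s2_2, n2, conf=0.95):
    """Confidence interval for sigma1^2/sigma2^2 from two sample
    variances; unlike the one-tailed test, this needs both F tails."""
    alpha = 1 - conf
    ratio = s2_1 / s2_2
    lo = ratio / stats.f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)
    hi = ratio / stats.f.ppf(alpha / 2, n1 - 1, n2 - 1)
    return lo, hi

# e.g. sample variances 4.0 (n=20) and 2.0 (n=25)
print(var_ratio_ci(4.0, 20, 2.0, 25))
```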
On Jun 9, 2007, at 8:28 PM, David B. Klein <[log in to unmask]> wrote:
> I'm wondering what the rationale would be for using one-tailed
> hypotheses on an F-ratio test of variances as a default. I notice
> Excel does it this way. It's more conservative, to be sure, but why
> not just lower the alpha level if that's what you're after? I don't
> see what in this situation implies a one-tailed test ... you are
> interested in equality of variances, not in one being greater than
> the other. (?)