JISCMail - ALLSTAT Archives

Dear all,

Here's a list of all the replies I got to my question.

Many thanks to all those who responded!

All the Best,
Kim.
****************************
My original question:
Hello everyone

Just a small query about non parametric statistics.....

It has been indicated to me that, when testing for differences between 2
groups, the Wilcoxon Signed Ranks Test and the Mann Whitney Test assume
that the distributions of the two groups have the same shape.  Even
though these distributions do not have to be normal, do they have to
have only 'one-peak'?

Also, for the Mann Whitney test and Wilcoxon Signed Ranks Test it is
stated in one text that the variables can be ordinal and in another text
(when talking about the Mann Whitney test) it is also stated that the
variable does not have to be on an interval scale; an ordinal scale is
sufficient.  However, I have been in conversation with various bods who
say that a continuous scale is assumed.  Can anyone shed some light?

Many thanks,
Kim.
**************************
For formal assumptions of the tests, I would check a text like
Hollander.

But more broadly, if one or both distributions have more than one peak
then what difference between the two distributions do you want to test?
************************************
Kim,


> -----Original Message-----
> From: K F Pearce [mailto:[log in to unmask]]
> Sent: Thursday, May 06, 2004 3:49 PM
> To: [log in to unmask]
> Subject: Non parametric assumptions
> 
> 
> Hello everyone
> 
> Just a small query about non parametric statistics.....
> 
> It has been indicated to me that, when testing for differences between 2
> groups, the Wilcoxon Signed Ranks Test and the Mann Whitney Test assume
> that the distributions of the two groups have the same shape.  Even
> though these distributions do not have to be normal, do they have to
> have only 'one-peak'?

Indeed, the distributions need to have the same shape. But nothing else
is required, i.e. they may even have more than one peak.
 
> Also, for the Mann Whitney test and Wilcoxon Signed Ranks Test it is
> stated in one text that the variables can be ordinal and in another
> text (when talking about the Mann Whitney test) it is also stated that
> the variable does not have to be on an interval scale; an ordinal
> scale is sufficient.  However, I have been in conversation with
> various bods who say that a continuous scale is assumed.  Can anyone
> shed some light?

From a theoretical point of view, continuous scales are required as ties
are not allowed (even though corrections for ties are available).
However, in applications this can be weakened if the "underlying"
distribution is continuous. Assume the sweetness of something is rated
on an ordinal scale from 1 to 10. Then, in fact, the "true sweetness"
may take more than just 10 different values, i.e. sweetness is
continuous; you only measured it on an ordinal scale. In that case,
Wilcoxon is okay. However, you will have to ensure that the number of
ties remains small, otherwise this no longer holds. For example, with
just 4 categories and 20 observations, say, I would not recommend the
use of Wilcoxon.
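
As an illustration of the "corrections for ties" mentioned above, here is a
minimal Python sketch of the usual large-sample (normal) approximation with
midranks and a tie-corrected variance. The function names and the example
data are hypothetical, and this is only one common form of the correction,
not necessarily what any particular package does.

```python
from collections import Counter
import math


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_normal(x, y):
    """Return (U for x, z, two-sided p, tie term) using a tie-corrected variance."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Sum of (t^3 - t) over groups of tied values; zero when there are no ties.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # every observation tied: no evidence either way
        return u, 0.0, 1.0, tie_term
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p, tie_term


# Hypothetical data: 4 categories, 20 observations, hence heavy ties.
x = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4]
y = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print(mann_whitney_normal(x, y))
```

The larger the tie term, the further the corrected variance moves from the
no-ties value, which is one way to see how much the ties matter.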

Best, 
****************************************************
Hello Kim

The Wilcoxon test does not assume that the 2 distributions are unimodal.

Also, there are ways of correcting for ties if the Y-variable is not 
continuous.

The main problem with the Wilcoxon test is that it is only a test, and
does not provide confidence intervals for a parameter. There are 2
possible confidence intervals corresponding to the Wilcoxon test. These
are for Somers' D and for the Hodges-Lehmann median difference. It is
possible to calculate confidence intervals for these without even
assuming that the 2 groups have the same shape. More about this can be
found in Newson (2002). A pre-publication draft of this can be
downloaded from my website (see below), and so can a few more documents
on rank statistics.
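
For readers unfamiliar with the two parameters named above, the following
Python sketch computes point estimates only. The function names are
hypothetical, and the confidence-interval methods described in Newson (2002)
are not reproduced here.

```python
from statistics import median


def somers_d(x, y):
    """Estimate P(X > Y) - P(X < Y) over all (x, y) pairs; ties count zero."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def hodges_lehmann_difference(x, y):
    """Median of all pairwise differences a - b, with a from x and b from y."""
    return median([a - b for a in x for b in y])


# Hypothetical data for illustration.
x = [1, 2, 3]
y = [4, 5, 6]
print(somers_d(x, y), hodges_lehmann_difference(x, y))
```

Neither estimate requires the two groups to have the same shape; the
shape only matters for how the result is interpreted.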

I hope this helps.



References

Newson R. Parameters behind "nonparametric" statistics: Kendall's tau, 
Somers' D and median differences. The Stata Journal 2002; 2(1): 45-64.
***********************************************
Kim,

This one comes up regularly.

It all depends on how you state your null hypothesis.  If your null
hypothesis is that the probability that a member of one population will
exceed a member of the other is equal to the probability that a member of
one population will be less than a member of the other, then you need
make no distributional assumptions.  This is sometimes written as the
two populations are not stochastically different.

If your null hypothesis is that the probability that a member of one
population will exceed a member of the other is equal to one half, then
you need only assume that there are no ties, i.e. that the distribution
is continuous (always an approximation, of course).  In practice, the
exact distributions for U and T are worked out for no ties and we
approximate when there are ties, which is usually the case in my
experience.
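
To show what "worked out for no ties" means for U, here is a small Python
sketch that enumerates the exact null distribution by brute force. The
function names are hypothetical, and the enumeration is only practical for
small samples.

```python
from collections import Counter
from itertools import combinations


def exact_u_distribution(n1, n2):
    """Exact null distribution of U when all n1 + n2 values are distinct.

    Under the null hypothesis every choice of n1 ranks out of 1..n for the
    first sample is equally likely, so we simply enumerate them all.
    """
    n = n1 + n2
    counts = Counter()
    for ranks in combinations(range(1, n + 1), n1):
        counts[sum(ranks) - n1 * (n1 + 1) // 2] += 1
    total = sum(counts.values())
    return {u: c / total for u, c in sorted(counts.items())}


def exact_two_sided_p(u_obs, n1, n2):
    """Probability of a U at least as far from n1*n2/2 as the observed one."""
    centre = n1 * n2 / 2
    dist = exact_u_distribution(n1, n2)
    return sum(p for u, p in dist.items()
               if abs(u - centre) >= abs(u_obs - centre))


print(exact_u_distribution(3, 3))
print(exact_two_sided_p(0, 3, 3))
```

With ties the ranks are no longer a permutation of 1..n, so this
enumeration does not apply as written and an approximation is used instead.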

If your null hypothesis is that the means of the two populations are the
same, you must assume that the distributions have the same shape,
differing only in location.  These tests are sometimes described as
tests of inequality of medians because under this assumption the
difference between the medians is equal to the difference between the
means.  If the assumption is true, the variances must be the same.  This
is unlikely if the distribution is not Normal.

In my book An Introduction to Medical Statistics, 3rd ed., I have an
example of a Mann Whitney U test which is significant even though the
medians are equal.  Nearly all the observations were zero.
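
The following Python sketch shows how this can happen. The data are made up
purely for illustration (they are not the data from the book), and the
p-value uses the tie-corrected normal approximation, which is an assumption
about the method rather than a description of the book's calculation.

```python
from collections import Counter
from statistics import median
import math

# Made-up data for illustration only: most observations are zero.
a = [0] * 20
b = [0] * 14 + [1, 2, 3, 4, 5, 6]

n1, n2 = len(a), len(b)
n = n1 + n2

# U counts the pairs in which the b value exceeds the a value;
# a tied pair counts one half.
u = sum((y > x) + 0.5 * (y == x) for x in a for y in b)

# Tie-corrected variance of U for the normal approximation.
tie_term = sum(t**3 - t for t in Counter(a + b).values())
var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
z = (u - n1 * n2 / 2) / math.sqrt(var_u)
p = math.erfc(abs(z) / math.sqrt(2))

print("medians:", median(a), median(b))
print("U =", u, "z =", round(z, 2), "p =", round(p, 3))
```

Both medians are zero, yet a member of b exceeds a member of a far more
often than the reverse, and that is what the test responds to.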

Note that the means, medians, and shape of the distribution do not
appear in the calculations for these tests.  We only need assumptions
about them if we want to draw conclusions about them.

I hope this helps,
***********************************************************
Kim

My understanding is as follows:

The Mann-Whitney test assumes the two samples are drawn from identical
populations (i.e. that's the assumption of the null hypothesis).  This
'common' population can be of any shape.  The 'simplest' alternative
hypothesis is therefore that the two populations have the same shape,
but are 'shifted' relative to one another.

The Wilcoxon signed ranks test assumes that the distribution of
differences (between the data pairs) is symmetrical (of any shape).

The Mann-Whitney test assumes ordinal data, and that the data can be
ranked without ties -- so, not necessarily an interval scale, but
capable of an unambiguous ranking.  Hence many stats packages apply a
correction for 'ties' when estimating the significance level.

The Wilcoxon signed rank test makes similar assumptions about the
differences (between the data pairs) -- which implies that the original
measurements, from which the differences are calculated, must be at
least interval in scale.  (So, all in all, Wilcoxon signed ranks is not
as 'assumption-free' as many people think.)
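
A short Python sketch makes the point about differences concrete: the
statistic is built from subtracting one measurement from another, which is
only meaningful on at least an interval scale. The function names and data
are hypothetical; the exact p-value enumerates all sign patterns, which
relies on the symmetry assumption and is only practical for small samples.

```python
from itertools import product


def signed_rank_statistic(before, after):
    """Return (W+, ranks of |differences|); zero differences are dropped."""
    diffs = [a - b for a, b in zip(after, before) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def midrank(v):
        first = abs_sorted.index(v) + 1      # 1-based position of first tie
        count = abs_sorted.count(v)          # size of the tied group
        return first + (count - 1) / 2

    ranks = [midrank(abs(d)) for d in diffs]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    return w_plus, ranks


def exact_signed_rank_p(w_plus, ranks):
    """Two-sided exact p-value: under symmetry each sign is + or - with
    probability one half, so all 2^n sign patterns are equally likely."""
    centre = sum(ranks) / 2
    extreme = 0
    total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        total += 1
        extreme += abs(w - centre) >= abs(w_plus - centre)
    return extreme / total


# Hypothetical paired measurements.
before = [10, 12, 9, 11, 14]
after = [12, 15, 10, 15, 19]
w_plus, ranks = signed_rank_statistic(before, after)
print(w_plus, exact_signed_rank_p(w_plus, ranks))
```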
*************************************************
Hi Kim
I had until now always assumed that one used Mann-Whitney and Wilcoxon
tests for ordinal data as well as non-normal continuous data, given that
the tests use ranks not absolute values (thus in my mind retaining the
features of an ordinal scale). I would be extremely interested in any
advice you receive on this topic, in case this assumption turns out to
be misguided...
Thanks
***********************************
Hi Kim,
It does not matter what the shape of the distribution is so long as (in
the null hypothesis being tested) it is the same distribution for both
samples. The continuity consideration comes in because the standard
computation of the distribution of the M-W/W test assumes that there
will be no ties (equal observations); this is assured for continuous
distributions (though observations rounded to a relatively small number
of significant figures, as most are, could give a few ties).

For ordinal variables, which typically do not have many categories, the
probability of a tie may be considerable, and in that case the standard
distribution will not be right (though how far it is wrong will depend
on the probability of ties).

There are various approaches to handling ties. One which makes sense in
these days when computer simulation is easy and quick is to break the
tie-clusters randomly (e.g. a cluster [ABB] could be A<B<B, B<A<B,
B<B<A) and re-compute the M-W statistic each time. You will end up with
a simulated distribution of P-values, which will indicate the loss of
information due to ties.
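
The random tie-breaking idea above can be sketched in Python as follows.
This is only one possible reading of the procedure: the function names, the
example data, and the use of the no-ties normal approximation for each
p-value are all assumptions made for the illustration.

```python
import math
import random


def u_with_random_tiebreak(x, y, rng):
    """Order the pooled data, breaking ties at random, and return U for x."""
    # Each observation gets a random secondary key, so tied values are
    # placed in a random order relative to each other.
    labelled = ([(v, rng.random(), 0) for v in x]
                + [(v, rng.random(), 1) for v in y])
    labelled.sort()
    n1 = len(x)
    rank_sum = sum(rank for rank, (_, _, group)
                   in enumerate(labelled, start=1) if group == 0)
    return rank_sum - n1 * (n1 + 1) / 2


def p_normal_no_ties(u, n1, n2):
    """Two-sided p-value from the normal approximation without ties."""
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(u - mean) / sd / math.sqrt(2))


def simulate_p_values(x, y, n_sim=1000, seed=0):
    """Repeat the random tie-breaking and collect the resulting p-values."""
    rng = random.Random(seed)
    return [p_normal_no_ties(u_with_random_tiebreak(x, y, rng), len(x), len(y))
            for _ in range(n_sim)]


# Hypothetical coarsely grouped data with several ties.
x = [1, 2, 2, 3, 3, 3, 4]
y = [2, 3, 3, 4, 4, 5, 5]
p_values = simulate_p_values(x, y)
print(min(p_values), max(p_values))
```

A wide spread between the smallest and largest simulated p-value suggests
that the ties have cost a lot of information; with no ties at all, every
simulation gives the same value.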

The rationale for this approach, if adopted, is that the data record a
more or less coarse grouping of an underlying continuous variable which
has not been directly observed. If you could see inside the bin, you
would be able to separate the results; but you can't. So you simulate it
by random tie-breaking. OK if the grouping is not really coarse.

Such a rationale, however, is not applicable to categorical variables
which, even if ordered, do not naturally correspond to an underlying
continuous variable.

For instance, you may have decided that listening to a classical concert
on the radio is "more cultured" than listening to "Moneybox" which is
"more cultured" than listening to pop music. So you go and sample people
from two neighbourhoods A and B and ask, for these categories, which one
they most recently listened to. Then you can test whether A is "more
cultured" than B.

You will have a lot of ties; there is no obvious underlying variable;
and, even if you use a Mann-Whitney type of statistic (sum of all (A,B)
pairs in which the B response is "more cultured" than the A response),
you certainly shouldn't be referring it to the Mann-Whitney
distribution, and tie-breaking for the purpose would be a very dubious
thing to do. You're in contingency-table territory here.
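
As a rough idea of what "contingency-table territory" might look like, here
is a Python sketch of a Pearson chi-square test on a 2 x 3 table. The counts
are invented for illustration, the function names are hypothetical, and the
choice of this particular test is an assumption (note that it ignores the
ordering of the categories).

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows.

    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper tail of the chi-square distribution with 2 degrees of freedom,
    which is exactly exp(-stat / 2); a 2 x 3 table has (2-1)*(3-1) = 2 df."""
    return math.exp(-stat / 2)


# Invented counts. Rows: neighbourhoods A and B.
# Columns: pop music, "Moneybox", classical concert.
table = [[30, 12, 8],
         [18, 14, 18]]
stat = pearson_chi_square(table)
print(stat, p_value_df2(stat))
```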

So there's a line to be drawn.

Hope this helps,
******************************************************
Kim,

Distributions for the Mann-Whitney and Wilcoxon matched-pairs
signed-ranks test do need to be the same for both groups, as you say,
but they do not need to be unimodal.

The minimum level of measurement for both tests is ordinal. The data do 
NOT have to be continuous. (They are often used for ranks, which are 
definitely discrete).
*****************************************************************
Hi Kim

Not sure about the answer to your 'one peak' question (though the tests
are for location of the median, which may not be particularly helpful
for multimodal distributions).  I suspect the answer to the issue of
whether variables have to be continuous or just ordinal is tied up with
ties; the tests work by ranking observations, so if several
observations have the same value and so the same rank, you lose a bit
of precision (? not sure that's the right technical term).  If the scales
only have a small number of categories, you'll be getting a lot of
ties, and this may become a serious issue.

I'd be interested to hear what else people have to say on this.
********************************************************
Kim

You ask 'can anyone shed some light?'. I suspect we may spread more fog,
but I will have a go.

The 'Mann-Whitney' (and Wilcoxon rank sum) test does not, as far as I
can tell, make any assumptions about the shapes of the distributions. Nor
does it assume that they are the same shape; this is often assumed but
the test will detect a wide range of alternatives.

It is not a test for differences in mean, or median, of the
distributions; rather it is a test that the median of differences is
zero. Usually this is associated with differences in the 2 medians (and
means) but not necessarily.

Therefore it can be applied meaningfully to ordinal variables and
discrete distributions; however you will get a lot of ties and you need
to know how to allow for these. The best thing is probably to get some
decent software and rely upon the methods it uses.

Personally I am not convinced that there is much difference between
'ordinal' and 'interval' scales; the real difference is between discrete
and continuous scales, and all actual measurements are discrete.

The Wilcoxon Signed Ranks Test, on the other hand, is a test of
differences: like a 'paired t-test' it is actually a one sample test.
The only assumptions it makes, I think, are that the differences are
independent and symmetrically distributed. How often one comes across
situations where these are valid but the data are too far from normal to
use a 'paired t' is another matter.

I am sure you will get lots of different answers as well.

Regards