ALLSTAT Archives

allstat@JISCMAIL.AC.UK


Subject: Non Parametric assumptions: replies
From: K F Pearce <[log in to unmask]>
Date: Mon, 24 May 2004 12:06:17 +0100

Dear all,

Here's a list of all the replies I got to my question.

Many thanks to all those who responded!

All the Best,
Kim.
****************************
My original question:
Hello everyone

Just a small query about non parametric statistics.....

It has been indicated to me that, when testing for differences between 2
groups, the Wilcoxon Signed Ranks Test and the Mann Whitney Test assume
that the distributions of the two groups have the same shape.  Even
though these distributions do not have to be normal, do they have to
have only 'one peak'?

Also, for the Mann Whitney test and Wilcoxon Signed Ranks Test it is
stated in one text that the variables can be ordinal and in another text
(when talking about the Mann Whitney test) it is also stated that the
variable does not have to be on an interval scale; an ordinal scale is
sufficient.  However, I have been in conversation with various bods who
say that a continuous scale is assumed.  Can anyone shed some light?

Many thanks,
Kim.
**************************
For formal assumptions of the tests, I would check a text like
Hollander.

But more broadly, if one or both distributions have more than one peak
then what difference between the two distributions do you want to test?
************************************
Kim,


> -----Original Message-----
> From: K F Pearce [mailto:[log in to unmask]]
> Sent: Thursday, May 06, 2004 3:49 PM
> To: [log in to unmask]
> Subject: Non parametric assumptions
> 
> 
> Hello everyone
> 
> Just a small query about non parametric statistics.....
> 
> It has been indicated to me that, when testing for differences
> between 2 groups, the Wilcoxon Signed Ranks Test and the Mann Whitney
> Test assume that the distributions of the two groups have the same
> shape.  Even though these distributions do not have to be normal, do
> they have to have only 'one peak'?

Indeed, the distributions need to have the same shape. But nothing else
is required, i.e. they may even have more than one peak.
 
> Also, for the Mann Whitney test and Wilcoxon Signed Ranks Test it is
> stated in one text that the variables can be ordinal and in another
> text (when talking about the Mann Whitney test) it is also stated that
> the variable does not have to be on an interval scale; an ordinal
> scale is sufficient.  However, I have been in conversation with
> various bods who say that a continuous scale is assumed.  Can anyone
> shed some light?

From a theoretical point of view, continuous scales are required as ties
are not allowed (even though corrections for ties are available).
However, in applications this can be weakened in case the "underlying"
distribution is continuous. Assume the sweetness of something is rated
on an ordinal scale from 1 to 10. Then, in fact, there may be more than
just 10 different values of the "true sweetness", i.e. sweetness is
continuous, you only measured it on an ordinal scale. In that case,
Wilcoxon is okay. However, you will have to ensure that the number of
ties remains small, otherwise this does not hold anymore. For example,
with just 4 categories and 20 observations, say, I would not recommend
the use of Wilcoxon anymore.
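
To make "the number of ties remains small" concrete, here is a minimal sketch in plain Python that reports how many observations are tied and shows the mid-ranks a rank test would assign to them. The function names and the ratings are invented for illustration; they are not taken from any reply or package.

```python
from collections import Counter

def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def tied_fraction(values):
    """Fraction of observations sharing their value with at least one other."""
    counts = Counter(values)
    return sum(c for c in counts.values() if c > 1) / len(values)

# Hypothetical ratings, invented for illustration only.
fine_scale = [1, 3, 4, 6, 7, 8, 9, 10, 2, 5]   # 1-10 scale, no repeats
coarse_scale = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1]  # only 4 categories

print(tied_fraction(fine_scale))    # 0.0
print(tied_fraction(coarse_scale))  # 1.0
print(midranks([2, 1, 2, 3]))       # [2.5, 1.0, 2.5, 4.0]
```

With only 4 categories every observation in the second sample is tied with another, which is the situation the reply warns about.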

Best, 
****************************************************
Hello Kim

The Wilcoxon test does not assume that the 2 distributions are unimodal.

Also, there are ways of correcting for ties if the Y-variable is not 
continuous.

The main problem with the Wilcoxon test is that it is only a test, and
does not provide confidence intervals for a parameter. There are 2
possible confidence intervals corresponding to the Wilcoxon test. These
are for Somers' D and for the Hodges-Lehmann median difference. It is
possible to calculate confidence intervals for these without even
assuming that the 2 groups have the same shape. More about this can be
found in Newson (2002). A pre-publication draft of this can be
downloaded from my website (see below), and so can a few more documents
on rank statistics.

I hope this helps.



References

Newson R. Parameters behind "nonparametric" statistics: Kendall's tau, 
Somers' D and median differences. The Stata Journal 2002; 2(1): 45-64.
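
As a rough illustration of the two parameters named in this reply, the sketch below computes point estimates only: the Hodges-Lehmann median of all pairwise differences, and Somers' D taken here as the proportion of cross-group pairs where Y exceeds X minus the proportion where X exceeds Y. It does not reproduce the confidence-interval methods of Newson (2002); the function names and data are assumptions made for the example.

```python
from statistics import median

def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences y_j - x_i (point estimate only)."""
    return median(yj - xi for xi in x for yj in y)

def somers_d(x, y):
    """Estimate P(Y > X) - P(Y < X) over all cross-group pairs."""
    y_higher = sum(yj > xi for xi in x for yj in y)
    x_higher = sum(yj < xi for xi in x for yj in y)
    return (y_higher - x_higher) / (len(x) * len(y))

# Hypothetical samples, invented for illustration only.
x = [1, 3, 5, 7]
y = [4, 6, 8, 10]

print(hodges_lehmann_shift(x, y))  # 3.0
print(somers_d(x, y))              # 0.625
```

Neither quantity requires the two groups to have the same shape in order to be defined, which is the point made above.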
***********************************************
Kim,

This one comes up regularly.

It all depends on how you state your null hypothesis.  If your null
hypothesis is that the probability that a member of one population will
exceed a member of the other is equal to the probability that a member of
one population will be less than a member of the other, then you need
make no distributional assumptions.  This is sometimes written as the
two populations are not stochastically different.

If your null hypothesis is that the probability that a member of one
population will exceed a member of the other is equal to one half, then
you need only assume that there are no ties, i.e. that the distribution
is continuous (always an approximation, of course).  In practice, the
exact distributions for U and T are worked out for no ties and we
approximate when there are ties, which is usually the case in my
experience.
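
The null hypotheses above are stated in terms of the probability that a member of one population exceeds a member of the other. A minimal sketch of how the Mann-Whitney U statistic estimates that probability, assuming the common convention that a tied pair counts one half (function names and data are invented for the example):

```python
def mann_whitney_u(x, y):
    """U for x: number of (x_i, y_j) pairs with x_i > y_j, ties counting 1/2."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )

def prob_x_exceeds_y(x, y):
    """U / (m * n): estimates P(X > Y) + 0.5 * P(X = Y)."""
    return mann_whitney_u(x, y) / (len(x) * len(y))

# Hypothetical samples, invented for illustration only.
x = [1, 2, 3, 4]
y = [2, 3, 5, 6]

print(mann_whitney_u(x, y))    # 4.0
print(mann_whitney_u(y, x))    # 12.0
print(prob_x_exceeds_y(x, y))  # 0.25
```

The two U values always sum to m * n, so a value of U / (m * n) near one half corresponds to the two probabilities being equal. Means, medians and shapes do not enter the calculation.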

If your null hypothesis is that the means of the two populations are the
same, you must assume that the distributions have the same shape,
differing only in location.  These tests are sometimes described as
tests of inequality of medians because under this assumption the
difference between the medians is equal to the difference between the
means.  If the assumption is true, the variances must be the same.  This
is unlikely if the distribution is not Normal.

In my book An Introduction to Medical Statistics, 3rd ed., I have an
example of a Mann Whitney U test which is significant even though the
medians are equal.  Nearly all the observations were zero.

Note that the means, medians, and shape of the distribution do not
appear in the calculations for these tests.  We only need assumptions
about them if we want to draw conclusions about them.

I hope this helps,
***********************************************************
Kim

My understanding is as follows:

The Mann-Whitney test assumes the two samples are drawn from identical
populations (i.e. that's the assumption of the null hypothesis).  This
'common' population can be of any shape.  The 'simplest' alternative
hypothesis is therefore that the two populations have the same shape,
but are 'shifted' relative to one another.

The Wilcoxon signed ranks test assumes that the distribution of
differences (between the data pairs) is symmetrical (of any shape).

The Mann-Whitney test assumes ordinal data, and that the data can be
ranked without ties -- so, not necessarily an interval scale, but
capable of an unambiguous ranking.  Hence many stats packages apply a
correction for 'ties' when estimating the significance level.

The Wilcoxon signed rank test makes similar assumptions about the
differences (between the data pairs) -- which implies that the original
measurements, from which the differences are calculated, must be at
least interval in scale.  (So, all in all, Wilcoxon signed ranks is not
as 'assumption-free' as many people think.)
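
A small sketch of the signed-rank calculation shows where the interval-scale requirement enters: the paired values are subtracted before anything is ranked. This is a plain-Python illustration with invented data and an invented function name; it drops zero differences and uses mid-ranks for tied absolute differences, which is one common convention rather than the only one.

```python
def signed_rank_sums(before, after):
    """Return (W+, W-): rank sums of the positive and negative differences."""
    # The subtraction is only meaningful if the scale is at least interval.
    diffs = [a - b for b, a in zip(before, after) if a != b]  # zeros dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Group equal absolute differences so they share a mid-rank.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical paired measurements, invented for illustration only.
before = [10, 12, 9, 14, 11]
after = [12, 15, 8, 18, 11]

print(signed_rank_sums(before, after))  # (9.0, 1.0)
```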
*************************************************
Hi Kim
I had until now always assumed that one used Mann-Whitney and Wilcoxon
tests for ordinal data as well as non-normal continuous data, given that
the tests use ranks not absolute values (thus in my mind retaining the
features of an ordinal scale). I would be extremely interested in any
advice you receive on this topic, in case this assumption turns out to
be misguided...
Thanks
***********************************
Hi Kim,
It does not matter what the shape of the distribution is so long as (in
the null hypothesis being tested) it is the same distribution for both
samples. The continuity consideration comes in because the standard
computation of the distribution of the M-W/W test assumes that there
will be no ties (equal observations); this is assured for continuous
distributions (though observations rounded to a relatively small number
of significant figures, as most are, could give a few ties).

For ordinal variables, which typically do not have many categories, the
probability of a tie may be considerable, and in that case the standard
distribution will not be right (though how far it is wrong will depend
on the probability of ties).

There are various approaches to handling ties. One which makes sense in
these days when computer simulation is easy and quick is to break the
tie-clusters randomly (e.g. a cluster [ABB] could be A<B<B, B<A<B,
B<B<A) and re-compute the M-W statistic each time. You will end up with
a simulated distribution of P-values, which will indicate the loss of
information due to ties.

The rationale for this approach, if adopted, is that the data record a
more or less coarse grouping of an underlying continuous variable which
has not been directly observed. If you could see inside the bin, you
would be able to separate the results; but you can't. So you simulate it
by random tie-breaking. OK if the grouping is not really coarse.
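
The random tie-breaking idea can be sketched as follows. This is only an illustration under stated assumptions: ties are broken by attaching an independent random key to each observation, and the P-value comes from the usual no-ties normal approximation to U without a continuity correction, which is rough for small samples. The function names and the data are invented for the example.

```python
import math
import random
from statistics import median

def u_statistic(a, b):
    """Number of (a_i, b_j) pairs with b_j > a_i (inputs must have no ties)."""
    return sum(bj > ai for ai in a for bj in b)

def normal_approx_p(u, m, n):
    """Two-sided P-value from the no-ties normal approximation to U."""
    mean = m * n / 2
    sd = math.sqrt(m * n * (m + n + 1) / 12)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))

def tie_broken_p_values(a, b, n_sim=1000, seed=0):
    """Break ties at random n_sim times and collect the resulting P-values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_sim):
        # Pair each value with a random key; tuples compare by value first,
        # then by key, so equal values are ordered at random.
        a_keyed = [(v, rng.random()) for v in a]
        b_keyed = [(v, rng.random()) for v in b]
        u = u_statistic(a_keyed, b_keyed)
        p_values.append(normal_approx_p(u, len(a), len(b)))
    return p_values

# Hypothetical coarsely grouped data, invented for illustration only.
group_a = [1, 2, 2, 3, 3, 3, 4]
group_b = [2, 3, 3, 4, 4, 5, 5]

p_values = tie_broken_p_values(group_a, group_b)
print(min(p_values), median(p_values), max(p_values))
```

The spread between the smallest and largest simulated P-value gives a feel for how much information the ties have cost. With untied data every simulation returns the same P-value.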

Such a rationale, however, is not applicable to categorical variables
which, even if ordered, do not naturally correspond to an underlying
continuous variable.

For instance, you may have decided that listening to a classical concert
on the radio is "more cultured" than listening to "Moneybox" which is
"more cultured" than listening to pop music. So you go and sample people
from two neighbourhoods A and B and ask, for these categories, which one
they most recently listened to. Then you can test whether A is "more
cultured" than B.

You will have a lot of ties; there is no obvious underlying variable;
and, even if you use a Mann-Whitney type of statistic (sum of all (A,B)
pairs in which the B response is "more cultured" than the A response),
you certainly shouldn't be referring it to the Mann-Whitney
distribution, and tie-breaking for the purpose would be a very dubious
thing to do. You're in contingency-table territory here.
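
To see how heavily tied such data are, one can count the (A,B) pairs directly from the table of category counts. The sketch below is illustrative only: the counts are invented, the function name is an assumption, and the three categories are taken in the order pop music, "Moneybox", classical concert.

```python
def pair_counts(counts_a, counts_b):
    """Count cross-neighbourhood pairs from ordered category counts.

    counts_a[k] and counts_b[k] are the numbers of respondents in ordered
    category k (lowest first). Returns (B higher, A higher, tied).
    """
    b_higher = a_higher = tied = 0
    for i, n_a in enumerate(counts_a):
        for j, n_b in enumerate(counts_b):
            if j > i:
                b_higher += n_a * n_b
            elif j < i:
                a_higher += n_a * n_b
            else:
                tied += n_a * n_b
    return b_higher, a_higher, tied

# Hypothetical counts for (pop, Moneybox, classical), invented for illustration.
counts_a = [10, 6, 4]
counts_b = [5, 7, 8]

print(pair_counts(counts_a, counts_b))  # (198, 78, 124)
```

Here 124 of the 400 pairs are tied, so nearly a third of the comparisons carry no ordering information at all.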

So there's a line to be drawn.

Hope this helps,
******************************************************
Kim,

Distributions for the Mann-Whitney and Wilcoxon matched-pairs
signed-ranks tests do need to be the same for both groups, as you say,
but they do not need to be unimodal.

The minimum level of measurement for both tests is ordinal. The data do 
NOT have to be continuous. (They are often used for ranks, which are 
definitely discrete).
*****************************************************************
Hi Kim

Not sure about the answer to your 'one peak' question (though the tests
are for location of the median, which may not be particularly helpful
for multimodal distributions).  I suspect the answer to the issue of
whether variables have to be continuous or just ordinal is tied up with
ties; the tests work by ranking observations, so if several
observations have the same value and so the same rank, you lose a bit
of precision (? not sure that's the right technical term).  If the scales
only have a small number of categories, you'll be getting a lot of
ties, and this may become a serious issue.

I'd be interested to hear what else people have to say on this.
********************************************************
Kim

You ask 'can anyone shed some light?'. I suspect we may spread more fog,
but I will have a go.

The 'Mann-Whitney' (and Wilcoxon rank sum) test does not, as far as I
can tell, make any assumptions about the shapes of the distributions. Nor
does it assume that they are the same shape; this is often assumed, but
the test will detect a wide range of alternatives.

It is not a test for differences in mean, or median, of the
distributions; rather it is a test that the median of differences is
zero. Usually this is associated with differences in the 2 medians (and
means) but not necessarily.

Therefore it can be applied meaningfully to ordinal variables and
discrete distributions; however you will get a lot of ties and you need
to know how to allow for these. The best thing is probably to get some
decent software and rely upon the methods it uses.

Personally I am not convinced that there is much difference between
'ordinal' and 'interval' scales; the real difference is between discrete
and continuous scales, and all actual measurements are discrete.

The Wilcoxon Signed Ranks Test, on the other hand, is a test of
differences: like a paired t-test, it is actually a one-sample test.
The only assumptions it makes, I think, are that the differences are
independent and symmetrically distributed. How often one comes across
situations where these are valid but the data are too far from normal to
use a paired t-test is another matter.

I am sure you will get lots of different answers as well.

Regards
