Dear Margaret,

 

(Apologies for replying to the whole list, but I felt duty-bound to do so given that a resource from www.statstutor.ac.uk was mentioned in the original post.)

 

The resource at http://www.statstutor.ac.uk/resources/uploaded/pearsons.pdf refers to significance testing of Pearson’s Correlation Coefficient (PCC). Since that test is typically based on the usual assumption of normality, the normality assumption is required in that case. The coefficient was originally designed to measure the degree of linear association between two variables, so even when you are not undertaking a significance test on PCC, it seems sensible to require that both variables are measured on an interval or ratio scale. That excludes ordinal data, for which Spearman’s Rho is much more appropriate, and nominal data are clearly excluded if we are talking about a linear relationship. I therefore think the advice given in this particular resource on www.statstutor.ac.uk is correct, and it is what I routinely provide to my students. If I have this wrong, I’d be delighted to be corrected by more experienced members of the list.
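
As a sketch of where normality enters, assuming the significance test in question is the usual t-test of a zero population correlation (the notation below is assumed for illustration and is not taken from the resource): for n paired observations, the coefficient and its test statistic are

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
t = r \sqrt{\frac{n - 2}{1 - r^2}} \sim t_{n-2}
\quad \text{under } H_0\colon \rho = 0 .
```

The formula for r is plain arithmetic on the data. It is the reference to the t distribution with n - 2 degrees of freedom that rests on a distributional assumption, which holds under bivariate normality.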

 

The resource at http://www.sheffield.ac.uk/polopoly_fs/1.43991!/file/Tutorial-14-correlation.pdf refers to assuming normality when simply calculating PCC. Whilst I would not personally give that advice, I would temper that by noting that there is a lot of conflicting literature arguing for and against the assumption of normality when simply calculating PCC. Examples include the original paper by Pearson and that by Nefzger, M. D., and Drasgow, J. (1957), "The Needless Assumption of Normality in Pearson's r," The American Psychologist, 12, 623-625, which argues against the assumption of normality.
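
To illustrate the view that simply calculating PCC involves no distributional assumption, here is a minimal Python sketch. The function names (pearson_r, average_ranks, spearman_rho) and the data are hypothetical, chosen only for illustration. Spearman's Rho is computed here as PCC applied to average ranks.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient, computed directly from its definition."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's Rho as Pearson's coefficient applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 increases with x, but not linearly.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Nothing in the calculation refers to a distribution. With these made-up data Spearman's Rho equals 1 because the relationship is perfectly monotonic, while PCC falls below 1 because the relationship is not linear.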

 

I hope this is helpful despite not being able to clear up all the confusion!

 

Best wishes

Alun

------------------------------------------------------------

Dr Alun Owen

Head of Mathematics

University of Worcester, Henwick Grove, Worcester WR2 6AJ

Tel: 01905 542212

 

Need help with Mathematics or Statistics?

www.mathcentre.ac.uk or www.statstutor.ac.uk

 



On Wednesday, 15 July 2015, Margaret MacDougall <[log in to unmask]> wrote:

Hello

 

I have recently encountered some conflicting advice on the assumptions for use of the Pearson correlation coefficient which has led me to question the correct advice to be offering non-specialists.

 

For example, at the site http://www.statstutor.ac.uk/resources/uploaded/pearsons.pdf , the following advice is provided: 

 

"The calculation of Pearson’s correlation coefficient and subsequent significance testing of it requires the following data assumptions to hold: · interval or ratio level; · linearly related; · bivariate normally distributed. In practice the last assumption is checked by requiring both variables to be individually normally distributed (which is a by-product consequence of bivariate normality). Pragmatically Pearson’s correlation coefficient is sensitive to skewed distributions and outliers, thus if we do not have these conditions [skewed distributions and outliers]  we are content. If your data does not meet the above assumptions then use Spearman’s rank correlation!"

 

By contrast, at the site http://www.sheffield.ac.uk/polopoly_fs/1.43991!/file/Tutorial-14-correlation.pdf , the following advice, involving a weaker condition, is provided:

" When calculating the [Pearson] correlation coefficient it is assumed that at least one of the variables is Normally distributed."

 

Is there some way of deciding how to resolve the differences and can anyone suggest how they may have evolved? 

 

Thanks in advance for any relevant help provided.

 

Best wishes

 

Margaret

You may leave the list at any time by sending the command

SIGNOFF allstat

to [log in to unmask], leaving the subject line blank.
