Just spoke to Paolo, who developed the model. He thinks there is an issue, as spotted by Kevin. Because he never received a copy of the User Manual for the software, he cannot verify what the problem is. I will send him a copy and circulate the response.

 

Adam

 

 

 

--

Adam Czarnecki

Divisional Director
Clancy Consulting Ltd.
Dunham Court
2, Dunham Road
Altrincham
Cheshire
WA14 4NX

Tel: 0161 613 6000
Fax: 0161 613 6099

Clancy Consulting Ltd.
Registered Office: 2 Dunham Road, Altrincham, Cheshire, WA14 4NX
Registered in England No: 3693529


 


From: Contaminated Land Management Discussion List [mailto:[log in to unmask]] On Behalf Of Kevin Privett
Sent: 23 July 2008 12:28
To: [log in to unmask]
Subject: CIEH stats - outlier test

 

Can anyone offer clarification of the outlier test?

 

Appendix B (3.) of the CIEH guidance states that the Grubbs’ Test assumes the other data values in a dataset, except for the suspect observation, are normally distributed. It tells you to check the normality of the remaining dataset using the method in Appendix C.

 

Appendix B (4.) states that if the (remaining) dataset is non-normal, you should consider using a log transform and check whether the transformed data are normal.

 

However, when you use the Statistics Calculator spreadsheet it appears to do something different.

 

On the outlier test sheet there is a drop-down with two choices: “use normal distribution to check for outliers” or “use log-normal distribution to check for outliers”.

 

This appears to be checking for outliers based on the distribution of the whole dataset, i.e. including any suspect values, not on the dataset once outliers are removed.

 

Here is an example: take the following dataset and assume the critical value (SGV) is 20:

14, 9, 13, 19, 14, 14, 14, 11, 18, 18, 28, 38

 

If you follow the instructions for the stats calculator, the summary page tells you this is a non-normal dataset, so you choose “log-normal to check for outliers” and it says there are no outliers. As it is non-normal, the Chebychev Test is used to calculate the UCL (=27.6), which exceeds the critical value. Let’s say that means an exceedance of SGV so remediation is required for planning purposes. [Conclusion 1]
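For anyone wanting to check the Conclusion 1 figure by hand, the following is a minimal Python sketch of a one-sided Chebychev UCL on the mean. It assumes a 95% confidence level and the usual form mean + sqrt(1/alpha - 1) * s / sqrt(n); whether the Statistics Calculator uses exactly this form is an assumption, and the function name is purely illustrative.

```python
import math
import statistics

# Full dataset from the example, suspect values included
data = [14, 9, 13, 19, 14, 14, 14, 11, 18, 18, 28, 38]
SGV = 20  # critical value assumed in the example


def chebychev_ucl(values, confidence=0.95):
    """One-sided Chebychev upper confidence limit on the mean (assumed form)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    k = math.sqrt(1.0 / (1.0 - confidence) - 1.0)  # sqrt(19) at 95% confidence
    return mean + k * sd / math.sqrt(n)


ucl = chebychev_ucl(data)
print(f"mean = {statistics.mean(data):.2f}, UCL = {ucl:.2f}, exceeds SGV: {ucl > SGV}")
```

Under these assumptions the UCL comes out at roughly 27.66, which agrees with the 27.6 quoted above to within rounding and is above the SGV of 20.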

 

However, if you follow Appendix B you have to determine if the dataset is normal once suspect values have been removed.  The only way I can see to do this, apart from just visual assessment, is to choose “normal distribution to check for outliers” in the calculator.  This procedure indicates 28 and 38 in this dataset are outliers (something that you might suspect simply by looking at the values without having to use the calculator). 
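The difference between the two drop-down choices can be illustrated with a small Python sketch of the Grubbs statistic G = (max - mean) / s, applied repeatedly to the largest value. The critical values are one-sided 5% figures taken from standard Grubbs tables; the significance level, the sequential application and the function name are all assumptions, not a description of what the calculator actually does.

```python
import math
import statistics

# One-sided 5% Grubbs critical values from standard tables (assumed level),
# keyed by sample size; only the sizes needed for this example are listed.
GRUBBS_CRIT_5PCT = {10: 2.176, 11: 2.234, 12: 2.285}


def grubbs_high_outliers(values, transform=None):
    """Repeatedly test the largest value; return the values flagged as outliers."""
    remaining = sorted(values)
    flagged = []
    while len(remaining) in GRUBBS_CRIT_5PCT:
        # Optionally work on transformed data (e.g. natural logs)
        x = [transform(v) for v in remaining] if transform else remaining
        g = (x[-1] - statistics.mean(x)) / statistics.stdev(x)
        if g <= GRUBBS_CRIT_5PCT[len(remaining)]:
            break  # largest value is not an outlier at this level
        flagged.append(remaining.pop())  # drop it and retest the rest
    return flagged


data = [14, 9, 13, 19, 14, 14, 14, 11, 18, 18, 28, 38]
normal_outliers = grubbs_high_outliers(data)           # normal assumption
log_outliers = grubbs_high_outliers(data, math.log)    # log-normal assumption
print("normal:", normal_outliers, "log-normal:", log_outliers)
```

With these assumed critical values the normal option flags 38 and then 28, while the log-normal option flags nothing, which matches the behaviour described above. The sketch only tests high values; a fuller version would also test the minimum.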

 

Now here is the interesting bit. If you remove these two outliers (let’s assume they represent part of another dataset) the remaining data are normally distributed. According to my reading of Appendix B (3.) this means that the Grubbs’ Test for outliers is appropriate (i.e. the use of the calculator’s “normal distribution” method, which removes 28 and 38, is justified). It also means that the one-sample t-test is used to calculate the UCL (16.2), which is less than the critical value. Let’s say this means the bulk of the site does not exceed the SGV but there are potentially 2 hotspots of 28 & 38 requiring remediation. [Conclusion 2]
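The Conclusion 2 figure can be checked in the same way. This is a minimal Python sketch of a one-sided 95% UCL on the mean based on Student's t, using the tabulated value t = 1.833 for 9 degrees of freedom. The 95% level and the function name are assumptions for illustration.

```python
import math
import statistics

T_CRIT_95_DF9 = 1.833  # one-sided 95% Student's t quantile, 9 d.o.f. (table value)

# Dataset with the two suspect values (28 and 38) removed
trimmed = [14, 9, 13, 19, 14, 14, 14, 11, 18, 18]
SGV = 20  # critical value assumed in the example


def t_ucl(values, t_crit):
    """Upper confidence limit on the mean: mean + t * s / sqrt(n)."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return statistics.mean(values) + t_crit * sd / math.sqrt(n)


ucl_trimmed = t_ucl(trimmed, T_CRIT_95_DF9)
print(f"mean = {statistics.mean(trimmed):.2f}, UCL = {ucl_trimmed:.2f}, "
      f"below SGV: {ucl_trimmed < SGV}")
```

Under these assumptions the UCL is about 16.24, consistent with the 16.2 quoted above and below the SGV of 20.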

 

So, if you follow the calculator instructions you arrive at Conclusion 1 but if you follow Appendix B of the guidance you arrive at Conclusion 2. 

 

Am I doing something stupid, have I missed something, or is there an inconsistency between the approaches in the guidance and the calculator? 

 

Feedback would be welcomed.

 

 

 

Regards,

Kevin Privett.

 

Dr Kevin Privett

Geo-Environmental Associate

 

Hydrock Consultants Ltd

Over Court Barns

Over Lane

Almondsbury

Bristol

BS32 4DF

 

Tel: (01454) 619533

Fax: (01454) 614125

[log in to unmask]

Cell phone: (07799) 430870

 

Offices in Bristol, Plymouth, Northampton, Stoke-on-Trent.  www.hydrock.com

 
