Just spoke to Paolo, who developed the model. He thinks there is an issue
that was spotted by Kevin. As he never received a copy of the User Manual for
the software, he cannot verify what the problem is. I will send him a copy and
circulate the response.
Adam
--
Adam Czarnecki
Divisional Director
Clancy Consulting Ltd.
2,
Altrincham
WA14 4NX
Tel: 0161 613 6000
Fax: 0161 613 6099
From: Contaminated Land Management Discussion List [mailto:[log in to unmask]] On Behalf Of
Sent: 23 July 2008 12:28
To: [log in to unmask]
Subject: CIEH stats - outlier test
Can anyone offer clarification of the outlier test?
Appendix B (3.) of the CIEH guidance states that
Grubbs' Test assumes the other data values in a dataset, except for the
suspect observation, are normally distributed. It tells you to check the
normality of the remaining dataset using the method in Appendix C.
Appendix B (4.) states that if the (remaining) dataset is
non-normal, consider using a log transform and check if this is normal.
However, when you use the Statistics Calculator spreadsheet
it appears to do something different.
On the outlier test sheet you have a choice of drop-down
“use normal distribution to check for outliers” or “use
log-normal distribution to check for outliers”.
This appears to be checking for outliers based on the
distribution of the whole dataset, i.e. including any suspect values, not on the
dataset once outliers are removed.
Here is an example: take the following dataset and assume
the critical value (SGV) is 20:-
14, 9, 13, 19, 14, 14, 14, 11, 18, 18, 28, 38
If you follow the instructions for the stats calculator, the summary
page tells you this is a non-normal dataset, so you choose “log-normal to
check for outliers” and it says there are no outliers. As it is
non-normal, the Chebychev Test is used to calculate the UCL (=27.6), which
exceeds the critical value. Let's say that means an exceedance of the SGV, so
remediation is required for planning purposes. [Conclusion 1]
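As a rough check on the UCL quoted for Conclusion 1, here is a minimal sketch of a Chebychev upper confidence limit on the mean. It assumes the usual one-sided form, mean + k * s / sqrt(n) with k = sqrt(1/alpha - 1), which is about 4.36 at 95% confidence. Whether the calculator uses exactly this form is an assumption, and the function name `chebychev_ucl` is made up for illustration. On the dataset above it gives a value a little under 27.7, in line with the 27.6 quoted.

```python
import math
import statistics

# Dataset from the example above; the critical value (SGV) is 20.
data = [14, 9, 13, 19, 14, 14, 14, 11, 18, 18, 28, 38]
sgv = 20

def chebychev_ucl(values, confidence=0.95):
    """One-sided Chebychev UCL of the mean (assumed form: mean + k*s/sqrt(n))."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    k = math.sqrt(1.0 / (1.0 - confidence) - 1.0)  # about 4.36 at 95%
    return mean + k * s / math.sqrt(n)

ucl = chebychev_ucl(data)
print(f"mean = {statistics.mean(data):.2f}, UCL = {ucl:.2f}, exceeds SGV: {ucl > sgv}")
```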
However, if you follow Appendix B you have to determine if
the dataset is normal once suspect values have been removed. The only way
I can see to do this, apart from just visual assessment, is to choose
“normal distribution to check for outliers” in the
calculator. This procedure indicates 28 and 38 in this dataset are
outliers (something that you might suspect simply by looking at the values
without having to use the calculator).
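The two outlier results can be reproduced with a small sketch of Grubbs' test applied to the highest value, repeated until nothing more is flagged. The assumptions here are mine, not the calculator's: the test is run one-sided at the 5% level on the maximum only, it is applied iteratively, and the critical values are the standard tabulated ones for n = 10, 11 and 12. The name `grubbs_outliers` is hypothetical.

```python
import math
import statistics

data = [14, 9, 13, 19, 14, 14, 14, 11, 18, 18, 28, 38]

# One-sided 5% Grubbs critical values by sample size, from standard tables.
# Only the sample sizes needed for this example are listed.
GRUBBS_CRIT_5PCT = {10: 2.176, 11: 2.234, 12: 2.285}

def grubbs_outliers(values, log_scale=False):
    """Repeatedly test the highest value; return the values flagged as outliers."""
    remaining = sorted(values)
    flagged = []
    while len(remaining) in GRUBBS_CRIT_5PCT:
        # Optionally work on natural logs (the "log-normal" option).
        scaled = [math.log(v) for v in remaining] if log_scale else remaining
        mean = statistics.mean(scaled)
        s = statistics.stdev(scaled)
        g = (scaled[-1] - mean) / s  # Grubbs statistic for the maximum
        if g <= GRUBBS_CRIT_5PCT[len(remaining)]:
            break
        flagged.append(remaining.pop())  # drop the flagged maximum and retest
    return flagged

normal_flags = grubbs_outliers(data)
lognormal_flags = grubbs_outliers(data, log_scale=True)
print("normal:", normal_flags, "log-normal:", lognormal_flags)
```

Under these assumptions the untransformed data flag 38 and then 28, while the log-transformed data flag nothing, which matches the behaviour described for the two drop-down options.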
Now here is the interesting bit. If you remove these
two outliers (let's assume they represent part of another dataset), the remaining
data are normally distributed. According to my reading of Appendix B (3.),
this means that Grubbs' Test for outliers is appropriate (i.e. the use of the
calculator’s “normal distribution” method, which removes
28 and 38, is justified). It also means that the one-sample t-test is used
to calculate the UCL (16.2), which is less than the critical value. Let's
say this means the bulk of the site does not exceed the SGV but there are
potentially 2 hotspots of 28 and 38 requiring remediation. [Conclusion 2]
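The UCL for Conclusion 2 can be checked in the same way. This sketch assumes the one-sample t-test UCL takes the usual form mean + t * s / sqrt(n), with the one-sided 95% t value for 9 degrees of freedom (1.833) read from standard tables. The helper name `t_ucl` is hypothetical. With 28 and 38 removed it gives about 16.2, below the critical value of 20.

```python
import math
import statistics

data = [14, 9, 13, 19, 14, 14, 14, 11, 18, 18, 28, 38]
sgv = 20

# Remaining data once the two suspect values are set aside.
trimmed = [v for v in data if v not in (28, 38)]

# One-sided 95% Student's t critical value for 9 degrees of freedom
# (n = 10), taken from standard t tables.
T_95_DF9 = 1.833

def t_ucl(values, t_crit):
    """Upper confidence limit on the mean (assumed form: mean + t*s/sqrt(n))."""
    n = len(values)
    return statistics.mean(values) + t_crit * statistics.stdev(values) / math.sqrt(n)

ucl_trimmed = t_ucl(trimmed, T_95_DF9)
print(f"n = {len(trimmed)}, UCL = {ucl_trimmed:.2f}, exceeds SGV: {ucl_trimmed > sgv}")
```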
So, if you follow the calculator instructions you arrive at
Conclusion 1 but if you follow Appendix B of the guidance you arrive at
Conclusion 2.
Am I doing something stupid, have I missed something, or is
there an inconsistency between the approaches in the guidance and the
calculator?
Feedback would be welcomed.
Regards,
Dr
Geo-Environmental
Associate
Hydrock
Consultants Ltd
Over Court Barns
Over Lane
Almondsbury
BS32 4DF
Tel: (01454) 619533
Fax: (01454) 614125
Cell phone: (07799) 430870