Hi allstat-ers

I would like to be able to highlight "fliers" from within a small (N < 20)
(sampled) subset of (continuous) data. These "fliers" are from a known
physical process, are all in the same direction, and are not representative
of the true subset behaviour. However, there can be several / many "fliers"
per subset.

Q: Are there any papers on multiple-outlier identification?

 I am familiar with Grubbs' method (which is OK at spotting a single
point).
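
 For concreteness, here is a minimal Python sketch of the one-sided Grubbs
 statistic as I understand it: the distance of the largest value from the
 subset mean, in units of the subset sd. The function name and the sample
 data are made up for illustration, and the comparison against a critical
 value (which needs t-distribution quantiles) is left out.

```python
import statistics

def grubbs_statistic_upper(values):
    """One-sided Grubbs statistic: (max - mean) / sd over the whole subset."""
    values = list(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample sd (N-1 denominator)
    return (max(values) - mean) / sd

# Illustrative subset with a single high "flier" (made-up numbers).
data = [10.1, 9.8, 10.3, 10.0, 9.9, 14.2]
g = grubbs_statistic_upper(data)
```

 Because the suspect point itself contributes to the mean and sd, a second
 flier in the same direction inflates both and can hide the first, which is
 why I am asking about multiple-flier extensions.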

 I am also familiar with the idea of 'cross-validation', where the
 "t"-value for each point is formed using the mean & sd of the _other_
 (N-1) subset data points. This type of analysis is already widely used
 for assessing the influence of data points in regression models (this
 is "PRESS" in SAS).
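
 To make the leave-one-out idea concrete, here is a minimal Python sketch
 of what I have in mind. The function name and the sample data are purely
 illustrative (not from any package), and it assumes the remaining N-1
 points are not all identical, otherwise the sd is zero.

```python
import statistics

def leave_one_out_t(values):
    """For each point, return (x - mean_others) / sd_others,
    where mean and sd are computed from the other N-1 points."""
    values = list(values)
    scores = []
    for i, x in enumerate(values):
        others = values[:i] + values[i + 1:]  # drop the i-th point
        mean = statistics.mean(others)
        sd = statistics.stdev(others)  # sample sd of the remaining points
        scores.append((x - mean) / sd)
    return scores

# Illustrative subset with a single high "flier" (made-up numbers).
data = [10.1, 9.8, 10.3, 10.0, 9.9, 14.2]
scores = leave_one_out_t(data)
flagged = [x for x, t in zip(data, scores) if t > 3.0]  # 3.0 is an arbitrary cut-off
```

 This only removes one point at a time, so two or more fliers in the same
 direction still contaminate each other's mean and sd.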

Has either piece of work been extended for multiple "flier" cases, so that
groups of "fliers" from within the subgroup can be highlighted together?


TIA

Dan B

