Hi All,

I have a question regarding an outlier-filtering algorithm for
extremely skewed data. I am currently using the following filter limits:

1) ( Q1-1.5*IQR, Q3+1.5*IQR )

where Q1 = 25th percentile, Q2 = 50th percentile = median, Q3 = 75th percentile,
and IQR = Q3 - Q1.

As I understand it, this is the standard rule used in box-and-whisker plots to
flag outliers. It works fine as long as my data are not too skewed. However, I
have data with lots of zeroes and long right tails. If Q1=Q2=Q3=0, then IQR=0,
both limits collapse to 0, and every non-zero value is flagged as an outlier!
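
To make the failure concrete, here is a minimal Python sketch, assuming NumPy is available. The function name tukey_fences and the sample values are made up purely for illustration; they are not my real data.

```python
import numpy as np

def tukey_fences(x, k=1.5):
    """Return the limits (Q1 - k*IQR, Q3 + k*IQR) for a 1-D sample."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical zero-inflated sample: 80 zeros plus a long right tail.
tail = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89,
        144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946]
x = np.array([0.0] * 80 + tail)

lower, upper = tukey_fences(x)

# With Q1 = Q3 = 0 the IQR is 0, so both limits are 0 and
# every non-zero observation falls outside them.
flagged = x[(x < lower) | (x > upper)]
print(lower, upper, flagged.size)
```

Since more than 75% of the values are zero, the limits carry no information about the tail at all.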

I have tried applying a transformation such as log(1+x) to the data,
computing the outlier limits in the transformed space, and then
back-transforming. This helps, but because log(1+0)=0 and the transformation
preserves the ordering of the data, I still have situations in the transformed
space where Q1=Q2=Q3=0!
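
The transformed version can be sketched in Python as follows, again assuming NumPy. The name log1p_fences and both samples are hypothetical, chosen only to show one case where the transformation helps and one where it does not.

```python
import numpy as np

def log1p_fences(x, k=1.5):
    """Compute the Q1 - k*IQR / Q3 + k*IQR limits on log(1+x),
    then map them back to the original scale with exp(y) - 1."""
    y = np.log1p(x)
    q1, q3 = np.percentile(y, [25, 75])
    iqr = q3 - q1
    return np.expm1(q1 - k * iqr), np.expm1(q3 + k * iqr)

# Hypothetical sample with 40% zeros: Q1 is 0 but Q3 is not,
# so the IQR is positive and the upper limit is usable.
moderate = np.array([0.0] * 40 + [float(v) for v in range(1, 61)])
lo_mod, hi_mod = log1p_fences(moderate)

# Hypothetical sample with 80% zeros: the quartiles of log(1+x)
# are all 0, so the back-transformed limits are both 0 again.
heavy = np.array([0.0] * 80 + [float(v) for v in range(1, 21)])
lo_heavy, hi_heavy = log1p_fences(heavy)

print(lo_mod, hi_mod)
print(lo_heavy, hi_heavy)
```

Any monotone transformation that sends 0 to a single value has the same limitation once at least 75% of the observations are zero, because the quartiles are determined by rank alone.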

Are there other transformations I can use or alternative methods?

I know there is no substitute for plotting the data and sanity-checking it
visually, but that is not possible in this case.

Thanks in advance

Regards,

Richard