Karen Rosenbloom writes:
> I am asking for your advice. I have recently conducted a survey of
> attitudes toward research from a professional group. There are some
> outliers (+/- 3SD) that I would eliminate, but others conducting the
> research with me feel that this might be a minority view, and should not
> be eliminated from the dataset... any views or references that I should
> read to confirm my view, or theirs?
I've seen a lot of advice from professional statisticians, but I have never
seen one recommend automatically removing every data point more than three
standard deviations from the mean.
There are several things you can do here. The suggestion in Altman's book
(page 130) is to present an analysis with and without the outliers. Think of
it as a sensitivity analysis. This also allows the readers to make up their
own minds about which analysis to use.
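
As a rough sketch of what a with-and-without analysis might look like, here
is a small Python example. The scores are made up for illustration, and the
helper names (summarize, split_outliers) are my own, not anything from
Altman's book. It simply reports the mean and standard deviation both ways
so a reader can see how much the flagged values matter.

```python
import statistics

def summarize(values):
    """Return (n, mean, sample SD) for a list of numbers."""
    return len(values), statistics.mean(values), statistics.stdev(values)

def split_outliers(values, k=3.0):
    """Split values into (kept, flagged) using a k-SD rule around the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    kept = [v for v in values if abs(v - mean) <= k * sd]
    flagged = [v for v in values if abs(v - mean) > k * sd]
    return kept, flagged

# Made-up attitude scores, for illustration only.
scores = [4, 5, 5, 6, 4, 5, 6, 5, 4, 5, 6, 5, 4, 5, 5, 6, 4, 5, 6, 5, 30]

kept, flagged = split_outliers(scores)

# Report both analyses side by side rather than choosing one silently.
for label, data in [("all data", scores), ("outliers removed", kept)]:
    n, mean, sd = summarize(data)
    print(f"{label}: n={n}, mean={mean:.2f}, SD={sd:.2f}")
print("flagged values:", flagged)
```

If the two sets of results lead to the same conclusion, the outliers are not
driving the findings. If they differ, that difference is itself worth
reporting.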
My suggestion is to investigate any outliers carefully and fix any obvious
typos. Never remove an outlier unless there is a good MEDICAL reason to do
so. Even then you might want to think twice.
Removing outliers is a deviation from the protocol and can lead to serious
biases. One possible exception is when you specify in your original protocol
how outliers will be handled (e.g., any lab result outside three standard
deviations will automatically be re-tested).
The other thing to keep in mind is that sometimes the outliers are more
interesting than the rest of the data. Think about AIDS patients who survive
much longer than their peers. Why would you want to remove these "outliers"?
These are the people you should be studying the most.
Also look for other factors that might explain the outliers. Are all the
outliers in upper management? Maybe you have a Dilbert factor going on here.
Look for any demographics that might be related to these unusual data
values. If outliers are exclusively of a certain race or gender, you have a
great discovery that you might have missed if you just mindlessly tossed out
the data.
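
One way to check for this kind of pattern is to cross-tabulate the outlier
flag against each demographic variable. The Python sketch below assumes
hypothetical records with a role and a score. The field names, the roles,
and every value are invented for illustration.

```python
import statistics
from collections import Counter

# Hypothetical records: (role, score). All values are invented.
records = (
    [("staff", s) for s in [4, 5, 6] * 10]
    + [("upper management", 5)] * 3
    + [("upper management", 30)] * 2
)

scores = [score for _, score in records]
mean = statistics.mean(scores)
sd = statistics.stdev(scores)

# Count outliers and non-outliers within each role.
table = Counter()
for role, score in records:
    is_outlier = abs(score - mean) > 3 * sd
    table[(role, is_outlier)] += 1

for role in sorted({r for r, _ in records}):
    flagged = table[(role, True)]
    total = flagged + table[(role, False)]
    print(f"{role}: {flagged} of {total} flagged as outliers")
```

If the flagged values cluster in one group, as they do in this invented
example, that is a finding to investigate, not a reason to delete data.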
There are a lot of good approaches, but your original inclination to remove
the outliers without further investigation is not one of them.
I hope this helps.
Altman, Douglas G. (1991) Practical Statistics for Medical Research. London,
England: Chapman and Hall. ISBN: 0-412-27630-5.
For the beginning student. A good general overview of statistics with a lot
of emphasis on practical applications.
Steve Simon, Standard Disclaimer.
STATS - Steve's Attempt to Teach Statistics: http://www.cmh.edu/stats