Scenario: We have approximately 200 data points collected over a long
period of time. The data are right-skewed and there is no rational sub-grouping.
Purpose: To establish whether the process is stable.
Analysis: Plot the data on an Individuals and Moving Range chart. (I am
aware that data do not need to be normally distributed to use control
charts; see Shewhart's original work and current work by Don Wheeler.)
Traditionally we would estimate sigma from the average or median moving range
of the first 25 data points and then calculate 3-sigma control limits. If any
points were out of control, we would remove them, re-estimate sigma and the
limits, and then use the control chart in process.
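
To make the traditional procedure concrete, here is a minimal Python sketch of
it. The function names, the simulated lognormal data and the simple "drop and
recompute" screening loop are my own assumptions for illustration, not a
reference implementation. The constants 1.128 and 0.954 are the usual bias
correction factors for the average and median moving range of span 2.

```python
import numpy as np

def xmr_limits(values, baseline=25, use_median=False):
    """3-sigma individuals-chart limits from the first `baseline` values."""
    x = np.asarray(values, dtype=float)[:baseline]
    mr = np.abs(np.diff(x))                    # moving ranges of span 2
    if use_median:
        sigma_hat = np.median(mr) / 0.954      # median moving range
    else:
        sigma_hat = mr.mean() / 1.128          # average moving range
    centre = x.mean()
    return centre - 3 * sigma_hat, centre, centre + 3 * sigma_hat, sigma_hat

def screened_limits(values, baseline=25, use_median=False):
    """Drop baseline points outside the limits, then re-estimate.

    Simplification (assumption): after a point is dropped, the moving
    range is taken across the gap it leaves.
    """
    x = np.asarray(values, dtype=float)[:baseline]
    while True:
        lcl, centre, ucl, sigma_hat = xmr_limits(x, len(x), use_median)
        keep = (x >= lcl) & (x <= ucl)
        if keep.all():
            return lcl, centre, ucl, sigma_hat
        x = x[keep]

if __name__ == "__main__":
    # Hypothetical right-skewed data standing in for the ~200 real values.
    rng = np.random.default_rng(42)
    data = rng.lognormal(mean=1.0, sigma=0.8, size=200)
    print("first 25, average MR:", xmr_limits(data))
    print("first 25, median MR: ", xmr_limits(data, use_median=True))
    print("first 25, screened:  ", screened_limits(data))
```
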
An alternative view is to estimate sigma from all the data and calculate the
limits. The resulting control chart shows the process is out of control. Or
is this a false warning? The next step is to apply a suitable Box-Cox
transformation (choosing lambda to achieve approximate normality),
recalculate sigma and redraw the charts. Now the process appears stable.
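
The comparison between the raw and the transformed chart can be sketched as
follows. This assumes SciPy is available and that all values are strictly
positive (a requirement of Box-Cox). `scipy.stats.boxcox` picks lambda by
maximum likelihood when none is supplied. The helper names and the simulated
data are hypothetical.

```python
import numpy as np
from scipy import stats

def individuals_limits(x):
    """3-sigma limits using the average moving range of all the data."""
    x = np.asarray(x, dtype=float)
    sigma_hat = np.abs(np.diff(x)).mean() / 1.128
    centre = x.mean()
    return centre - 3 * sigma_hat, centre + 3 * sigma_hat

def count_outside(x):
    """Number of points beyond the 3-sigma limits of their own chart."""
    x = np.asarray(x, dtype=float)
    lcl, ucl = individuals_limits(x)
    return int(np.sum((x < lcl) | (x > ucl)))

# Hypothetical right-skewed data standing in for the real measurements.
rng = np.random.default_rng(1)
raw = rng.lognormal(mean=1.0, sigma=0.8, size=200)

# With no lambda given, boxcox returns the transformed data and the
# maximum-likelihood estimate of lambda.
transformed, lam = stats.boxcox(raw)

print("estimated lambda:", lam)
print("skewness raw / transformed:", stats.skew(raw), stats.skew(transformed))
print("points outside limits, raw scale:        ", count_outside(raw))
print("points outside limits, transformed scale:", count_outside(transformed))
```

Note that the limits on the transformed scale are in transformed units, so any
signal has to be interpreted (or back-transformed) with that in mind.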
My question relates to how many sub-groups (values) in this case are
required to estimate sigma?
For x-bar / range charts it is commonly assumed to be at least 20 sub-groups
of size 5 or at least enough shifts / days of operation to reflect "normal"
process variation. For individuals and moving range charts the traditional
approach is 25 values and then proceed as above.
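
One hedged way to explore the baseline-size question is simulation. The sketch
below assumes a stable, normally distributed process with true sigma = 1 (an
assumption that does not match skewed data) and looks at how much the
moving-range estimate of sigma varies from baseline to baseline as the number
of values grows. The function names and the number of replicates are arbitrary
choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigma_from_mr(x):
    """Sigma estimate from the average moving range of span 2."""
    return np.abs(np.diff(x)).mean() / 1.128

def relative_spread(n, reps=2000):
    """Coefficient of variation of sigma-hat over `reps` simulated baselines."""
    estimates = np.array(
        [sigma_from_mr(rng.normal(size=n)) for _ in range(reps)]
    )
    return estimates.std() / estimates.mean()

if __name__ == "__main__":
    for n in (25, 50, 100, 200):
        print(n, "values: relative spread of sigma-hat =",
              round(relative_spread(n), 3))
```

The same loop could be rerun with a skewed generator in place of
`rng.normal` to see how sensitive the picture is to the distribution.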
Are there any formal rules? Indeed, does it really matter so long as we are
improving the process by taking action on out-of-control points? Or, do we
ignore skewness by taking a transformation and then assume everything is
Has any research been done in this area please?