Puliyel wrote:
> I wonder if anyone on the list can please help me with this
> bootstrapping problem. I have raw data as a sequence of failures and
> successes with treatment. There were 5 failures and 30 successes. The
> sequence of failures (F) and successes (S) was as follows:
> SSSSSSSSSSSSSSSSSSSFSSSSSFSSFSSFFSS
> For CUSUM calculations each success gets a score of +2/7 and each
> failure gets a score of -12/7. *Using bootstrapping techniques
> (reordering the sequence 1000 times) I want to calculate the 95%
> confidence limits for CUSUM.*
>
> I am not able to get the software I downloaded into Excel to provide
> the confidence limits for this data using CUSUM. I will appreciate
> help from anyone familiar with this tool.
This is a rather technical question and I've prepared a very technical
response. I apologize to those who don't like to see messy statistics in
action.
If you are going to make any progress on this project, you must part
ways with Microsoft Excel. Excel is fine for very simple statistics, but
running a sophisticated statistical analysis in Excel is like towing a
two-ton trailer with a Smart car. A lot has been written about this, and
I've added a few references at
http://www.pmean.com/category/StatisticalComputing.html
Note in particular the works by Heiser, Cryer, and Burns.
Your goal, I presume, is to show that the sequence of successes and
failures is not random, but rather that failures are coming more
frequently in the most recent data. There is no justification for using
Excel here. A bootstrap for the CUSUM plot could be constructed easily in
any statistical package, and one particularly good package, R, is free
and open source.
There are several ways to calculate a CUSUM chart, so without further
details, I can only offer a vague and general outline of how to do it
in R; a fuller sketch follows the steps.
1. Write a function, cusum, that takes a vector of zeros and ones and
computes a single summary measure.
cusum <- function(v) {...
2. Code your actual data as zeros and ones.
x <- c(0,0,...
3. Use the sample function to reorder the data randomly, calculate the
cusum function on the reordered data, and repeat one thousand times.
bo <- numeric(1000)
for (i in 1:1000) {bo[i] <- cusum(sample(x))}
4. Calculate a percentile of these values.
cutoff <- quantile(bo,probs=0.05)
5. Compare this cutoff to the value computed from the observed data.
stat <- cusum(x)
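
Putting the steps together, here is a minimal sketch. The scoring
(+2/7 per success, -12/7 per failure) comes from your question, but
the choice of summary measure, the maximum of the running cumulative
sum, is my assumption; swap in whatever definition your CUSUM software
uses.

# Score each trial (1 = success, 0 = failure) and summarize the
# running cumulative sum by its maximum (an assumed summary; replace
# with your actual CUSUM definition).
cusum <- function(v) {
  scores <- ifelse(v == 1, 2/7, -12/7)
  max(cumsum(scores))
}

# Your observed sequence, coded as ones and zeros.
x <- c(rep(1, 19), 0, rep(1, 5), 0, 1, 1, 0, 1, 1, 0, 0, 1, 1)

# Reorder the sequence 1000 times, recomputing the summary each time.
set.seed(1)
bo <- numeric(1000)
for (i in 1:1000) bo[i] <- cusum(sample(x))

# The 5th percentile, as in step 4 (use probs = c(0.025, 0.975) for
# two-sided 95% limits), and the observed value for comparison.
quantile(bo, probs = 0.05)
cusum(x)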
I can prepare the full R code for free if I can use your example on my
webpage. If you want this, just send me the details of how you
calculate a CUSUM chart.
The CUSUM is a nice approach for your type of data, but you should also
consider alternative statistics, such as the runs test. I have been
working on some ideas involving the continuous monitoring of Number
Needed to Harm to track adverse events in clinical trials. The approach
tracks the waiting time between successive failures. In your series:
SSSSS SSSSS SSSSS SSSSF SSSSS FSSFS SFFSS
you wait 20 trials before the first failure, 6 before the second
failure, 3 before the third failure, 3 before the fourth failure, and 1
before the fifth failure. That's a pretty convincing downward trend. The
ideas behind this are not fully developed, but you can look at some of
my work in this area at
http://www.pmean.com/category/AdverseEvents.html
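
For what it's worth, those waiting times take one line of R, reusing
the 0/1 coding of x from the sketch above (failures coded as 0). The
runs test line assumes you have the tseries package installed.

# Trials until each successive failure: difference the failure positions.
diff(c(0, which(x == 0)))
# [1] 20  6  3  3  1

# A runs test on the same sequence (requires the tseries package).
tseries::runs.test(factor(x))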
Best of luck.
--
Steve Simon, Standard Disclaimer
Free statistics webinar, Wed, Oct 14, 10am CDT.
"P-values, confidence intervals, and the Bayesian alternative"
Details at www.pmean.com/webinars