I am building confidence intervals for regression coefficients using methods
similar to the bootstrap. The data set has 100 observations. I create 500 new
samples based on the original dataset. I am wondering what would be the best
method to produce the new samples. Here are the methods I am thinking of, to
create one new sample:
Method #1
delete 50 obs randomly selected
duplicate 50 obs randomly selected
(I suspect it will produce CIs that are too narrow)
Method #2
Perform 100 swaps as follows: swap y_i with y_j (dependent variable),
where i and j are randomly selected with the constraint that
-3 < rank(y_i) - rank(y_j) < 3
Method #3
Perform a transformation of the y_i that preserves ranks (e.g. y'_i = y_i +
random deviate, where random deviate is small enough to preserve ranks)
Method #4
take a subsample of size 50 (so delete 50 observations)
(I suspect it will produce confidence intervals that are too wide)
Method #5
take a subsample of size 99 (so delete one observation)
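To make the five descriptions above concrete, here is a minimal Python sketch of how one new sample might be generated under each method. The function names are hypothetical, and several details are my own assumptions rather than part of the definitions: in Method #1 the duplicates are drawn without replacement from the observations that survive deletion; in Method #2 the ranks are updated after every swap and there are no ties; in Method #3 the deviates are uniform and bounded by 49% of the smallest gap between sorted y values. The y values at the bottom are placeholder data.

```python
import random

def method1_indices(n, k, rng):
    # Method #1: delete k observations, then duplicate k of the survivors.
    # Assumption: duplicates are drawn without replacement from the kept rows.
    kept = rng.sample(range(n), n - k)
    return kept + rng.sample(kept, k)

def subsample_indices(n, m, rng):
    # Methods #4 (m = 50) and #5 (m = 99): keep m rows without replacement.
    return rng.sample(range(n), m)

def method2_swap(y, n_swaps, rng):
    # Method #2: swap y values whose ranks differ by 1 or 2.
    # Assumptions: no ties, len(y) >= 2, ranks are updated after every swap.
    y = list(y)
    order = sorted(range(len(y)), key=lambda i: y[i])  # order[r] = row holding rank r
    for _ in range(n_swaps):
        r = rng.randrange(len(y))
        candidates = [s for s in range(r - 2, r + 3) if s != r and 0 <= s < len(y)]
        s = rng.choice(candidates)
        i, j = order[r], order[s]
        y[i], y[j] = y[j], y[i]          # exchange the two responses
        order[r], order[s] = j, i        # keep the rank lookup in sync
    return y

def method3_jitter(y, rng):
    # Method #3: add deviates small enough that the ordering of y cannot change.
    # Assumption: uniform deviates bounded by 49% of the smallest gap (no ties).
    s = sorted(y)
    min_gap = min(b - a for a, b in zip(s, s[1:]))
    h = 0.49 * min_gap
    return [v + rng.uniform(-h, h) for v in y]

rng = random.Random(0)
n = 100
y = [rng.gauss(0, 1) for _ in range(n)]   # placeholder responses

idx1 = method1_indices(n, 50, rng)        # 100 row indices, 50 distinct
idx4 = subsample_indices(n, 50, rng)      # 50 row indices
idx5 = subsample_indices(n, 99, rng)      # 99 row indices
y2 = method2_swap(y, 100, rng)            # same values, locally shuffled
y3 = method3_jitter(y, rng)               # perturbed values, same ranks
```

Methods #1, #4 and #5 return row indices to apply to both x and y, while Methods #2 and #3 keep every row and only alter the y column.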
Which one would you recommend? Which one would produce CI similar to those
produced with the traditional Gaussian model?
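One way to check this empirically would be to compute both intervals on the same data and compare them. The sketch below is only a rough harness under stated assumptions: a single predictor, a normal quantile standing in for the t quantile (98 degrees of freedom with 100 observations), a simple percentile interval over 500 replicates, and simulated placeholder data. The names `ols_slope`, `gaussian_ci` and `percentile_ci` are hypothetical, and the harness covers only methods that resample rows (#1, #4, #5).

```python
import random
from statistics import NormalDist, mean

def ols_slope(x, y):
    # Least-squares slope for a single predictor.
    mx, my = mean(x), mean(y)
    sxx = sum((u - mx) ** 2 for u in x)
    return sum((u - mx) * (v - my) for u, v in zip(x, y)) / sxx

def gaussian_ci(x, y, level=0.95):
    # Classical interval: slope +/- z * standard error.
    # Assumption: normal quantile used in place of the t quantile.
    n = len(x)
    mx, my = mean(x), mean(y)
    b = ols_slope(x, y)
    a = my - b * mx
    sxx = sum((u - mx) ** 2 for u in x)
    rss = sum((v - a - b * u) ** 2 for u, v in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return b - z * se, b + z * se

def percentile_ci(x, y, draw_indices, n_rep=500, level=0.95, seed=0):
    # Percentile interval of the slope over n_rep resampled data sets.
    # draw_indices(n, rng) returns the row indices of one new sample.
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_rep):
        idx = draw_indices(len(x), rng)
        slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    slopes.sort()
    lo = slopes[int((1 - level) / 2 * (n_rep - 1))]
    hi = slopes[int((1 + level) / 2 * (n_rep - 1))]
    return lo, hi

# Placeholder data: 100 observations from a known linear model.
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [2.0 + 0.5 * u + rng.gauss(0, 1) for u in x]

classical = gaussian_ci(x, y)
# Method #4 as an example: subsample of size n // 2 without replacement.
half_sample = percentile_ci(x, y, lambda n, r: r.sample(range(n), n // 2))
```

Comparing the widths of `classical` and `half_sample` (and of the other row-resampling methods passed in as `draw_indices`) over many simulated data sets would show which scheme comes closest to the Gaussian interval.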
--
Vincent Granville, Ph.D.
Strategy Architect & Founder
Data Shaping Solutions, LLC
http://www.datashaping.com