I have two large data sets (2400 observations each). My client wants me to
compare them, ideally showing that the underlying distributions from which
they are drawn are similar, or even the same.
I am aware of bioequivalence procedures, which could be used to try to
establish that the distributions are similar. However, as I see it, the
sample sizes are so large that they aren't really samples, and any formal
testing will flag differences that are of no practical importance.
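The concern can be made concrete with a rough power calculation. The sketch below is only an illustration under simplifying assumptions: it treats the comparison as a two-sided two-sample z-test on means with equal, known variance, which is much narrower than comparing whole distributions. The helper name `two_sample_power` and the effect sizes tried are hypothetical choices, not part of any procedure mentioned above.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(d, n, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    d     : true mean difference in standard-deviation units (assumed)
    n     : observations per group
    alpha : significance level
    """
    z = NormalDist()                     # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    shift = d * sqrt(n / 2)              # mean of the test statistic when the true difference is d
    # Probability that the statistic lands in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Illustrative effect sizes only; n = 2400 matches the data sets described.
for d in (0.02, 0.05, 0.10):
    print(f"d = {d:.2f}: power = {two_sample_power(d, 2400):.2f}")
```

With 2400 observations per group, a shift of about a tenth of a standard deviation would be rejected most of the time under these assumptions, which is the sense in which a formal test may flag differences too small to matter.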
Can anyone point me to places in the literature where this problem of
trying to carry out statistical procedures with very large samples is
discussed?
David Scott
_________________________________________________________________
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Graduate Officer, Department of Statistics
Webmaster, New Zealand Statistical Association:
http://www.stat.auckland.ac.nz/nzsa/