All the regression theory developed by statisticians over the last 200 years (related to the general linear model) is useless. Regression can be performed just as accurately without statistical models, including the computation of confidence intervals (for estimates, predicted values, or regression parameters). The non-statistical approach is also more robust than the one described in all statistics textbooks and taught in all statistics courses. It does not require Map-Reduce when the data is really big, nor any matrix inversion, maximum likelihood estimation, or mathematical optimization (Newton's algorithm). It is incredibly simple, robust, easy to interpret, and easy to code (no statistical libraries required).
This article has the following sections:
1. My solution applied to beautiful data
2. My solution applied to ugly data
3. Practical example (and metrics to compare performance)
4. Program to simulate realistic data used in this article
5. Next steps
It also comes with a spreadsheet with detailed computations, including linear regression done with Excel.
Read at http://goo.gl/mnDAER