On Thu, 8 May 2003, Emil M Friedman wrote:
> It makes perfect sense IF the scale of the X's is representative of how much
> they vary in "real life".
Except for Sod's law. One paper I have quoted for the past twenty years
is "Six statistical tales" by W.G.Hunter in The Statistician (V30 No 2,
June 1981). The first tale described a new PhD given data to analyse on
the performance of a chemical plant. He ran the regressions, ranked the
input variables in order of importance, and was asked to give a
presentation. Everything went well, until he reached the bottom item of
his final chart. He said, "And of course, the least important variable is
the amount of water present." The audience burst into laughter.
What everyone but he knew was that the amount of water present was far
from being the least important variable, because if ANY water were allowed
to enter this particular process, the plant would explode. For reasons of
safety therefore, the water level was controlled as near zero as possible
and not allowed to vary.
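The moral can be reproduced with simulated data. The following is a minimal sketch only: the variable names, effect sizes and noise levels are invented for illustration and are not taken from Hunter's paper. A "water" variable has a per-unit effect fifty times that of "temperature", but because it is held almost constant its standardized coefficient ranks it last.

```python
import numpy as np

# All numbers below are hypothetical, chosen only to illustrate the point.
rng = np.random.default_rng(0)
n = 500
temperature = rng.normal(0.0, 1.0, n)    # allowed to vary freely
water = rng.normal(0.0, 0.001, n)        # controlled as near constant as possible
y = 1.0 * temperature + 50.0 * water + rng.normal(0.0, 0.1, n)

X = np.column_stack([temperature, water])
Xc = X - X.mean(axis=0)                  # centre predictors
yc = y - y.mean()                        # centre response

# Ordinary least squares on the centred data (raw, per-unit coefficients).
raw_beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)

# Standardized coefficients: raw coefficient scaled by sd(x) / sd(y).
std_beta = raw_beta * X.std(axis=0) / y.std()

print("raw coefficients          (temperature, water):", raw_beta)
print("standardized coefficients (temperature, water):", std_beta)
```

Ranking by the standardized coefficients puts water at the bottom, even though the raw coefficient shows that a unit change in water matters far more. The ranking reflects how much each input happened to vary in the data, not what would happen if it were allowed to vary.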
You'll have to obtain the paper for the other five tales.
--------------------------------------------------------
Related to the original question is another paper in the same edition:
"Investigation of alternative regressions" by J.N.R.Jeffers. Jeffers
ascribes the method to Hawkins (Applied Statistics, 22, 275-86. 1973).
The method is to apply principal components analysis to the correlation
matrix formed from the dependent and all possible predictor variables,
then varimax rotation to simplify the structure. If one row remains with
a high loading on the dependent variable, that row will describe the
optimal prediction, and the weighting of each predictor will indicate its
"importance". Moreover, the method also indicates sub-regressions
within the predictors, and hence allows substitution of predictors that
are less convenient or more costly to measure. I am not aware of this
procedure being implemented or promoted in a software package, though
there are later papers apparently using the technique. As Jeffers wrote
in 1981, the method "deserves to be better known."
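The general shape of such an analysis can be sketched as follows. This is a rough illustration under stated assumptions, not an implementation of the procedure as published by Hawkins or Jeffers. The simulated data, the choice to rotate only the eigenvectors with the smallest eigenvalues, and the names `varimax` and `n_small` are all assumptions made here. The original papers should be consulted for which components are retained and how they are scaled.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonally rotate the columns of `loadings` by the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix of the standard SVD-based varimax iteration.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation

# Hypothetical data: x3 is nearly x1 + x2 (a "sub-regression" among the
# predictors) and y is nearly 2*x1 - x2.
rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(scale=0.05, size=n)
y = 2.0 * x1 - x2 + rng.normal(scale=0.05, size=n)

names = ["y", "x1", "x2", "x3"]
data = np.column_stack([y, x1, x2, x3])

# Correlation matrix of the dependent variable and all predictors together.
corr = np.corrcoef(data, rowvar=False)

# eigh returns eigenvalues in ascending order, so the first columns of
# `eigvecs` belong to the smallest eigenvalues (near-linear relations).
eigvals, eigvecs = np.linalg.eigh(corr)

n_small = 2                          # assumption: how many components to rotate
small_vecs = eigvecs[:, :n_small]
rotated = varimax(small_vecs)

print("eigenvalues:", np.round(eigvals, 4))
for j in range(n_small):
    print("rotated component", j, dict(zip(names, np.round(rotated[:, j], 3))))
```

Under these assumptions, a rotated component with a large weight on y would be read as a candidate regression, and one with a negligible weight on y as a relation among the predictors themselves. The weights apply to standardized variables, because the analysis starts from the correlation matrix.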
A contemporary reference is:
Title: Principal Components Regression With Data Chosen Components and
Related Methods
Author(s): J. T. Gene Hwang ; Dan Nettleton
Source: Technometrics Volume: 45 Number: 1 Page: 70 -- 79
Abstract: Multiple regression with correlated explanatory variables is
relevant to a broad range of problems in the physical, chemical, and
engineering sciences. Chemometricians in particular have made heavy use of
principal components regression and related procedures for predicting a
response variable from a large number of highly correlated variables. In
this article we develop a general theory for selecting principal
components that yield estimates of regression coefficients with low mean
squared error. ...
R. Allan Reese
Associate Manager GRI Direct voice: +44 1482 466845
Graduate School Voice messages: +44 1482 466844
Hull University, Hull HU6 7RX, UK. Fax: +44 1482 466436
====================================================================