Excellent food for thought. I am happy you are still on the list, Paul.
Have a nice weekend, all of you (and enjoy your Easter holidays).
Jason
Department of Social and Political Sciences
University of Cyprus
----- Original Message Follows -----
From: Paul Barrett
Subject: Stepwise regression? Complete junk.
Date: Thu, 5 Apr 2012 10:18:00 +1200
> For those who use regression analysis in their work and studies ...
>
> My attention was drawn to a very fine article:
>
> Armstrong, J.S., Graefe, A. (2010) Predicting elections from biographical
> information about candidates. Presented at the Symposium on Leadership and
> Individual Differences, Lausanne, Switzerland, November 30 - December 1,
> 2009, 1-20.
> (http://marketing.wharton.upenn.edu/documents/research/PollyBio58.pdf)
>
> Abstract
>
> Traditional election forecasting models are estimated from time-series data
> on relevant variables and that limits the type and number of variables that
> can be used. Index models do not suffer from the same restrictions. We used
> as many as 60 biographical variables to create an index model for
> forecasting U.S. Presidential Elections. For each candidate, we simply
> counted the number of variables for which the candidate was rated favorably.
> The index model forecast was that candidate A would win the popular vote if
> he had a higher index score than candidate B. We used simple linear
> regression to estimate a relationship between the index score of the
> candidate of the incumbent party and his share of the popular vote. We
> tested the model for the 29 U.S. presidential elections from 1896 to 2008.
> The model’s forecasts, calculated by cross-validation, correctly predicted
> the popular vote winner for 27 of the 29 elections and were more accurate
> than those from polls (15 out of 19), prediction markets (22 out of 26), and
> three regression models (12 to 13 out of 15 to 16). Out-of-sample forecasts
> of the two-party popular vote shares were more accurate for the last four
> elections from 1996 to 2008 than those from seven prominent regression
> models. By relying on different information and including more variables
> than traditional models, the biographical index model can improve the
> accuracy of long-term election forecasting. In addition, it can help parties
> to select the candidates running for office.
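
A minimal Python sketch of the index approach the abstract describes may help here. It is an illustration under stated assumptions, not Armstrong and Graefe's implementation: the 0/1 coding of "rated favorably", the function names, and the use of NumPy are all hypothetical, and the ratings in the example are made up.

```python
import numpy as np

def index_score(ratings):
    """Count the variables on which a candidate is rated favorably (1 = favorable, 0 = not)."""
    return int(np.sum(ratings))

def predict_winner(ratings_a, ratings_b):
    """Forecast the candidate with the higher index score; equal scores give no call."""
    a, b = index_score(ratings_a), index_score(ratings_b)
    if a == b:
        return "tie"
    return "A" if a > b else "B"

def fit_vote_share(index_scores, vote_shares):
    """Simple linear regression of vote share on index score; returns (intercept, slope)."""
    slope, intercept = np.polyfit(index_scores, vote_shares, 1)
    return float(intercept), float(slope)

# Made-up ratings, not data from the paper:
# candidate A is favorable on 3 variables, candidate B on 2.
print(predict_winner([1, 1, 0, 1], [1, 0, 0, 1]))
```

Every variable gets the same weight of one, so nothing is estimated from the data until the single-predictor regression at the end.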
>
> I think the statement from Armstrong and Graefe (p. 14) identifies the problem
> for many: “Summarizing evidence from the literature, Hogarth (2006) showed
> that people exhibit a resistance to simple solutions. Although there is
> evidence that simple models can outperform more complicated ones, there is a
> belief that complex methods are necessary to solve complex problems.”
>
> This backed up my own work, and that of Marnie Rice and colleagues in
> forensic risk assessment (see
> http://www.pbarrett.net/stratpapers/critical_outcome_tests.pdf for details
> of each).
>
> I dug a little deeper and found ...
>
> This is also the message in the last paragraph of: Goldstein, D.G., &
> Gigerenzer, G. (2009) Fast and frugal forecasting. International Journal of
> Forecasting, 25, 760-772.
>
> “The danger is that complex methods become an end in themselves, a ritual to
> impress others, and at the same time opportunities to learn how to do things
> better are missed. Learning requires some form of transparency, which
> forecasters can best achieve when they understand what they are doing.”
>
> There is also a suite of relevant papers “in press” in the International
> Journal of Forecasting; the short four-page article by Keith Ord is
> “illuminating” on stepwise methods!
>
> Soyer, E., Hogarth, R.M. (2012) The illusion of predictability: How
> regression statistics mislead experts. International Journal of Forecasting
> (In Press), 1-39.
>
> (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1996568)
>
> Armstrong, J.S. (2012) Illusions in regression analysis. International
> Journal of Forecasting (In Press), 1-10.
>
> (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1969740)
>
> Ord, K. (2012) The illusion of predictability: A call to action.
> International Journal of Forecasting (In Press), 1-4.
>
> (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2016195)
>
> Look at the results from stepwise methods on random data (from Ord):
>
> “101 sets of 30 random numbers were generated from a normal distribution
> with mean 0 and variance 1. One variable was treated as the “dependent
> variable” and the other 100 as the predictor variables. The results of
> various regression analyses are summarized in the table.”
>
> [Table from Ord (2012), sent as the attached image001.png; not reproduced here.]
>
> “The conclusions are obvious enough: dredge deeply enough and you will come
> up with some amazing results. What should be done to improve the situation?
> ..”
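
The setup Ord describes is easy to try for oneself. The following is a rough Python sketch, assuming NumPy, a simple forward-selection rule (add whichever predictor raises R² the most), and an arbitrary stop after five steps. The selection rule, the seed, and the function names are assumptions for illustration; they are not Ord's procedure and will not reproduce his table.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    total = np.sum((y - y.mean()) ** 2)
    return float(1.0 - (resid @ resid) / total)

def forward_stepwise(X, y, n_steps):
    """Greedy forward selection: at each step add the predictor that raises R^2 the most."""
    selected, history = [], []
    for _ in range(n_steps):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = [r_squared(X[:, selected + [j]], y) for j in candidates]
        best = int(np.argmax(scores))
        selected.append(candidates[best])
        history.append(scores[best])
    return selected, history

# 101 columns of 30 independent N(0, 1) draws: column 0 plays the "dependent
# variable", the other 100 are predictors with no real relationship to it.
rng = np.random.default_rng(0)
data = rng.standard_normal((30, 101))
y, X = data[:, 0], data[:, 1:]

selected, history = forward_stepwise(X, y, n_steps=5)
for step, (j, r2) in enumerate(zip(selected, history), start=1):
    print(f"step {step}: added predictor {j}, in-sample R^2 = {r2:.3f}")
```

Because every predictor is pure noise, any in-sample R² reported here is produced by the search itself; refitting the chosen predictors on a fresh random sample is the quickest way to see it vanish.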
>
> There is a lot of really sobering information here for those who teach
> methods classes ...
>
> I now reject outright all papers that use stepwise methods, unless very,
> very serious attempts at robust cross-validation are made.
>
> Of more concern is the use of regression methods per se, and especially the
> confusion of confidence intervals with prediction intervals.
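
To make the distinction concrete, here is a small Python sketch for simple linear regression, using the standard textbook formulas. The confidence interval covers the mean response at x0; the prediction interval covers a single new observation at x0 and carries an extra "1 +" term under the square root. The function name, the synthetic data, and passing the t critical value in by hand (2.048 for 28 degrees of freedom at the 95% level) are assumptions made for illustration.

```python
import numpy as np

def interval_half_widths(x, y, x0, t_crit):
    """Return (fitted value, CI half-width, PI half-width) at x0 for simple linear regression."""
    n = len(x)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    s = np.sqrt(resid @ resid / (n - 2))           # residual standard error
    sxx = np.sum((x - x.mean()) ** 2)
    leverage = 1.0 / n + (x0 - x.mean()) ** 2 / sxx
    ci = t_crit * s * np.sqrt(leverage)            # uncertainty in the mean response
    pi = t_crit * s * np.sqrt(1.0 + leverage)      # uncertainty in one new observation
    return float(intercept + slope * x0), float(ci), float(pi)

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 + 0.5 * x + rng.standard_normal(30)

fit, ci, pi = interval_half_widths(x, y, x0=5.0, t_crit=2.048)
print(f"fitted value at x0: {fit:.2f}")
print(f"95% confidence interval half-width: {ci:.2f}")
print(f"95% prediction interval half-width: {pi:.2f}")
```

The confidence interval shrinks toward zero as the sample grows, while the prediction interval never falls below t_crit times the residual standard error, so quoting the former as the uncertainty of a forecast overstates its precision.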
>
> Regards, Paul
>
> Advanced Projects R&D Ltd.
>
> W: http://www.pbarrett.net/
> M: +64-(0)21-415625
>
> [Attachment: image001.png]