Some of the variables selected this way will be spurious. I strongly
urge you to use the bootstrap to assess which variables are genuine or,
for that matter, to use a decision tree approach in the first place.
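
One way to read the bootstrap suggestion is as a check on selection stability: resample the rows with replacement, rerun the selection rule on each resample, and record how often each variable is chosen. The sketch below is only an illustration under stated assumptions. The data are simulated, the names (slope_p_value, select, bootstrap_selection_frequency) are hypothetical, the selection rule is a simple univariable p <= 0.20 screen, and the p-value uses a large-sample normal approximation so that only the Python standard library is needed.

```python
import math
import random
from collections import Counter


def slope_p_value(x, y):
    """Two-sided p-value for the slope of y regressed on x alone.

    Assumption: normal approximation to the t statistic (large n only).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 1.0  # a constant column carries no evidence of association
    r = sxy / math.sqrt(sxx * syy)
    if 1.0 - r * r <= 0.0:
        return 0.0  # perfect linear relationship
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


def select(columns, y, threshold=0.20):
    """Hypothetical selection rule: keep variables with p <= threshold."""
    return [name for name, x in columns.items()
            if slope_p_value(x, y) <= threshold]


def bootstrap_selection_frequency(columns, y, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each variable is selected."""
    rng = random.Random(seed)
    n = len(y)
    counts = Counter()
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # rows, with replacement
        boot_cols = {name: [x[i] for i in idx] for name, x in columns.items()}
        boot_y = [y[i] for i in idx]
        counts.update(select(boot_cols, boot_y))
    return {name: counts[name] / n_boot for name in columns}


# Simulated data: one variable truly related to y, four pure-noise variables.
data_rng = random.Random(42)
n_obs = 300
columns = {"signal": [data_rng.gauss(0, 1) for _ in range(n_obs)]}
for k in range(4):
    columns[f"noise{k}"] = [data_rng.gauss(0, 1) for _ in range(n_obs)]
y = [2.0 * s + data_rng.gauss(0, 1) for s in columns["signal"]]

freq = bootstrap_selection_frequency(columns, y)
for name, f in freq.items():
    print(f"{name}: selected in {f:.0%} of resamples")
```

A variable that is selected in nearly every resample is a more credible candidate than one that comes and goes. Where to draw the line is a judgment call that this sketch does not settle.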
-------- Original Message --------
Subject: [SPAM] Variable selection in multiple regression
From: Noori Akhtar-Danesh <[log in to unmask]>
Date: Mon, June 14, 2010 1:20 pm
To: [log in to unmask]
Dear List members,
I would appreciate it if I could be directed to a reference (journal
article or book) for the following approach to variable selection in
multiple regression. This approach seems to be quite common in some areas
of research, but I have not seen any reference for it.
Approach: When there are many explanatory variables, first each
explanatory variable is regressed individually against the outcome
(dependent) variable. Then each variable whose p-value is, say,
<= 0.20 is chosen for inclusion in a multiple regression model (this
stage may be called the screening phase). Next, these selected variables
are used in a multiple regression approach to come up with a final model
(the multiple regression can be, I guess, conducted using a backward,
forward, or stepwise approach).
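
The screening phase described above can be sketched as follows. This is a minimal illustration on simulated data, not a reference implementation. The function names (univariable_p_value, screen) are hypothetical, and the p-value uses a large-sample normal approximation to the t statistic so that only the Python standard library is needed; a statistics package would use the t distribution instead.

```python
import math
import random


def univariable_p_value(x, y):
    """Two-sided p-value for the slope when y is regressed on x alone.

    Assumption: the t statistic is referred to a standard normal
    distribution, which is reasonable only for large samples.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 1.0  # a constant column carries no evidence of association
    r = sxy / math.sqrt(sxx * syy)
    if 1.0 - r * r <= 0.0:
        return 0.0  # perfect linear relationship
    # The slope t statistic can be written in terms of the correlation r.
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


def screen(columns, y, threshold=0.20):
    """Keep each variable whose univariable p-value is <= threshold."""
    return [name for name, x in columns.items()
            if univariable_p_value(x, y) <= threshold]


# Simulated data: one variable truly related to y, nine pure-noise variables.
rng = random.Random(1)
n_obs = 500
columns = {"signal": [rng.gauss(0, 1) for _ in range(n_obs)]}
for k in range(9):
    columns[f"noise{k}"] = [rng.gauss(0, 1) for _ in range(n_obs)]
y = [2.0 * s + rng.gauss(0, 1) for s in columns["signal"]]

selected = screen(columns, y)
print("Passed screening:", selected)
```

With a threshold of 0.20, each pure-noise variable has roughly a one-in-five chance of passing the screen, so some unrelated variables should be expected to reach the multiple regression stage.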
Many thanks in advance,
Noori
============================
Noori Akhtar-Danesh, PhD
Faculty of Health Sciences,
McMaster University,
1200 Main St. West, Room 3N28B
Hamilton, ON L8N 3Z5,CANADA
Tel: 905-525-9140 Ext. 22297 & 22725
Fax: 905-521-8834
http://www.fhs.mcmaster.ca/ceb/faculty_member_akhtar-danesh.htm
=============================
You may leave the list at any time by sending the command
SIGNOFF allstat
to [log in to unmask], leaving the subject line blank.