I have 47 observations and nearly the same number of explanatory
variables. As a first step, I want to model the response Y in SAS using
both forward and backward selection. Forward selection finds one
"significant" explanatory variable and quits. Backward selection keeps
33 "significant" variables, seemingly finding importance in every nuance
of the data set. My initial attempts at narrowing down these significant
variables (looking at p-values and VIFs) have so far been unsuccessful.
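
For concreteness, the two runs described above might look something like
the sketch below. The data set name work.mydata and the predictor names
x1-x45 are placeholders, and the SLENTRY/SLSTAY values are illustrative
assumptions rather than the settings actually used.

```sas
/* Hypothetical names: data set WORK.MYDATA, response Y, predictors X1-X45 */
proc reg data=work.mydata;
   /* Forward selection: keep adding the best candidate while its
      entry p-value is below SLENTRY (0.05 is an assumed value) */
   forward:  model y = x1-x45 / selection=forward slentry=0.05;

   /* Backward elimination: start from the full model and drop the
      weakest variable while its p-value exceeds SLSTAY (assumed 0.05) */
   backward: model y = x1-x45 / selection=backward slstay=0.05;
run;
quit;
```

With 47 observations and about 45 predictors, the full model that
backward elimination starts from has almost no residual degrees of
freedom, which may be part of why the two directions disagree so sharply.
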
This is not an important model and will not be put to use, yet at the
same time I want to find some reasonably explanatory variables out of
the 45 or so. Note that the variables cannot be judged differently, so
subjective (using non-statistical reasoning) modeling does not apply to
the usual extent here. I figured an automated selection would help get
me started, but I am not sure where to go from here, given my time
constraints.
Ideally, I would like to end up with a model with 5-10 explanatory variables.