Dear all,
I have a question concerning data reduction in an extremely large data set.
The data set has more than 7,000 variables (categorical variables are already
dummy-coded) and ~30,000 observations. We would like to reduce the dimension
of the data and therefore need to find the ~200 key variables that explain
the largest part of the variance in the data.
We have already tried factor analysis in SPSS and Clementine, but there are
too many variables and the programs crash.
Does anybody know of
- a high-performance statistical software package that can do factor analysis
for such a huge data set, or
- a way to reduce the dimension of the data without coding the
categorical variables as indicators?
Thank you very much in advance!
Kind regards,
Ursula