Hi,
I'd appreciate your feedback on the following:
What would be a 'good' method for collapsing the number of levels of
categorical input variables with 'many' levels (say > 30) in a predictive
modeling context? I'm mostly referring to the situation where one does not
have prior knowledge to create those groupings, and therefore must rely on
automated methods.
I've been using decision trees so far, with selection based on
cross-validation. However, I'm wondering whether there are alternative
(better) methods to handle this.
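To make the tree-based approach concrete, here is a minimal sketch in plain
Python of the kind of procedure I mean. It is only an illustration under
assumptions: a single categorical input, a numeric (or 0/1) target, levels
ordered by their mean target and then split greedily as a regression tree
would, and the number of groups picked by k-fold cross-validation on squared
error. The names collapse_levels, cv_error and choose_n_groups are made up for
this example and do not come from any package.

```python
import random
from collections import defaultdict

def _cost(seg):
    # Weighted sum of squares of the level means around the segment mean.
    # seg is a list of (level, count, target_sum) tuples.
    n = sum(c for _, c, _ in seg)
    s = sum(t for _, _, t in seg)
    return sum(t * t / c for _, c, t in seg) - s * s / n

def collapse_levels(levels, y, n_groups):
    """Map each level to a group id using greedy binary splits
    on the levels ordered by their mean target value."""
    stats = defaultdict(lambda: [0, 0.0])
    for lv, t in zip(levels, y):
        stats[lv][0] += 1
        stats[lv][1] += t
    ordered = sorted(((lv, c, s) for lv, (c, s) in stats.items()),
                     key=lambda r: r[2] / r[1])
    segments = [ordered]
    while len(segments) < n_groups:
        best = None  # (gain, segment index, cut position)
        for i, seg in enumerate(segments):
            for cut in range(1, len(seg)):
                gain = _cost(seg) - _cost(seg[:cut]) - _cost(seg[cut:])
                if best is None or gain > best[0]:
                    best = (gain, i, cut)
        if best is None:  # every segment is already a single level
            break
        _, i, cut = best
        segments[i:i + 1] = [segments[i][:cut], segments[i][cut:]]
    return {lv: g for g, seg in enumerate(segments) for lv, _, _ in seg}

def cv_error(levels, y, n_groups, k=5, seed=0):
    """Mean squared error over k folds when predicting with group means."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    total = 0.0
    for fold in range(k):
        test = set(idx[fold::k])
        train = [i for i in idx if i not in test]
        tr_lv = [levels[i] for i in train]
        tr_y = [y[i] for i in train]
        mapping = collapse_levels(tr_lv, tr_y, n_groups)
        sums = defaultdict(lambda: [0, 0.0])
        for lv, t in zip(tr_lv, tr_y):
            g = sums[mapping[lv]]
            g[0] += 1
            g[1] += t
        overall = sum(tr_y) / len(tr_y)
        for i in test:
            g = mapping.get(levels[i])
            # Levels unseen in the training folds fall back to the overall mean.
            pred = sums[g][1] / sums[g][0] if g is not None else overall
            total += (y[i] - pred) ** 2
    return total / len(y)

def choose_n_groups(levels, y, candidates, k=5, seed=0):
    """Return the candidate number of groups with the lowest CV error."""
    return min(candidates, key=lambda g: cv_error(levels, y, g, k, seed))
```

One would call choose_n_groups(levels, y, range(1, 11)) to pick the number of
groups and then collapse_levels(levels, y, that_number) on the full data to
get the final level-to-group mapping.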
Thanks in advance,
Lars.