I agree with the prior two responders, but would also add:

In many applications the analyst is deciding what should and should
not be included in a model that will be used on future data.  If that
is the case, then effect coding, as the other responders pointed out,
measures each category against the overall mean of the current sample,
so its parameters may not be applicable to the next data set.

On the other hand, reference coding could also be unreliable if the
reference category is of small sample size so that its mean varies a
lot from sample to sample.  Therefore, it would be logical to pick a
large category for the base, but that brings up the next point.
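
To put a rough number on that, here is a minimal sketch.  It assumes
every category shares the same standard deviation sigma, which is my
simplification and not something stated above.  The function name and
the sample sizes are made up for illustration.

```python
import math

def se_of_difference(sigma, n_category, n_base):
    """Standard error of (category mean - base mean), assuming a
    common standard deviation sigma and independent samples."""
    return sigma * math.sqrt(1.0 / n_category + 1.0 / n_base)

# Hypothetical sizes: a category of 200 compared against a tiny base
# of 5 and against a large base of 500.
se_small_base = se_of_difference(1.0, 200, 5)
se_large_base = se_of_difference(1.0, 200, 500)

print(f"small base: {se_small_base:.3f}")
print(f"large base: {se_large_base:.3f}")
```

With a small base, the 1/n_base term dominates every contrast, which
is why each coefficient inherits the noise of the base category.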

The two systems will flag different variables as significant, since
they test differences against different bases.  So if statistical
significance is the only criterion for including or excluding
variables, then the choice of base becomes the arbitrary determining
factor for some of the differences between the final models.
Obviously this will affect the final results.
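
A small sketch of why the base matters, using made-up numbers.  For a
model with a single categorical predictor, reference coding estimates
each category mean minus the base category mean, and effect coding
estimates each category mean minus the unweighted mean of the category
means, so I compute those contrasts directly rather than fit a
regression.  The data and function names are hypothetical.

```python
from statistics import mean

# Made-up illustrative data: response values for three categories.
groups = {
    "A": [10, 12, 14],
    "B": [12, 13, 14],
    "C": [19, 20, 21],
}

def reference_contrasts(groups, base):
    """Each category mean minus the mean of the chosen base category."""
    means = {g: mean(v) for g, v in groups.items()}
    return {g: means[g] - means[base] for g in means if g != base}

def effect_contrasts(groups):
    """Each category mean minus the unweighted mean of category means."""
    means = {g: mean(v) for g, v in groups.items()}
    grand = mean(means.values())
    return {g: m - grand for g, m in means.items()}

print("base A:", reference_contrasts(groups, "A"))
print("base C:", reference_contrasts(groups, "C"))
print("effect:", effect_contrasts(groups))
```

Category B sits 1 unit from base A, 7 units from base C, and 2 units
from the overall mean, so whether B looks "different" depends entirely
on what it is compared with.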

Just my two cents.

-- 
Best regards,

David Young
Marketing and Statistical Consultant
Madrid, Spain
+34 913 540 381
http://www.linkedin.com/in/europedavidyoung