Hi,
Can anyone give me some insight into how binary data is treated in the
k-means algorithm? When I apply the algorithm to a dataset that includes a
binary variable, every cluster contains observations from only one category
of that variable; no cluster has a mix of observations from both
categories.
I could exclude this variable from the analysis and subsequently analyze the
distribution of the (binary) variable within each cluster.
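To make that workaround concrete, here is a rough sketch of what I have in
mind. It is only an illustration under my own assumptions: the data is made
up, the k-means is a bare-bones Lloyd's iteration in plain Python rather than
any particular package, and the names kmeans and binary_share_per_cluster are
hypothetical helpers, not functions from a library.

```python
import random
from collections import Counter

def kmeans(points, k, iters=50, seed=0):
    """Bare-bones Lloyd's k-means on numeric tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen observations
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def binary_share_per_cluster(labels, flag):
    """Fraction of observations with flag == 1 in each cluster."""
    totals, ones = Counter(labels), Counter()
    for l, f in zip(labels, flag):
        ones[l] += f
    return {c: ones[c] / totals[c] for c in totals}

# Made-up data: two continuous inputs, plus a binary flag that is held out
# of the clustering and only examined afterwards.
rng = random.Random(1)
continuous = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)] + \
             [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(50)]
flag = [rng.randint(0, 1) for _ in continuous]

labels = kmeans(continuous, k=2)
print(binary_share_per_cluster(labels, flag))
```

Because the flag never enters the distance calculation, it cannot force the
clusters apart, and the printed shares show how it is distributed within each
cluster.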
Is there an alternative method for dealing with binary inputs in k-means
clustering?
Thanks in advance for any response!
Regards,
LG.