Do you mean that all of the variables are two-valued (dichotomous)?
Or do you mean that some of them are?
If the former, why not use something like SPSS's CLUSTER procedure, which
can handle many kinds of similarity coefficients for binary data?
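
To make "similarity coefficients" concrete, here is a minimal sketch in plain Python of two common coefficients for dichotomous data: simple matching and Jaccard. The function names and the two example cases are made up for illustration; this is not SPSS syntax or output.

```python
def simple_matching(a, b):
    """Share of variables on which two cases agree (joint 1s and joint 0s both count)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def jaccard(a, b):
    """Share of joint 1s among the variables where at least one case has a 1."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two all-zero cases have no presences to compare; treating them as
    # identical is a convention assumed here, not a universal rule.
    return both / either if either else 1.0


# Hypothetical cases scored on six yes/no items.
case_a = [1, 0, 0, 1, 0, 0]
case_b = [1, 0, 0, 0, 0, 0]

print(simple_matching(case_a, case_b))  # agree on 5 of 6 items
print(jaccard(case_a, case_b))          # 1 joint "yes" out of 2 items with any "yes"
```

The two coefficients differ because Jaccard ignores joint absences. Which one is appropriate depends on whether a shared 0 means the cases are alike in your data.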
If you have some multiple dichotomies to represent many-valued
(polychotomous) nominal variables and some continuous variables,
you might try SPSS's new two-step clustering.
How many cases do you want to cluster?
Are the variables something like test items?
When doing clustering, the usual practice is to use a variety of
coefficients and algorithms and to look for convergence among the results.
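
One way to put a number on "convergence" is to compare the cluster memberships from two runs with the Rand index, the share of case pairs that both solutions treat the same way. The sketch below is plain Python; the two membership lists are invented for illustration, not results from any real analysis.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of case pairs on which two clusterings agree.

    A pair counts as agreement if both solutions put the two cases in the
    same cluster, or both put them in different clusters.
    """
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total


# Hypothetical memberships for six cases from two different methods.
hierarchical = [1, 1, 1, 2, 2, 2]
kmeans = [2, 2, 1, 1, 1, 1]

print(rand_index(hierarchical, kmeans))
```

The cluster numbers themselves are arbitrary, so only co-membership is compared. The plain Rand index is not corrected for chance agreement; an adjusted version exists and is usually preferable for serious comparisons.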
Since clustering is an exploratory procedure, it wouldn't hurt to use
k-means as one of the algorithms you try. Interpretation may be a little
difficult.
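
On the Euclidean distance concern: for 0/1 variables the squared Euclidean distance between two cases is simply the number of variables on which they disagree, so it is not meaningless for binary data. The interpretation problem is more that a k-means cluster center is a vector of proportions rather than a 0/1 profile. A small sketch in plain Python, with a made-up cluster of four cases:

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mismatches(a, b):
    """Number of variables on which two cases differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def centroid(cases):
    """Per-variable mean of a cluster, as k-means would compute its center."""
    n = len(cases)
    return [sum(column) / n for column in zip(*cases)]


# Hypothetical cluster of four cases on four yes/no variables.
cluster = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]

# For 0/1 data each squared difference is 0 or 1, so the two counts coincide.
print(squared_euclidean(cluster[0], cluster[1]), mismatches(cluster[0], cluster[1]))

# The center holds the proportion of 1s on each variable.
print(centroid(cluster))
```

A center such as 0.25 on a variable reads as "a quarter of the cases in this cluster scored 1", which is a description of the cluster rather than of any typical case.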
Depending on the meaning of your variables and cases, you may want to
use a data reduction technique such as principal components or another
form of factor analysis first. The resulting component or factor scores
are continuous variables, which suit k-means more naturally.
Hope this helps.
Art
Social Research Consultants
University Park, MD USA
jennygeorge wrote:
> Hi,
>
> Could someone advise me please as to whether it is acceptable to use K means clustering
> on a dataset that contains binary variables. I did not think this was
> appropriate, as the distance measure is the Euclidean distance measure
> for continuous data, but I have recently seen examples of this.
>
> Thanks
>
> Jenny
>