Dear List,
I'm trying to reproduce the SAS VARCLUS procedure for clustering variables
in R, and I have come across a situation on which I would appreciate your
thoughts.
The SAS 9.2 VARCLUS PROCEDURE documentation (Chapter 93) states that:
step 1) A cluster is chosen for splitting. Depending on the options
specified, the selected cluster has either the smallest percentage of
variation explained by its cluster component (using the PROPORTION= option)
or the largest eigenvalue associated with the second principal component
(using the MAXEIGEN= option).
step 2) The chosen cluster is split into two clusters by finding the first
two principal components, performing an orthoblique rotation (raw quartimax
rotation on the eigenvectors; Harris and Kaiser 1964), and assigning each
variable to the rotated component with which it has the higher squared
correlation.
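
As I read step 1, the selection rule amounts to the small sketch below. It is
written in Python only for brevity and is not SAS's implementation. The
function name, the dictionary layout and the example eigenvalues are all made
up for illustration, and I assume each cluster is summarised by the
eigenvalues of its correlation matrix, so that the cluster component explains
a share of (largest eigenvalue) / (number of variables). The thresholds that
the PROPORTION= and MAXEIGEN= options also take are left out.

```python
def choose_cluster_to_split(clusters, criterion="proportion"):
    """Pick the cluster to split (my reading of step 1, not SAS code).

    clusters: dict mapping a cluster name to the eigenvalues of that
    cluster's correlation matrix, sorted in descending order.
    """
    def proportion_explained(eigs):
        # Share of variation explained by the first principal component.
        return eigs[0] / sum(eigs)

    def second_eigenvalue(eigs):
        # A one-variable cluster has no second component.
        return eigs[1] if len(eigs) > 1 else 0.0

    if criterion == "proportion":
        return min(clusters, key=lambda name: proportion_explained(clusters[name]))
    if criterion == "maxeigen":
        return max(clusters, key=lambda name: second_eigenvalue(clusters[name]))
    raise ValueError("criterion must be 'proportion' or 'maxeigen'")


# Hypothetical eigenvalues for three clusters.
example = {
    "A": [2.4, 0.4, 0.2],
    "B": [1.2, 0.8],
    "C": [2.8, 0.9, 0.2, 0.1],
}
by_proportion = choose_cluster_to_split(example, "proportion")
by_maxeigen = choose_cluster_to_split(example, "maxeigen")
print(by_proportion, by_maxeigen)
```

The two criteria need not agree on which cluster to split, which is why I keep
them as separate branches.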
My question is: suppose I have 2 variables in a cluster, and this cluster
satisfies the criterion for splitting (e.g., it has the smallest percentage
of variation explained under the PROPORTION= option). In addition, suppose
that after performing step 2 above, I find that both variables have their
highest squared correlation with the same rotated component. My understanding
is that if, say, the first variable has its highest squared correlation with
one component and the second variable has its highest squared correlation
with the other, then the cluster is split in two. But what happens when both
variables have their highest squared correlation with the same component?
The case with 2 variables may be easy to solve, but my question extends to
the case of 3 or more variables. Again, what happens if a cluster with 3
variables satisfies the criterion for splitting, but all three variables have
their highest squared correlation with the same component? How does the
procedure 'decide' which variables go to each cluster?
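
To make the question concrete, here is a minimal sketch of how I currently
read step 2, so that small cases can be tried by hand. Everything in it is an
assumption on my part rather than the documented SAS algorithm: the function
names are invented, the eigenvectors come from plain power iteration with
deflation instead of a proper eigen solver, the quartimax angle uses the
closed form for two columns, and the same rotation is then applied to the
component weights before the squared correlations are computed. The input is
the correlation matrix of the variables in the cluster, and the sketch assumes
the two leading eigenvalues are distinct and nonzero. It is in Python only for
brevity.

```python
import math


def top_two_eigenvectors(R, iters=500):
    """Two leading eigenvectors of a correlation matrix (power iteration)."""
    p = len(R)
    A = [row[:] for row in R]
    vectors = []
    for k in range(2):
        v = [1.0 / (i + 1 + k) for i in range(p)]  # arbitrary start vector
        for _ in range(iters):
            w = [sum(A[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * A[i][j] * v[j] for i in range(p) for j in range(p))
        vectors.append(v)
        # Deflate so that the next pass finds the following eigenvector.
        A = [[A[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return vectors


def quartimax_angle(a, b):
    """Angle maximising the sum of fourth powers of the rotated entries."""
    s_sum = c_sum = 0.0
    for x, y in zip(a, b):
        u, v = x * x - y * y, 2.0 * x * y
        s_sum += 2.0 * u * v
        c_sum += u * u - v * v
    return math.atan2(s_sum, c_sum) / 4.0


def split_cluster(R):
    """Assign each variable to rotated component 0 or 1 (my reading of step 2)."""
    p = len(R)
    a, b = top_two_eigenvectors(R)
    t = quartimax_angle(a, b)
    c, s = math.cos(t), math.sin(t)
    weights = [
        [c * x + s * y for x, y in zip(a, b)],
        [-s * x + c * y for x, y in zip(a, b)],
    ]
    sq_corr = []
    for w in weights:
        # Covariance of each standardised variable with the component.
        cov = [sum(R[i][j] * w[j] for j in range(p)) for i in range(p)]
        var = sum(w[i] * cov[i] for i in range(p))  # component variance
        sq_corr.append([cv * cv / var for cv in cov])
    labels = [0 if sq_corr[0][i] >= sq_corr[1][i] else 1 for i in range(p)]
    return labels, sq_corr


# Hypothetical correlation matrices for a 2- and a 3-variable cluster.
two = [[1.0, 0.3], [0.3, 1.0]]
three = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
labels_two, sq_two = split_cluster(two)
labels_three, sq_three = split_cluster(three)
print(labels_two, labels_three)
```

Trying different correlation matrices in a sketch like this is one way to look
for inputs where every variable lands on the same component, which is the case
I am asking about.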
Thanks in advance for any hint on this.
Lars.