Hello everyone,

Many thanks to all those who replied to my email last week (see below).  I greatly appreciate it.

The most suitable method for categorical variables of this kind is multiple correspondence analysis (MCA).  I found the following references particularly useful:

Correspondence Analysis in Practice (2007), Michael Greenacre, Chapman & Hall
Biplots in Practice (2010), Michael Greenacre, available free online at www.multivariatestatistics.org

And the following links:

http://www.statsoft.com/textbook/correspondence-analysis/

marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf

http://www.utd.edu/~herve/Abdi-MCA2007-pretty.pdf

The most useful packages that I have found for analysis and plotting are:

SAS (for multiple correspondence analysis: PROC CORRESP) and
Minitab (for simple correspondence analysis).
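
As a rough illustration of the data structure that multiple correspondence analysis starts from, the sketch below builds a complete indicator matrix (one 0/1 column per category, with no reference category dropped) and computes its total inertia. It is plain Python and does not reproduce SAS or Minitab output; the function names and the small data set are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: the data and function names are assumptions,
# not taken from SAS, Minitab, or the references above.

def indicator_matrix(records):
    """Build a complete indicator matrix from rows of categorical values.

    records: list of tuples, one categorical value per variable.
    Returns (Z, columns), where columns lists (variable index, level).
    """
    n_vars = len(records[0])
    # Observed levels of each variable, sorted for a stable column order.
    levels = [sorted({rec[v] for rec in records}) for v in range(n_vars)]
    columns = [(v, lev) for v in range(n_vars) for lev in levels[v]]
    Z = [[1 if rec[v] == lev else 0 for v, lev in columns] for rec in records]
    return Z, columns


def total_inertia(Z):
    """Total inertia: chi-square statistic of the table divided by its grand total."""
    total = sum(map(sum, Z))
    row_mass = [sum(row) / total for row in Z]
    col_mass = [sum(col) / total for col in zip(*Z)]
    inertia = 0.0
    for i, row in enumerate(Z):
        for j, z in enumerate(row):
            expected = row_mass[i] * col_mass[j]
            inertia += (z / total - expected) ** 2 / expected
    return inertia


# Hypothetical data: a four-category response and a two-category variable.
records = [
    ("yes", "male"),
    ("no", "female"),
    ("confidential", "female"),
    ("don't know", "male"),
    ("yes", "female"),
]

Z, columns = indicator_matrix(records)
inertia = total_inertia(Z)

for row in Z:
    print(row)
print("total inertia:", round(inertia, 6))
```

With Q variables and J categories in total, the total inertia of an indicator matrix equals J/Q - 1, so this example (J = 6, Q = 2) gives 2. A full analysis would go on to decompose this inertia into principal axes, which is the part the packages above handle.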

Thanks again to everyone,
Kindest Regards,
Kim



-----Original Message-----
From: Kim Pearce 
Sent: 20 September 2011 11:05
To: [log in to unmask]
Subject: PCA categorical variables

Hello everyone,

I would appreciate your views on the following...

For a Principal Component Analysis, we have N subjects and p variables.  Say one of our variables is categorical (nominal), with categories 'yes', 'no', 'don't know' and 'confidential'.  Would the 4 categories be entered into the PCA as 3 binary dummy variables, i.e. x1, x2 and x3, coded perhaps like so:

                x1   x2   x3
yes             1    0    0
no              0    1    0
confidential    0    0    1
don't know      0    0    0


(where 'don't know' is the reference category), i.e. just as in regression, where a variable with q categories is entered as q-1 dummy variables?
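
The coding scheme in the table above can be sketched in a few lines of Python. This is a minimal illustration of reference (q-1) dummy coding only; the function name `dummy_code` and the sample responses are hypothetical, and the sketch does not say whether this coding is appropriate for PCA.

```python
# Illustrative sketch: reference (q-1) dummy coding for one nominal variable.
# The function name and sample responses are assumptions for illustration.

CATEGORIES = ["yes", "no", "confidential", "don't know"]
REFERENCE = "don't know"  # reference category, coded as all zeros


def dummy_code(value, categories=CATEGORIES, reference=REFERENCE):
    """Return the q-1 dummy variables (x1, x2, x3) for a single response."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    # One 0/1 indicator per non-reference category, in the listed order.
    return [1 if value == c else 0 for c in categories if c != reference]


responses = ["yes", "don't know", "confidential", "no"]
coded = [dummy_code(r) for r in responses]

for response, row in zip(responses, coded):
    print(f"{response:<14}{row}")
```

Each response maps to the corresponding row of the table, with the reference category producing a row of zeros.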

Thanks so much for your views,
Kim
