Hello everyone,

Many thanks to all those who replied to my email last week (see below). I greatly appreciate it.

The most suitable method is 'multiple correspondence analysis'.

I found the following references particularly useful:

Correspondence Analysis in Practice (2007). Chapman & Hall
Biplots in Practice (2010) (Michael Greenacre), available for free online at www.multivariatestatistics.org

And the following links:

http://www.statsoft.com/textbook/correspondence-analysis/
marketing-bulletin.massey.ac.nz/V14/MB_V14_T2_Bendixen.pdf
http://www.utd.edu/~herve/Abdi-MCA2007-pretty.pdf

The most useful packages that I have found for analysis and plotting are SAS (for multiple correspondence analysis: proc CORRESP) and Minitab (for simple correspondence analysis).

Thanks again to everyone,

Kindest Regards,
Kim

-----Original Message-----
From: Kim Pearce
Sent: 20 September 2011 11:05
To: [log in to unmask]
Subject: PCA categorical variables

Hello everyone,

I would appreciate your views on the following...

For a Principal Component Analysis, we have N subjects and p variables. Say one of our variables is categorical (nominal) with categories corresponding to either 'yes', 'no', 'don't know' or 'confidential'. Would the 4 categories be entered into the PCA as 3 dummy binary variables, i.e. x1, x2 and x3 coded, perhaps, like so:

                x1  x2  x3
yes              1   0   0
no               0   1   0
confidential     0   0   1
don't know       0   0   0

(where, here, 'don't know' is the reference category)

i.e. just as in regression, a q category variable is entered as q-1 dummy variables?

Thanks so much for your views,
Kim
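
For anyone wanting to see the coding from the original question concretely, here is a minimal Python sketch. It builds the q-1 dummy coding with 'don't know' as the reference category, and, for comparison, the full indicator coding with one column per category, which is the form multiple correspondence analysis is usually described as working on. The function names dummy_code and indicator_code are hypothetical, chosen only for this illustration; they do not come from SAS, Minitab or any other package.

```python
# Categories as listed in the original question.
CATEGORIES = ["yes", "no", "confidential", "don't know"]


def dummy_code(values, categories=CATEGORIES, reference="don't know"):
    """q-1 dummy coding: one 0/1 column per non-reference category.

    Hypothetical helper for illustration; the reference category is
    represented by a row of zeros.
    """
    levels = [c for c in categories if c != reference]
    rows = []
    for v in values:
        if v not in categories:
            raise ValueError(f"unknown category: {v!r}")
        rows.append([1 if v == level else 0 for level in levels])
    return rows


def indicator_code(values, categories=CATEGORIES):
    """Full indicator coding: one 0/1 column per category (q columns)."""
    rows = []
    for v in values:
        if v not in categories:
            raise ValueError(f"unknown category: {v!r}")
        rows.append([1 if v == c else 0 for c in categories])
    return rows


responses = ["yes", "no", "confidential", "don't know"]
dummies = dummy_code(responses)          # columns: x1 (yes), x2 (no), x3 (confidential)
indicators = indicator_code(responses)   # columns: yes, no, confidential, don't know

for response, row in zip(responses, dummies):
    print(f"{response:<14}{row}")
```

With the dummy coding the result depends on which category is chosen as the reference; the indicator coding treats all four categories symmetrically, and every row sums to 1.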