Interested in PCA plus clustering on gene expression data? An appealing topic.
It is hard to resist recommending the following article:
"Principal Component Analysis for clustering gene expression data" by Ka
Yee Yeung and Walter L. Ruzzo, Bioinformatics, 17(9):763-774, 2001
(hoping the ref. is correct). You might also be interested in other
articles on the first author's website dealing with model-based
clustering (for gene expression datasets). Though 'old', these papers
are still fundamental (personal opinion) and will help you find more
recent works, books, discussions and developments on the subject.
As mentioned in another response, a simple PubMed search or even Google
leads to these kinds of references.
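On the two numbered questions quoted below, here is a minimal sketch rather
than a definitive recipe. It assumes Python with NumPy (not a tool mentioned
in the original message) and uses synthetic data with the dimensions
described (80 animals, 100 genes). The shifted genes, the variable names and
the helper fit_logistic are hypothetical, for illustration only. It shows
that the weights of the genes in each component are the PCA loadings, and
that the component scores can serve as predictors in a logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the data in the question (NOT real expression data):
# 50 animals in group 1, 30 in group 2, 100 genes.
n1, n2, n_genes = 50, 30, 100
X = rng.normal(size=(n1 + n2, n_genes))
y = np.r_[np.zeros(n1), np.ones(n2)]
X[y == 1, :5] += 1.5  # hypothetical: the first 5 genes differ between groups

# PCA via SVD of the column-centred matrix.
gene_means = X.mean(axis=0)
Xc = X - gene_means
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)   # proportion of variance per component

k = 3
loadings = Vt[:k].T               # (genes x k): weight of each gene in each PC
scores = Xc @ loadings            # (animals x k): PC scores

# Question 1: rank genes by absolute loading on each component.
top_genes = np.argsort(-np.abs(loadings), axis=0)[:10]  # 10 gene indices per PC

# Question 2: logistic regression on the PC scores, fitted here by plain
# gradient descent so that the sketch needs nothing beyond NumPy.
def fit_logistic(Z, labels, lr=0.1, n_iter=5000):
    Z1 = np.column_stack([np.ones(len(Z)), Z])   # add intercept column
    w = np.zeros(Z1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z1 @ w)))      # predicted P(group 2)
        w -= lr * Z1.T @ (p - labels) / len(labels)
    return w

score_sd = scores.std(axis=0)
Z = scores / score_sd             # rescale scores for a stable fit
w = fit_logistic(Z, y)

p_train = 1.0 / (1.0 + np.exp(-(np.column_stack([np.ones(len(Z)), Z]) @ w)))
train_accuracy = np.mean((p_train > 0.5) == y)

print("variance explained by first 3 PCs:", explained[:k].sum())
print("top genes on PC1:", top_genes[:, 0])
print("training accuracy:", train_accuracy)
```

To classify validation animals, centre them with the training-set gene means,
project them onto the same loadings, rescale, and apply the fitted
coefficients. Keep in mind that PCA ignores the group labels, so the
components with the largest variance are not guaranteed to be the ones that
separate the two groups.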
> This is a second attempt to send this mail. I apologize if you get it twice.
> I'm trying to find patterns that differentiate between two groups of
> animals using genetic expression data of training set animals, in
> order to be able to predict to which group validation set animals
> belong. Gene expression data consists of numerical values that
> represent the expression of about 100 genes. The training set consists
> of 50 animals from group 1 and 30 animals from group 2. In order to
> reduce the complexity, I performed Principal Component Analysis (PCA)
> and have found that three components account for approx. 85% of the
> overall variance. I have two questions.
> 1. Is it possible to estimate the weights of each of the genes in
> these three components? This is important, so that in the future
> expression patterns of only the most important genes are sampled.
> 2. I still can't differentiate between the two groups. Is it possible
> to perform logistic regression using the PCA components as the
> independent variables? Is this preferable over logistic regression
> using the expression data itself?
> Can you suggest good reading material (online, peer-reviewed journals)
> that reviews clustering of gene expression data?
> Thank you very much