I hope you don't mind me regurgitating an old email message
(http://www.mailbase.ac.uk/lists/spm/1999-10/0053.html)
With PCA, you generally want to decompose your data matrix using singular
value decomposition, such that A = U*S*V', where S is diagonal and U and
V are unitary. If the height of A is greater than the width, then
this is done in Matlab by:
[U,S,V]=svd(A,0);
However, with very large datasets, it is not possible to store everything
in memory, so things need to be done a bit at a time. If a dataset consists
of n volumes, each containing m voxels, then it can be represented by the
m*n matrix A. Assuming that it is possible to store an n*n matrix in memory,
then matrices S and V can easily be derived from the matrix generated by
A'*A;
C = A'*A;
[tmp,D,V] = svd(C);
% or [V,D] = eig(C), but the eigenvalues are not sorted by
% decreasing magnitude if you do it this way.
S = diag(sqrt(diag(D))); % S = sqrtm(D);
The matrix C (above) can be computed a bit at a time (eg voxel by voxel
or plane by plane) using a scheme something like:
C = zeros(size(A,2));
for i=1:size(A,1),
C = C + A(i,:)'*A(i,:);
end;
If you want the matrix U (referred to in SPM as the eigenimages), then this
can be computed from A, V and S by:
U = A*V*diag(diag(S).^(-1));
If you just want one eigenimage at a time (one column of U), then these can
be generated by:
U = zeros(size(A));
for j=1:size(A,2),
U(:,j) = A*V(:,j)/S(j,j);
end;
or:
U = zeros(size(A));
for j=1:size(A,2),
for i=1:size(A,2),
U(:,j) = U(:,j) + A(:,i)*V(i,j)/S(j,j);
end;
end;
All the best,
-John
| Date: Mon, 08 Nov 1999 18:13:05 +0100
| X-Accept-Language: en
| MIME-Version: 1.0
| Subject: matlab, pca and memory
| From: Kaspar Schindler <[log in to unmask]>
| To: spm mail <[log in to unmask]>
| X-List: [log in to unmask]
| X-Unsub: To leave, send text 'leave spm' to [log in to unmask]
| X-List-Unsubscribe: <mailto:[log in to unmask]>
|
| Dear SPMers,
|
| My questions are only related to SPM, but I am sure some of you have
| encountered (and hopefully solved) similar problems, too, when using SPM
| - so, here we go:
| I am writing a Matlab routine to do principal component analysis on
| SPECT scans of epileptic patients. Now, when I read in the scans of lets
| say 30 subjects, each scan consisting of 128x128x40 voxels and then
| calculate the correlation matrix...the thing just gets HUGE and my
| computers out of memory. So, I went through the literature and found a
| nice algorithm created by Dr. Friston (J Cereb Blood Flow and Metab,
| 1993), but if I implement it, Matlab exceeds the allowed recursion depth
| and crashes.
| I am working with a AMD K7 600 MHz machine (128 MByte RAM), and a
| MacIntosh G3 333 MHz (350 MByte RAM) ...and both cannot handle the job.
| Does anybody know a way to implement a memory saving pca algorithm...or
| how much RAM I would need to handle such high-dimensional matrices?
| I would warmly welcome any hints and advices.
|
| Sincerely,
| Kaspar Schindler, MD, PhD
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|