Dear SPM-list,
I am using spm_get_data in some code to read out
the time-courses from particular voxels.
This function gets called a lot, as I am doing this
for all of the voxels in a volume.
I noticed that the code was taking a long time to run,
so I ran the Matlab profiler, and I found to my surprise
that 90% of the CPU time was getting taken up by
just one single line in spm_get_data:
if exist(V(i).fname,'file') ~=2
In SPM5_Updates_1782, this is line 34 of spm_get_data.
The whole of that if-statement is this:
    % check files exists, if not try pwd
    %----------------------------------------------------------------------
    if exist(V(i).fname,'file') ~=2
        [p,n,e] = fileparts(V(i).fname);
        V(i).fname = [n e];
    end
From "help exist":
EXIST('A') returns:
0 if A does not exist
1 if A is a variable in the workspace
2 if A is an M-file on MATLAB's search path. It also returns 2 when
A is the full pathname to a file or when A is the name of an
ordinary file on MATLAB's search path
In the spm_get_data code, I'm pretty sure that
what is intended is simply to check whether V(i).fname is a valid
full pathname and filename. I don't think that the intention
is for each and every brain volume that is loaded by spm_get_data,
to get searched for in the *entire matlab search path*,
but I suspect that this is what is happening.
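One way to test this suspicion in isolation is to time the exist call on its own against a check that only looks at the given path. The following is a minimal sketch, not taken from SPM: it assumes a throwaway file in the system temp directory stands in for an image volume, and it uses dir as the comparison because dir does not consult the Matlab search path. The timings will depend on the length of your search path, on whether the filename is a full path, and on the filesystem, so no particular result is implied.

```matlab
% Create a small throwaway file so the test does not depend on SPM data.
% (The '.img' extension is only for illustration.)
fname = [tempname '.img'];      % full path in the system temp directory
fid = fopen(fname, 'w');
fwrite(fid, 0, 'uint8');
fclose(fid);

nreps = 1000;

% Time the check that spm_get_data uses.
tic;
for k = 1:nreps
    found_exist = (exist(fname, 'file') == 2);
end
t_exist = toc;

% Time a directory listing of the same file, which only looks at the given path.
tic;
for k = 1:nreps
    d = dir(fname);
    found_dir = ~isempty(d);
end
t_dir = toc;

fprintf('exist: %.3f s   dir: %.3f s\n', t_exist, t_dir);

% Clean up the throwaway file.
delete(fname);
```

If exist comes out much slower here as well, that would support the idea that the path search is the bottleneck; if not, the cost may lie elsewhere (for example in how the filenames in SPM.xY.VY are stored).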
When I commented out that if-then clause,
my code ran ten times faster!
Here's the output on my machine of a simple test.
To try it yourself, go into the analysis directory
of any individual subject.
>> load SPM.mat
>> XYZ = [ 30 30 10 ]'; % Some arbitrary voxel in the brain
>> tic;for i=1:100,y = spm_get_data(SPM.xY.VY,XYZ);end;toc
Elapsed time is 47.160028 seconds.
>> %%% Now comment out lines 34-37 of spm_get_data
>> tic;for i=1:100,y = spm_get_data(SPM.xY.VY,XYZ);end;toc
Elapsed time is 3.953190 seconds.
This is in Matlab Version 7.4.0.287 (R2007a) on Mac OS X 10.5.5,
by the way. Your mileage may vary.
Of course, it's good to have some kind of check
that the file is there, but I'm sure that there are
much more efficient ways of doing it,
especially if it is indeed the case that Matlab
ends up searching its entire path-tree for every volume.
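As an illustration of what a cheaper check could look like, here is a minimal sketch of a helper based on dir. The function name local_file_exists is hypothetical and not part of SPM. Note that it is not a drop-in equivalent of exist: a bare filename that can only be found via the Matlab search path would pass the original check but fail this one, and a name containing wildcard characters would be treated as a pattern by dir.

```matlab
function tf = local_file_exists(fname)
% LOCAL_FILE_EXISTS  True if fname names an ordinary file.
%   Hypothetical helper, not part of SPM. It uses a directory listing,
%   so it looks only at the path given (or the current directory for a
%   bare filename) and never at the Matlab search path.
d  = dir(fname);
% A single non-directory entry means fname is a file. A directory name
% returns its contents instead (starting with '.'), so it is rejected.
tf = (numel(d) == 1) && ~d(1).isdir;
```

With that saved as local_file_exists.m, the check in spm_get_data could be written as "if ~local_file_exists(V(i).fname)", leaving the body of the if-statement unchanged.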
Given that spm_get_data is pretty central to a lot
of SPM functions, removing or replacing this time-hog
"if exist" line might be quite useful.
It certainly sped up my code by a huge amount.
This is just a suggestion, and it's possible that what
seems to me like an unnecessary if-exist check
might actually be much less dispensable than it looks.
I'd be interested to hear what people think.
Raj