Thanks Cinly, that all makes sense. In fact, the minor update of FSL that
will be released shortly calls FLAME separately for each slice when the
data is large, so it would be quite easy to adapt that to make 64 parallel
calls to FLAME.
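
As a rough illustration of that idea (a sketch only, not the FSL code), farming independent per-slice jobs out in parallel could look like the Python below. The per-slice command here is a placeholder that just prints a message, since the real FLAME command line is not given in this thread, and the names slice_command, run_slice and run_all_slices are made up for the example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def slice_command(slice_index):
    # Placeholder (assumption): stands in for one single-threaded
    # per-slice job. Swap in the real command line for your analysis.
    return [sys.executable, "-c", f"print('slice {slice_index} done')"]


def run_slice(slice_index):
    # Each call starts a separate OS process, so the operating system
    # is free to schedule it on any idle CPU.
    result = subprocess.run(
        slice_command(slice_index),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_all_slices(n_slices, n_workers=64):
    # The threads only wait on their child processes; the actual work
    # happens in the children, at most n_workers at a time.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(run_slice, range(n_slices)))


if __name__ == "__main__":
    for line in run_all_slices(n_slices=8, n_workers=4):
        print(line)
```

Setting n_workers to the number of CPUs you are allowed to use keeps every processor busy as long as there are slices left to process.
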
Thanks, Steve.
On Fri, 2 May 2003, Cinly Ooi wrote:
> Dear Darren,
>
> (I was writing this email when Stephen Smith replied. The content is
> essentially the same but I decided to send this email anyway)
>
> I do not write code for FSL, but the explanation below comes from my
> experience writing program code, and if my assumptions (listed below)
> are correct, your result comes as no surprise.
>
>
> First, my assumptions:
> (0) The FSL team has not written the code for parallel processing. This
> is important: if they had parallelized the code, then the result would
> really be a surprise! (Footnote: with reference to Smith's email, I
> think they have not parallelized the code.)
> (1) You are using the FSL program as it is, i.e., with no attempt to
> parallelize the code.
>
> The actual answer to your question is complex; it depends on the design
> of the supercomputer you are using.
>
> To answer your question:
> from line: 64 CPUs: 98.4% idle, 1.6% usr, 0.0% ker, 0.0% wait, 0.0%
> xbrk,
> and the breakdown table
> Yes, it does look like 62 CPUs are idle, and either:
> (a) one processor is working frantically (film_gl, 99.86%) while another
> was given a meagre and insignificant job (top, 0.83%), or
> (b) most probably, only one processor is working (1.6% of 64 CPUs is
> about one CPU), meaning the 63rd processor is also idle.
>
>
> The litmus test to me is the processing time. Roughly speaking, for a
> truly parallelized program, you would expect the processing time to be
> about 1/62 of that on a single-processor system. (It's not 1/64 because
> of parallelization overhead, which I have generously taken to be
> equivalent to two processors full time.)
>
> I suspect you will find that the processing time is equivalent to that
> of a single-processor system with the same type of processor.
>
> In this case, yes, using a 64-processor system does not speed up your
> processing, as only the equivalent of one processor will be working
> for you at any one time.
>
> This is not unlike the situation I have here: I have a twin-processor
> system, and for most programs, if I run only one instance, I expect
> one processor to be sitting idle.
>
> The reason is that FSL is not written for parallel processing. (As a
> matter of fact, neither is BAMM nor the vanilla flavour of SPM.) To
> harness all processors on a single task, in most cases the program
> code must contain explicit instructions on how to do it. Most of the
> time, this requires the programmer to explicitly code the
> parallelization into the program. As parallel programs are not very
> easy to write, and parallel computers are not that common, most
> programmers, including me, would not have bothered. It is a problem of
> too much work, too little benefit.
>
>
> A simple solution, exactly what I do with my twin-processor system, is
> to push 64 film_gl processes through the supercomputer in parallel. In
> this case, I am pretty confident that all 64 processors will be working
> frantically for you. Having said that, why not just ask your colleagues
> to loan you their single-processor computers instead of booking time on
> a supercomputer?
>
> There is another solution, suggested to me by Liverpool University
> Computing Services when I went up to MARIAC for a job interview. It is
> applicable because FSL has a command-line interface. It is more
> difficult with SPM batch mode, but still possible. Assume your
> complete analysis requires the programs to be run in the sequence
> ABCDEF, and each program outputs its data as files with unique names.
> Then it is a relatively simple task to turn the sequence into a
> pipeline by having each program listen for its inputs and, when all
> inputs are available, execute its task. With the example sequence,
> initially only program A will be processing dataset1, churning out
> result1. As soon as program A completes, program B will read and
> process result1, and program A will start processing dataset2, and so
> on. The idea is to let each program occupy one CPU, and to achieve
> parallel processing by starting on the next dataset before the current
> one is completed. However, to fully utilize 64 processors, you would
> need 64 programs. Also, as this is a pipeline, the 6 processors in the
> example will only kick in sequentially, meaning processing will not be
> as fast as it would be with a truly parallelized program. This means it
> may not be worthwhile programming this pipeline if you only have 12
> datasets, and certainly not if you have fewer than 6.
>
> Hope this helps,
> Cinly
>
> Darren Schreiber wrote:
>
> > In an attempt to speed things up, I tried using FSL on a supercomputer
> > we have on campus. Here is a view from "top":
> >
> >
> > IRIX64 inire 6.5 IP35 load averages: 1.00 0.71 0.32 03:09:35
> > 184 processes: 180 sleeping, 2 zombie, 2 running
> > 64 CPUs: 98.4% idle, 1.6% usr, 0.0% ker, 0.0% wait, 0.0% xbrk, 0.0% intr
> > Memory: 32G max, 31G avail, 20G free, 4096M swap, 4096M free swap
> >
> > PID    PGRP   USERNAME PRI SIZE  RES   STATE  TIME WCPU% CPU%  COMMAND
> > 122536 122279 dschreib 20  122M  117M  run/36 2:00 11.7  99.86 film_gl
> > 122555 122555 dschreib 20  2288K 1344K run/32 0:00 0.4   0.83  top
> >
> >
> > What I find interesting is that I am leaving the 64 processors 98%
> > idle, while the CPU% used by film is 99.86%.
> >
> > Is this because FSL is working hard on one processor, but leaving the
> > others inactive? Is there anything I can do here to speed things up?
> >
> > As it is, it looks like my happy little laptop can get the first level
> > analyses done in about the same amount of time.
> >
> > Darren
> >
> > .
> >
>
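
The ABCDEF pipeline described in the quoted message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stages here are stand-in functions that append a letter to a string, and the names stage_worker and run_pipeline are invented for the example. In real use each stage would launch an external program (for instance with subprocess), so that every stage occupies its own CPU; plain Python functions running in threads would not do that.

```python
import queue
import threading


def stage_worker(func, inbox, outbox):
    # Wait for input, process it, and hand the result to the next stage.
    # None is used as an end-of-data marker, so stage functions in this
    # sketch must never return None.
    while True:
        item = inbox.get()
        if item is None:
            outbox.put(None)
            break
        outbox.put(func(item))


def run_pipeline(stages, datasets):
    # One queue in front of each stage, plus one for the final results.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=stage_worker, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for worker in workers:
        worker.start()

    # Feed the datasets to the first stage, then signal the end.
    for dataset in datasets:
        queues[0].put(dataset)
    queues[0].put(None)

    # Collect results from the last stage until the end marker arrives.
    results = []
    while True:
        item = queues[-1].get()
        if item is None:
            break
        results.append(item)

    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    # Stand-in stages A..F: each one just tags the data with its letter.
    stages = [lambda data, name=name: data + name for name in "ABCDEF"]
    print(run_pipeline(stages, ["dataset1_", "dataset2_"]))
```

While stage B is working on the first dataset, stage A is already free to start on the second, which is the overlap the pipeline relies on.
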
Stephen M. Smith MA DPhil CEng MIEE
Associate Director, FMRIB and Analysis Research Coordinator
Oxford University Centre for Functional MRI of the Brain
John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK
+44 (0) 1865 222726 (fax 222717)
http://www.fmrib.ox.ac.uk/~steve