Dear Steve and all,
Interesting ... Looking forward to it.
Best regards,
Cinly
Stephen Smith wrote:
>Thanks Cinly, that all makes sense. In fact, the minor update of FSL that
>will be released shortly calls FLAME separately for each slice when the
>data is large, so it would be quite easy to adapt that to make 64 parallel
>calls to FLAME.
>
> Thanks, Steve.
>
>
>On Fri, 2 May 2003, Cinly Ooi wrote:
>
>
>
>>Dear Darren,
>>
>>(I was writing this email when Stephen Smith replied. The content is
>>essentially the same but I decided to send this email anyway)
>>
>>I do not write program code for FSL, but the explanation below comes
>>from my experience writing program code, and if my assumptions (listed
>>below) are correct, the result comes as no surprise.
>>
>>
>>First, my assumptions:
>>(0) The FSL team has not written the code for parallel processing. This
>>is important: if they had parallelized the code, then the result would
>>really be a surprise! (Footnote: with reference to Smith's email, I think
>>they have not parallelized the code.)
>>(1) You are using the FSL program as it is, i.e., with no attempt to
>>parallelize the code.
>>
>>The actual answer to your question is complex: it depends on the design
>>of the supercomputer you are using.
>>
>>To answer your question:
>>From the line "64 CPUs: 98.4% idle, 1.6% usr, 0.0% ker, 0.0% wait,
>>0.0% xbrk" and the breakdown table:
>>Yes, it does look like 62 CPUs are idle, and either:
>>(a) one processor is working frantically (film_gl, 98.4%) while another
>>was given a meagre and insignificant job (1.6%), or
>>(b) most probably, only one processor is working (1.6% of 64 CPUs is
>>roughly one CPU), meaning the 63rd processor is also idle.
>>
>>
>>The litmus test to me is the processing time: roughly speaking, for a
>>truly parallelized program, you would expect the processing time to be
>>about 1/62 of that on a single-processor system. (It is not 1/64 because
>>of parallelization overhead, which I have generously taken to be
>>equivalent to two processors working full time.)
>>
>>I suspect you will find that the processing time is equivalent to that
>>of a single-processor system with the same type of processor.
>>
>>In this case, yes, using a 64-processor system does not speed up your
>>processing, as only the equivalent of one processor will be working
>>for you at any one time.
>>
>>This is not unlike the situation I have here: I have a twin-processor
>>system, and for most programs, if I run only one instance, I
>>expect one processor to be sitting idle.
>>
>>The reason is that FSL is not written for parallel processing. (As a
>>matter of fact, neither is BAMM nor the vanilla flavour of SPM.) To
>>harness all processors on a single task, in most cases the
>>program code must contain explicit instructions on how to do it. Most of
>>the time, this requires the programmer to explicitly code the
>>parallelization into the program. As parallel programs are not very easy
>>to write, and parallel computers are not that common, most
>>programmers, including me, would not have bothered. It is a problem of
>>too much work for too little benefit.
>>
>>
>>A simple solution, exactly what I do with my twin-processor
>>system, is to push 64 film_gl processes in parallel through the
>>supercomputer. In this case, I am pretty confident that all 64
>>processors will be working frantically for you. Having said that, why
>>not just ask your colleagues to lend you their single-processor
>>computers instead of booking time on a supercomputer?
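A minimal Python sketch of this "many independent processes at once" approach follows. It makes no claim about how film_gl is invoked: `placeholder_command` is a hypothetical stand-in that only prints a message, and `run_job` and `run_all` are names invented for this illustration. The operating system is left to schedule the child processes across the available CPUs.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def placeholder_command(job_id):
    # Hypothetical stand-in for one real analysis command line.
    # Replace with the actual program and arguments for dataset job_id.
    return [sys.executable, "-c", f"print('job {job_id} done')"]


def run_job(cmd):
    """Run one independent command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def run_all(commands, max_workers):
    """Run all commands, at most max_workers at a time.

    Each thread only waits on its child process, so the heavy work
    happens in separate OS processes. Exit codes come back in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, commands))


if __name__ == "__main__":
    commands = [placeholder_command(i) for i in range(8)]
    print(run_all(commands, max_workers=4))
```

On a 64-processor machine one would presumably set `max_workers=64` and supply 64 real command lines, one per dataset or slice. A non-zero exit code in the returned list marks a job that failed.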
>>
>>There is another solution, suggested to me by Liverpool University
>>Computing Services when I went up to MARIAC for a job interview. It is
>>applicable because FSL has a command line interface. It is more
>>difficult with SPM batch mode, but it is still possible. Assume your
>>complete analysis requires the programs to be run in the sequence
>>ABCDEF, and each program writes its output to files with unique names.
>>Then it is a relatively simple task to turn the sequence into a pipeline
>>by having each program listen for its inputs and, when all inputs
>>are available, execute its task. With the example sequence, initially
>>only program A will be processing dataset1, churning out result1. As
>>soon as program A completes, program B will read and process result1,
>>program A will start processing dataset2, and the process continues.
>>The idea is to let each program occupy one CPU, and to achieve
>>parallel processing by starting on the next dataset before the current
>>one is completed. However, to fully utilize 64 processors, you will need
>>64 programs. Also, as this is a pipeline, the 6 processors in the
>>example will only kick in sequentially, meaning the processing
>>will not be as fast as it would be with a truly parallelized program.
>>This means it may not be worthwhile programming this pipeline if you only
>>have 12 datasets, and certainly not if you have fewer than 6 datasets.
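The pipeline idea above can be sketched in Python as follows. This is only an illustration under stated assumptions: stages pass results through in-memory queues rather than watching for output files, one worker thread stands in for each program A to F, and the names `stage`, `run_pipeline` and `SENTINEL` are invented here. In real use each stage function would launch an external program, so that the stages genuinely occupy separate CPUs.

```python
import queue
import threading

SENTINEL = None  # marks the end of the dataset stream; datasets must not be None


def stage(func, inbox, outbox):
    """Wait for inputs, apply func to each, and pass results to the next stage."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # tell the next stage to stop too
            return
        outbox.put(func(item))


def run_pipeline(funcs, datasets):
    """Chain one worker per stage; return final results in input order."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()
    for d in datasets:
        queues[0].put(d)
    queues[0].put(SENTINEL)

    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    # Six toy stages that each append their own letter to a dataset label.
    funcs = [lambda s, tag=tag: s + tag for tag in "ABCDEF"]
    print(run_pipeline(funcs, ["dataset1:", "dataset2:"]))
```

While stage B works on the first dataset, stage A is already free to start on the second, which is the overlap described above. With only a few datasets the later stages sit idle for most of the run, which is why the approach pays off mainly for long streams of data.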
>>
>>Hope this helps,
>>Cinly
>>
>>Darren Schreiber wrote:
>>
>>
>>
>>>In an attempt to speed things up, I tried using FSL on a supercomputer
>>>we have on campus. Here is a view from "top":
>>>
>>>
>>>IRIX64 inire 6.5 IP35 load averages: 1.00 0.71 0.32 03:09:35
>>>184 processes: 180 sleeping, 2 zombie, 2 running
>>>64 CPUs: 98.4% idle, 1.6% usr, 0.0% ker, 0.0% wait, 0.0% xbrk, 0.0% intr
>>>Memory: 32G max, 31G avail, 20G free, 4096M swap, 4096M free swap
>>>
>>>PID    PGRP   USERNAME PRI SIZE  RES   STATE  TIME WCPU% CPU%  COMMAND
>>>122536 122279 dschreib 20  122M  117M  run/36 2:00 11.7  99.86 film_gl
>>>122555 122555 dschreib 20  2288K 1344K run/32 0:00 0.4   0.83  top
>>>
>>>
>>>What I find interesting is that I am leaving the 64 processors 98%
>>>idle, while the CPU% used by film is 99.86%.
>>>
>>>Is this because FSL is working hard on one processor, but leaving the
>>>others inactive? Is there anything I can do here to speed things up?
>>>
>>>As it is, it looks like my happy little laptop can get the first level
>>>analyses done in about the same amount of time.
>>>
>>> Darren
>>>
>>>
>>>
>>>
>
> Stephen M. Smith MA DPhil CEng MIEE
> Associate Director, FMRIB and Analysis Research Coordinator
>
> Oxford University Centre for Functional MRI of the Brain
> John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK
> +44 (0) 1865 222726 (fax 222717)
>
> [log in to unmask] http://www.fmrib.ox.ac.uk/~steve
>