Hi Derek,

These are different types of parallelisation. bedpostx, randomise_parallel etc. all solve problems that are essentially a large number of 1D problems. Hence it is relatively straightforward to divide the work up into slices (or whichever chunk one prefers) and then send these to different nodes.
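
To illustrate the "divide it up into chunks" idea, here is a minimal Python sketch. It is not how bedpostx or randomise_parallel are actually implemented; the function name split_into_chunks and the item counts are hypothetical, and it only shows how a set of 1D problems could be split into contiguous index ranges, one per node.

```python
def split_into_chunks(n_items, n_chunks):
    """Return (start, stop) index ranges covering n_items as evenly as possible.

    Hypothetical helper: each range could be handed to a different node.
    """
    base, extra = divmod(n_items, n_chunks)
    ranges = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more chunks requested than items available
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for start, stop in split_into_chunks(10, 3):
        print(f"items {start}..{stop - 1}")
```

Because no chunk needs anything from another chunk, each one can run on a different node without sharing memory.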

Eddy on the other hand is a 4D problem, and hence cannot be divided up like that. Therefore it uses OpenMP instead, which means that all threads have to reside on the same node and have access to the same heap/RAM. The work will be divided up over all the cores on the node it is submitted to, hence speeding things up. BUT, if you submit several eddy jobs on the same node they will all compete for the cores on that node and it will be no faster than if you had submitted the eddy jobs sequentially.

If you want to limit the number of cores that an OpenMP job can bogart you can set the environment variable OMP_NUM_THREADS to a suitable number. So for example if you set OMP_NUM_THREADS to 4, you could have six eddy jobs happily using 4 cores each on your 24 core machine. But unless you have sufficient RAM to run all of those eddy processes at the same time it might still be a better idea to run them sequentially, letting each job use all 24 cores.
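
The arithmetic above can be sketched in Python. This is only an illustration under stated assumptions: the helper names plan_eddy_jobs and job_environment are hypothetical, and the RAM figures are made up for the example, since how much memory an eddy process needs depends on your data. The only real setting involved is the OMP_NUM_THREADS environment variable.

```python
import os


def plan_eddy_jobs(total_cores, threads_per_job, total_ram_gb, ram_per_job_gb):
    """Return how many jobs can run at once, limited by both cores and RAM.

    Hypothetical helper; ram_per_job_gb is an assumption you must measure yourself.
    """
    by_cores = total_cores // threads_per_job
    by_ram = int(total_ram_gb // ram_per_job_gb)
    # Never report fewer than one job: the fallback is to run them sequentially.
    return max(1, min(by_cores, by_ram))


def job_environment(threads_per_job):
    """Copy the current environment and cap the OpenMP thread count for one job."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(threads_per_job)
    return env


if __name__ == "__main__":
    # 24 cores, 4 threads per job, with assumed 64 GB total and 8 GB per job.
    print(plan_eddy_jobs(24, 4, 64, 8))
    print(job_environment(4)["OMP_NUM_THREADS"])
```

The returned environment can be passed as the env argument of subprocess.run together with whatever eddy command line you normally use. If the RAM limit brings the count down to one, that corresponds to the sequential case where each job gets all the cores.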

Jesper

On 31 Aug 2015, at 17:25, Derek Pisner <[log in to unmask]> wrote:

> The Eddy documentation says that eddy has been configured to utilize OpenMP, but when I try to submit eddy as a parallel job using SGE, the job always sends to one queue on one node and overloads all of my CPUs (I have a 24 core machine). Is there currently a beta eddy_parallel command floating around somewhere? Or does fsl_sub need to be modified in some further way to allow Eddy to round_robin across multiple nodes like bedpostx, randomise_parallel, and other fsl parallelization functions do?
> 
> Thanks,
> Derek