In general, FSL uses SGE (Sun Grid Engine) to split tasks into separate jobs and send them to different cores.
bedpostx_gpu does the same through SGE: it can split a dataset into several parts and send each one to a different GPU.
To do that, you will need to create a dedicated queue for GPU jobs, with as many slots as GPUs (2 in your case).
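
As a rough illustration of the idea (a dataset split into parts, each part handed to one of the GPU slots), here is a minimal Python sketch. It is not the code used by bedpostx_gpu or SGE: the function names, the contiguous chunking and the round-robin assignment are all assumptions made for illustration. In practice SGE dispatches each queued job to whichever slot of the GPU queue is free.

```python
# Illustrative only: hypothetical helpers, not part of FSL or SGE.

def split_into_parts(n_voxels, n_parts):
    """Split range(n_voxels) into n_parts contiguous (start, stop) chunks
    whose sizes differ by at most one."""
    if n_parts <= 0:
        raise ValueError("n_parts must be positive")
    base, extra = divmod(n_voxels, n_parts)
    parts = []
    start = 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        parts.append((start, start + size))
        start += size
    return parts


def assign_to_gpus(parts, n_gpus):
    """Assign parts to GPU slots round-robin: part i goes to slot i % n_gpus."""
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    schedule = {gpu: [] for gpu in range(n_gpus)}
    for i, part in enumerate(parts):
        schedule[i % n_gpus].append(part)
    return schedule


if __name__ == "__main__":
    # 10 items split into 4 parts, spread over 2 GPU slots.
    parts = split_into_parts(n_voxels=10, n_parts=4)
    for gpu, chunks in assign_to_gpus(parts, n_gpus=2).items():
        print(f"GPU slot {gpu}: {chunks}")
```

With two slots in the queue, at most two parts are processed at the same time, one per GPU; the remaining parts wait until a slot becomes free.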

Here you can read more about FSL and SGE:
https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FslSge

I think Son of Grid Engine is free, so you do not need to order it, but you will need to install it:
https://arc.liv.ac.uk/trac/SGE

Moises.

On 16 February 2018 at 06:56, Wu chen wrote:
Dear Moises,

I have one quick follow-up question.

I have read about multi-GPU and it requires SGE. Does SGE come with the GPUs, or do I have to order it with the multi-GPU machine? As you suggested, we are thinking of ordering two Tesla-family GPUs, but I am not sure about SGE. Could you tell me if it comes with CentOS?

Many thanks once again.



On 16 February 2018 at 12:47, Wu chen wrote:
Thanks a lot, Moises, for your valuable comments and your precious time. I really appreciate it.

Best, Wu

On 15 February 2018 at 16:54, Moises Hernandez wrote:
Hi Wu.

1) Do you think the above-mentioned CPUs and GPUs would be able to run CUDA and BEDPOSTX? Is it appropriate, or should we look for some other configuration?

Yes, it will work, but I do not recommend Quadro GPUs. They are expensive and their performance is not as good as Tesla's, especially in double precision.
I prefer the Tesla family and, if that is not possible, GeForce (not as fast as Tesla, but more affordable than Quadro).

2) Do you think it is better to have 2 CPUs and 2 GPUs for quicker image pre-processing? Or is there no real difference in the results compared to only one CPU and one GPU?

Each GPU needs one CPU core, and I believe each of these processors has 10 cores. So one processor is enough, although you can still use the remaining CPU cores to run other tasks in parallel.
I am not sure about the memory buses (for copying data CPU <-> GPU). Take a look at the configuration: PCI Express, NVLink...
In any case, bedpostx does not need to copy much data, so it is not a problem if these transfers happen sequentially.

3 & 4) Yes, Ubuntu is fine for bedpostx_gpu. However, if you want to make installing FSL easier, use CentOS (the one used for the FSL distribution).

Moises.

On 15 February 2018 at 10:22, Wu chen wrote:
Dear FSL Users and Experts,


We are thinking of ordering the following configuration for a CUDA implementation, and I have some questions about it.

First Processor
- Xeon E5-2630V4 2.2 GHz 20MB Turbo Boost (S26361-F5001-E220)

Second Processor
- 2ND XEON E5-2630V4 (S26361-F5002-E230)

RAM
- 16 x 16 GB = 256GB RAM

HDD SATA III
- 2TB x 2 = 4TB

First Graphics Card
- NVIDIA QUADRO M5000 8GB (S26361-F2222-E593)

Second Graphics Card
- NVIDIA QUADRO M5000 8GB (S26361-F2222-E990)




1) Do you think the above-mentioned CPUs and GPUs would be able to run CUDA and BEDPOSTX? Is it appropriate, or should we look for some other configuration? Please suggest a configuration if you disagree with ours.

2) Do you think it is better to have 2 CPUs and 2 GPUs for quicker image pre-processing? Or is there no real difference in the results compared to only one CPU and one GPU?

3) Which CentOS version and CUDA version would you suggest installing on the above configuration?

4) Can we run the CUDA implementation on Ubuntu?

I really appreciate your time and effort in making our lab faster and more powerful.

Many Thanks
Best, Wu