Hi
As you suggested, the way jobs are assigned to cluster nodes is determined by your cluster's job scheduler, so this is best raised with your system administrator.
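For the two GPUs within a single node, one thing that may be worth trying is to restrict each command to its own device instead of exporting CUDA_VISIBLE_DEVICES=0,1 for both. The following is only a minimal sketch: run_on_gpu is a hypothetical helper (not part of FSL), the subject paths are placeholders, and it assumes that bedpostx_gpu does not reset CUDA_VISIBLE_DEVICES or select a device on its own. CUDA_VISIBLE_DEVICES itself is a standard CUDA environment variable that limits which devices a process can see.

```shell
#!/bin/bash
# Hypothetical helper (not part of FSL): run a command with a single GPU visible.
# The CUDA runtime only exposes the devices listed in CUDA_VISIBLE_DEVICES,
# so the launched process cannot use any other GPU on the node.
run_on_gpu() {
    local gpu_id="$1"   # physical GPU index on the node, e.g. 0 or 1
    shift               # the remaining arguments are the command to run
    CUDA_VISIBLE_DEVICES="$gpu_id" "$@"
}

# Example usage inside the PBS script (paths are placeholders):
# run_on_gpu 0 bedpostx_gpu /path/to/subjectA/Diffusion -n 3 -model 2 -g --rician --cnonlinear &
# run_on_gpu 1 bedpostx_gpu /path/to/subjectB/Diffusion -n 3 -model 2 -g --rician --cnonlinear &
# wait
```

This only addresses GPU selection within one node; whether commands are started on a second node at all remains a scheduler matter.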
Cheers
Stam
On 11 May 2015, at 08:08, Estephan Moana <[log in to unmask]> wrote:
> I'm using bedpostx_gpu on a computing cluster where each node has 2 GPUs available (the cluster does not use SGE). I have followed the recommendations listed here: <http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FslInstallation#Running_bedpostX_on_a_GPU_or_GPU_cluster>. I submitted a job using a PBS script so that each computing node processes 2 bedpostx_gpu commands, and I assumed that each command would be assigned to one GPU. However, judging by the processing time compared to a single command, I'm under the impression that both bedpostx_gpu commands are being run on only one of the 2 GPUs available to a particular node. In fact, if I submit 4 bedpostx_gpu commands to 2 nodes (x2 GPUs), it seems that all four commands are run on a single GPU, which is odd since my script called for 2 computing nodes.
>
> Here is one example PBS script I used to submit 2 bedpostx_gpu commands to the cluster:
>
> #!/bin/bash -l
> #PBS -l nodes=1:ppn=24:gpus=2,walltime=5:00:00
> #PBS -m abe
> #PBS -M [log in to unmask]
> #PBS -q k40
> module load cuda/5.5
> module load fsl/5.0.8
> export CUDA_VISIBLE_DEVICES=0,1
> bedpostx_gpu /home/moanae/moanae/project_HCP/subjs/102311_20000101/T1w/Diffusion -n 3 -model 2 -g --rician --cnonlinear 1> /home/moanae/moanae/project_HCP/subjs/102311_20000101/T1w/Diffusion/LOGFILE_102311_20000101_bedpostx_stdout.txt 2> /home/moanae/moanae/project_HCP/subjs/102311_20000101/T1w/Diffusion/LOGFILE_102311_20000101_bedpostx_stderr.txt &
> bedpostx_gpu /home/moanae/moanae/project_HCP/subjs/102816_20000101/T1w/Diffusion -n 3 -model 2 -g --rician --cnonlinear 1> /home/moanae/moanae/project_HCP/subjs/102816_20000101/T1w/Diffusion/LOGFILE_102816_20000101_bedpostx_stdout.txt 2> /home/moanae/moanae/project_HCP/subjs/102816_20000101/T1w/Diffusion/LOGFILE_102816_20000101_bedpostx_stderr.txt &
> wait
>
> My question is: does the task of assigning bedpostx_gpu to a particular GPU within a computing node rely on the cluster job scheduler, or is it part of the bedpostx_gpu code?
>
> Any suggestions on how to get bedpostx_gpu to run separately on each of the 2 GPUs available to a node would be greatly appreciated. Thank you.
>
> Estephan