Hi!
I have started to modify fsl_sub for Slurm, but it now seems that I
also need to modify other parts of FSL because the output of the Slurm
commands is different.
Where in FSL are the dependencies between SGE tasks processed? If I have
understood correctly, the last line of output from the previous command
(at least in the FEAT logs) is treated as the job ID, and this ID is then
passed to the next job so that it waits before starting. The original FSL
code now returns the first word of that line, which is not a number,
because Slurm's output is different.
In Slurm, the last line of output after srun/sbatch (which correspond to
qsub) looks like this on the command line:
srun: job 211205 has been allocated resources
This produces the following output in the Slurm environment (from the FEAT logs):
/home/pajula2/FSL_test/fsl/bin/fsl_sub -T 10 -l logs -N feat0_init /home/pajula2/FSL_test/fsl/bin/feat /home/pajula2/tests/test1+++++++++++++++++.feat/design.fsf -D /home/pajula2/tests/test1+++++++++++++++++.feat -I 1 -init
SLURM
Starting Slurm submissions...
sge_command: srun --chdir=/home/pajula2/tests/test1+++++++++++++++++.feat -p sgn -J feat0_init --output=logs/log.o --error=logs/log.e -t 01:00:00 -N 1
executing: /home/pajula2/FSL_test/fsl/bin/feat /home/pajula2/tests/test1+++++++++++++++++.feat/design.fsf -D /home/pajula2/tests/test1+++++++++++++++++.feat -I 1 -init
srun: job 211193 queued and waiting for resources
srun: job 211193 has been allocated resources
/home/pajula2/FSL_test/fsl/bin/fsl_sub -T 30 -l logs -N feat1b_reg -j srun: /home/pajula2/FSL_test/fsl/bin/feat /home/pajula2/tests/test1+++++++++++++++++.feat/design.fsf -D /home/pajula2/tests/test1+++++++++++++++++.feat -I 1 -reg
SLURM
Starting Slurm submissions...
sge_command: srun --chdir=/home/pajula2/tests/test1+++++++++++++++++.feat -p sgn -J feat1b_reg --output=logs/log.o --error=logs/log.e --dependency=afterok:srun: -t 01:00:00 -N 1
executing: /home/pajula2/FSL_test/fsl/bin/feat /home/pajula2/tests/test1+++++++++++++++++.feat/design.fsf -D /home/pajula2/tests/test1+++++++++++++++++.feat -I 1 -reg
srun: error: Unable to allocate resources: Job dependency problem
/home/pajula2/FSL_test/fsl/bin/fsl_sub -T 8 -l logs -N feat2_pre -j srun::srun: /home/pajula2/FSL_test/fsl/bin/feat /home/pajula2/tests/test1+++++++++++++++++.feat/design.fsf -D /home/pajula2/tests/test1+++++++++++++++++.feat -I 1 -prestats
SLURM
Starting Slurm submissions...
sge_command: srun --chdir=/home/pajula2/tests/test1+++++++++++++++++.feat -p sgn -J feat2_pre --output=logs/log.o --error=logs/log.e --dependency=afterok:srun::srun: -t 01:00:00 -N 1
executing: /home/pajula2/FSL_test/fsl/bin/feat /home/pajula2/tests/test1+++++++++++++++++.feat/design.fsf -D /home/pajula2/tests/test1+++++++++++++++++.feat -I 1 -prestats
srun: error: Unable to allocate resources: Job dependency problem
Where can I find this ID processing step?
I have already found where the dot is placed between the different IDs
(here replaced by a colon), but I cannot find the source of the IDs. Are
the IDs originally collected from the output of the qsub command or from
some variable?
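As a rough sketch of the kind of parsing that seems to be needed (this is
not FSL code; the function name extract_slurm_jobid is made up for
illustration, and it assumes the status line has the form shown above),
the numeric ID could be pulled out of the Slurm message like this:

```shell
#!/bin/sh
# Hypothetical helper, not part of FSL: extract the numeric job ID from a
# Slurm status line such as
#   "srun: job 211205 has been allocated resources"
extract_slurm_jobid() {
    # Keep only the digits that directly follow the word "job";
    # print nothing if the line contains no such number.
    printf '%s\n' "$1" | sed -n 's/^.*job \([0-9][0-9]*\).*$/\1/p' | head -n 1
}

extract_slurm_jobid "srun: job 211205 has been allocated resources"
```

Joined with colons, IDs extracted this way would give a value of the form
--dependency=afterok:211193:211205 instead of --dependency=afterok:srun:.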
--
Juha Pajula,
Researcher, Ph.D. Student,
Methods and Models for Biological Signals and Images group of Signal
Processing department in Tampere University of Technology,
Finland
On 03/23/13 09:53, Mark Jenkinson wrote:
> Hi,
>
> We don't have any experience using Slurm in Oxford but maybe someone else on the list does.
> As for CUDA, the current release doesn't support this but we do have CUDA code running in-house at the moment and are intending to release this quite soon.
>
> All the best,
> Mark
>
>
> On 22 Mar 2013, at 07:52, Juha Pajula <[log in to unmask]> wrote:
>
>> Hi!
>>
>> Our university set up a new computing cluster recently and in this new
>> system the parallel resource management is based on Slurm
>> (https://computing.llnl.gov/linux/slurm/)
>>
>> I am currently setting up FSL on the cluster and it now seems to work
>> fine on a single node. For the real analysis work I need the
>> parallel abilities of FSL and for this reason I have to modify
>> fsl_sub for the Slurm environment.
>>
>> Do you have any experience with modifying fsl_sub for Slurm? I didn't
>> find any notes about Slurm on the FSL webpage.
>>
>>
>> As a minor question: Does FSL support CUDA computations?
>>
>>
>>
>> --
>> Juha Pajula,
>> Researcher, Ph.D. Student,
>> Methods and Models for Biological Signals and Images group of Signal
>> Processing department in Tampere University of Technology,
>> Finland