Alex and Michael,

Thanks for your help and suggestions. 

The update of the FSL package seemed to do the trick. I updated today.

Also, I received the same error message discussed in this link:

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1301&L=fsl&P=R107838&1=fsl&9=A&J=on&d=No+Match%3BMatch%3BMatches&z=4

 Unable to run job: Colon (':') not allowed in objectname.

I followed the suggestion of commenting out the following line (line 373) in fsl_sub:

echo "default queue: \"$queue\"" 

Now, running tbss_2_reg and bedpostx works.
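
For anyone repeating this, here is a minimal sketch of that edit. It matches the echo line by its text rather than by line number, since the number may differ between package versions, and it keeps a backup copy. The helper name is made up for illustration, and the fsl_sub path in the example is an assumption based on where bedpostx is installed on this machine.

```shell
# Comment out the 'echo "default queue: ..."' line in a script, keeping a backup.
# comment_out_queue_echo is a hypothetical helper, not part of FSL.
comment_out_queue_echo() {
    script="$1"
    # Keep the original so the change can be undone.
    cp -p "$script" "$script.orig" || return 1
    # Prefix the matching echo line with '#', preserving its indentation.
    sed 's/^\([[:space:]]*\)\(echo "default queue: .*\)$/\1#\2/' \
        "$script.orig" > "$script"
}

# Example (path is an assumption; needs write access, e.g. via sudo):
# comment_out_queue_echo /usr/share/fsl/5.0/bin/fsl_sub
```

To undo the change, copy the .orig file back over the edited script.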

Thanks again,

D. Michael


From: Michael Hanke <[log in to unmask]>
To: [log in to unmask]
Sent: Sunday, January 20, 2013 7:32 AM
Subject: Re: [FSL] Parallel processing SGE and Ubuntu

Hey,
Note that I am currently building an FSL package update that addresses an issue that could be the cause of this behavior. The update should become available tomorrow.
Michael
On Jan 19, 2013 8:46 PM, "D D" <[log in to unmask]> wrote:
Alex,

Thanks. Okay, I tried your suggestions with tbss_2_reg and I get this error message:

Unable to read script file because of error: error opening dm@localhost: No such file or directory

I then tried to run bedpostx on a data set. 

I get the following lines:

switching from /data/transfer/preprocess/scrap to /data/transfer/preprocess
bedpostx /data/transfer/preprocess/scrap --nf=2 --fudge=1  --bi=1000
/usr/share/fsl/5.0/bin/bedpostx /data/transfer/preprocess/scrap --nf=2 --fudge=1  --bi=1000
subjectdir is /data/transfer/preprocess/scrap
Making bedpostx directory structure
Queuing preprocessing stages
Queuing parallel processing stage
Queuing post processing stage

Type /data/transfer/preprocess/scrap.bedpostX/monitor to show progress.
Type /data/transfer/preprocess/scrap.bedpostX/cancel to terminate all the queued tasks.

You will get an email at the end of the post-processing stage.


And then a pop-up window appears stating:

Errors: Unable to run job: At ('@')not allowed in objectname
Job was rejected because job requests  unknown queue "-M".
Exiting.
Errors: Unable to run job: At ('@')not allowed in objectname
Job was rejected because job requests  unknown queue "-M".
Exiting.
Errors: Unable to run job: At ('@')not allowed in objectname
Job was rejected because job requests  unknown queue "-M".
Exiting.


So, I'm getting there, but I'm still getting these error messages when trying to run a program that can use parallel processing.

D.Michael

From: "[log in to unmask]" <[log in to unmask]>
To: [log in to unmask]
Sent: Saturday, January 19, 2013 5:51 AM
Subject: Re: [FSL] Parallel processing SGE and Ubuntu

Hi D. Michael,

Thanks for the details on how you did it. I didn't know condor wasn't working on 10.04.

Sorry, I might be confusing you.
Some scripts in FSL already use fsl_sub to submit their processes to a queue, so you don't need to run them through fsl_sub yourself. That is not the case for tbss_1_preproc.
So, if you want one tbss_1_preproc execution to submit many processes to the queue, you would have to modify it to do so.

Looking at the TBSS manual, there is an indication that tbss_2_reg does use fsl_sub, so if you want to see your SGE working, you might want to finish tbss_1_preproc and then use tbss_2_reg.

In some cases, mostly when processing many images (in batch), you can execute each process with fsl_sub to use SGE, but you would have to program the loop or the batch script.
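
A minimal sketch of such a batch loop, assuming each image can be processed independently. The function name, the FSL_SUB override variable, and the example file names are hypothetical; bet is used only as an example of a per-image FSL command.

```shell
# Submit one job per image through fsl_sub.
# FSL_SUB can be overridden (e.g. FSL_SUB=echo) to preview the commands.
# submit_bet_jobs is a hypothetical helper, not part of FSL.
submit_bet_jobs() {
    runner="${FSL_SUB:-fsl_sub}"
    for img in "$@"; do
        # Derive the output name from the input name.
        out="${img%.nii.gz}_brain"
        "$runner" bet "$img" "$out"
    done
}

# Preview:  FSL_SUB=echo submit_bet_jobs subj01.nii.gz subj02.nii.gz
# Submit:   submit_bet_jobs subj01.nii.gz subj02.nii.gz
```

Each iteration becomes a separate job, so the queue can run several of them at the same time.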

I hope this helps.

Cheers,
Alex

Alexandre Manhães Savio <[log in to unmask]>
Grupo de Inteligencia Computacional
Departamento de CCIA
UPV/EHU


On 18 January 2013 20:59, D. Michael <[log in to unmask]> wrote:
Alex,

Thanks for the feedback.

I tried installing condor, but have been having issues with Ubuntu 10.04. Upgrading to 12.04 is something we'd like to put off until later.

I've been trying to use SGE as an alternative.

Thanks for the link below about SGE and FSL. I've seen some other recent material suggesting that I download sgengine and the associated applications directly, so that is what I did.

I also came across this link in the fsl forums:

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1102&L=FSL&D=0&1=FSL&9=A&J=on&d=No+Match%3BMatch%3BMatches&z=4&P=42344


I looked into how to configure it using the /usr/share/doc/gridengine-common/README.Debian readme file.

I've done the following steps:

 + sudo -u sgeadmin qconf -am myuser

 * and to a userlist:
   + qconf -au myuser users

 * Add a submission host:
   + qconf -as myhost.mydomain

 * Add an execution host:
   + qconf -ae
   You will now be prompted for information about the execution host.

 * Add a new host group:
   + qconf -ahgrp @allhosts

 * Add the exec host to the @allhosts list:
   + qconf -aattr hostgroup hostlist myhost.mydomain @allhosts

 * Add a queue:
   + qconf -aq main.q

 * Add the host group to the queue:
   + qconf -aattr queue hostlist @allhosts main.q

 * Make sure there is a slot allocated to the execd:
   + qconf -aattr queue slots "[myhost.mydomain=1]" main.q

 * Running qstat -f should then show you the execd waiting for jobs

I typed export FSLPARALLEL=1 in a shell and tried to run "fsl_sub tbss_1_preproc". I'm able to get a job list from qstat -f.
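
Since the export only applies to the shell it was typed in, a small preflight check along these lines can confirm the environment before submitting. This is only a sketch; the function name and its return codes are made up for illustration.

```shell
# Preflight check before submitting jobs from the current shell.
# check_parallel_env is a hypothetical helper, not part of FSL or SGE.
# Optional argument: the queue client to look for (defaults to qstat).
check_parallel_env() {
    client="${1:-qstat}"
    if [ -z "${FSLPARALLEL:-}" ]; then
        echo "FSLPARALLEL is not set in this shell" >&2
        return 1
    fi
    if ! command -v "$client" >/dev/null 2>&1; then
        echo "$client not found on PATH" >&2
        return 2
    fi
    echo "FSLPARALLEL=$FSLPARALLEL"
}

# Example: check_parallel_env && fsl_sub tbss_1_preproc
```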

We want to spread the job over a selected number of processors. When I run "top", my tbss_1_preproc command is still using one processor at 100%. With parallel processing, shouldn't it spread out to multiple CPUs? Given the link above, I tried changing

qconf -aattr queue slots "[myhost.mydomain=1]" --> "[myhost.mydomain=6]" in my configuration, to use processors 1 through 6.  Is this correct? Am I on the right track?
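
Written out as commands, the change I have in mind would look roughly like the sketch below. It assumes the main.q queue and myhost.mydomain host from the steps above, and it uses qconf -mattr (modify an attribute) instead of -aattr (add to an attribute) because the slots entry already exists; that choice is my reading of the qconf man page, not something I have confirmed. The function name and the QCONF override variable are made up for illustration.

```shell
# Change the slot count for one host in one queue, then show the queue.
# QCONF can be overridden (e.g. QCONF=echo) to preview without touching SGE.
# set_queue_slots is a hypothetical helper, not part of SGE.
set_queue_slots() {
    host="$1"; slots="$2"; queue="$3"
    runner="${QCONF:-qconf}"
    # Assumption: -mattr updates the existing per-host slots entry.
    "$runner" -mattr queue slots "[$host=$slots]" "$queue" || return 1
    # Print the resulting queue configuration for checking.
    "$runner" -sq "$queue"
}

# Preview: QCONF=echo set_queue_slots myhost.mydomain 6 main.q
```

As far as I can tell, the slot count is the number of jobs the host may run at once, not a range of processor numbers.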

Thanks,

D. Michael


On Fri, 18 Jan 2013 16:05:24 +0100, [log in to unmask] <[log in to unmask]> wrote:

>Hi D. Michael,
>
>1.
>I'm not sure whether you have to set the line as: export FSLPARALLEL=1
>Is your SGE installation working? I guess you would need to configure it
>first, both to run correctly on your own machine and to meet what FSL
>needs.
>Please, check this:
>https://www.fmrib.ox.ac.uk/phpwiki/index.php/FslSge
>
>2.
>To submit a job to an SGE queue you would have to run it with fsl_sub as:
>fsl_sub bedpost <bedpost_args>
>
>Please, check fsl_sub help for more arguments.
>
>3.
>If any of this doesn't work:
>If you are using the Neurodebian repository, I would recommend you to use
>condor, instead of SGE.
>In this mailing list there are some details on how to install and use it,
>or you might want to check this first:
>http://neuro.debian.net/blog/2012/2012-03-09_parallelize_fsl_with_condor.html
>
>In summary:
>sudo apt-get install condor. Choose automatic, personal installation (say
>yes to everything it asks during installation).
>
>Then set the line in fsl.sh to: export FSLPARALLEL=condor.
>
>If you have any other problem with this, please ask. I'm not sure if a
>change to fsl_sub is needed.
>
>I found the Neurodebian Condor configuration to be the easiest way to get
>a working execution queue engine.
>
>Cheers,
>Alex
>
>Alexandre Manhães Savio <[log in to unmask]>
>Grupo de Inteligencia Computacional <http://www.ehu.es/ccwintco>
>Departamento de CCIA
>UPV/EHU
>
>
>On 18 January 2013 12:29, D. Michael <[log in to unmask]> wrote:
>
>> Hello,
>>
>> I'm trying to set up parallel processing using FSL on my Ubuntu 10.04
>> operating system, and have been searching around to figure out how to set
>> this up.
>>
>> I've installed gridengine-master gridengine-exec gridengine-client
>> gridengine-qmon.
>>
>> I also uncommented in my /etc/fsl/5.0/fsl.sh file, the line
>> "FSLPARALLEL=1".
>>
>> Are there more steps involved?
>>
>> Say I want to run tbss_1_preproc or bedpost from the command line. What
>> would be the command line to run it in parallel? I have a single machine
>> with 16 processors available.
>>
>> Thanks,
>>
>> D. Michael
>>
>