Hi - FEEDS turns off SGE so that the timings are valid for a single
processor. It's possible that you're running it on an SGE node, which
would mean that the node itself tries to submit the FEAT subjobs. That
probably doesn't work because the node isn't allowed to do SGE submission.
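
One way to test that guess is to see whether the machine is registered as an SGE submit host. The following is only a minimal sketch: it assumes the standard SGE `qconf` tool is on the PATH (`qconf -ss` prints the submit-host list, one name per line), and `host_in_list` is a hypothetical helper name, not part of SGE or FSL.

```shell
#!/bin/sh
# Sketch: check whether this machine is in SGE's submit-host list.
# host_in_list is a hypothetical helper, not part of SGE or FSL.

# Return 0 if host name $1 appears as a whole line on stdin.
host_in_list() {
    grep -Fxq -- "$1"
}

# Only query SGE when qconf is actually installed.
if command -v qconf >/dev/null 2>&1; then
    this_host=$(hostname)
    if qconf -ss | host_in_list "$this_host"; then
        echo "$this_host is listed as a submit host"
    else
        echo "$this_host is not listed as a submit host"
    fi
fi
```

The comparison is an exact string match, so a list that holds fully qualified names will not match a short name returned by `hostname`. If the host is missing, an SGE administrator can normally add it with `qconf -as <hostname>`.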
Cheers.
On 16 Nov 2007, at 22:14, G. Homola wrote:
>> My dirty hack would be to mess with SGE as such...
>>
>> $ cd /usr/sge
>> $ mv util/arch util/arch.old
>> $ echo "#! /bin/sh" > util/arch
>> $ echo "echo \"lx24-amd64\"" >> util/arch
>> $ chmod +x util/arch
>>
>> Not pretty, should work.
>
> Bypassing the arch script did help with the ./install_qmaster issue.
> Since fsl_sub demands it, I created four more queues (veryshort.q,
> short.q, long.q, verylong.q) and configured them like all.q.
> Everything seems to be fine; "qstat -f" shows them all.
>
> Running FEEDS goes well until FEAT needs parallelization:
>
> # (time -p fsl-selftest -c) > benchmark.log 2>&1
>
> Create temporary working directory
> Using /tmp/fsl-selftest.Ti5743 as working directory
>
> FSL Evaluation and Example Data Suite v4.0
>
> start time = Fri Nov 16 19:40:01 CET 2007
> hostname = amd64
> os = Linux amd64 2.6.22-14-generic #1 SMP Sun Oct 14 21:45:15 GMT 2007 x86_64 GNU/Linux
>
> /bin/rm -rf /tmp/fsl-selftest.Ti5743/results ; mkdir /tmp/fsl-selftest.Ti5743/results
>
> Starting PRELUDE & FUGUE at Fri Nov 16 19:40:01 CET 2007
> % error = 0.0
> % error = 0.0
>
> Starting SUSAN at Fri Nov 16 19:40:03 CET 2007
> % error = 0.03
>
> Starting SIENAX (including testing BET and FLIRT and FAST) at Fri Nov 16 19:42:29 CET 2007
> checking error on BET:
> % error = 0.0
> checking error on FLIRT:
> % error = 0.0
> checking error on FAST:
> checking error on single-image binary segmentation:
> % error = 0.0
> checking error on partial volume images:
> % error = 0.0
> % error = 0.0
> % error = 0.0
> checking error on SIENAX volume outputs:
> % error = 0.0
> % error = 0.0
> % error = 0.0
> % error = 0.0
> % error = 0.0
>
> Starting BET2 at Fri Nov 16 19:50:41 CET 2007
> checking error on T1 brain extraction:
> % error = 0.0
> checking error on skull and scalp surfaces:
> % error = 0.0
> % error = 0.0
> % error = 0.0
>
> Starting FEAT at Fri Nov 16 19:58:27 CET 2007
> checking error on filtered functional data:
> No output image created
> Warning - test failed!
> checking error on raw Z stat images:
> No output image created
> Warning - test failed!
> No output image created
> Warning - test failed!
> No output image created
> Warning - test failed!
> checking error on thresholded Z stat images:
> No output image created
> Warning - test failed!
> No output image created
> Warning - test failed!
> No output image created
> Warning - test failed!
> checking error on position of largest cluster of Talairached zfstat1:
> couldn't open "/tmp/fsl-selftest.Ti5743/results/fmri.feat/cluster_zfstat1_std.txt": no such file or directory
> while executing
> "open ${FEEDSDIR}/results/fmri.feat/cluster_zfstat1_std.txt r "
> invoked from within
> "if { $feeds(feat) } {
>
> puts "\nStarting FEAT at [ exec date ]"
>
> # fix FEAT setup file to use FEEDSDIR and FSLDIR
> fsl:exec "cp ${FEEDSDIR}/data/fmri.fe..."
> (file "./RUN" line 285)
> Remove temporary directory '/tmp/fsl-selftest.Ti5743'
> Done.
> real 1107.45
> user 1080.87
> sys 24.21
>
> My QMaster log looks like this:
>
> 11/16/2007 19:33:34|qmaster|amd64|I|read job database with 0 entries in 0 seconds
> 11/16/2007 19:33:34|qmaster|amd64|I|qmaster hard descriptor limit is set to 8192
> 11/16/2007 19:33:34|qmaster|amd64|I|qmaster soft descriptor limit is set to 8192
> 11/16/2007 19:33:34|qmaster|amd64|I|qmaster will use max. 8172 file descriptors for communication
> 11/16/2007 19:33:34|qmaster|amd64|I|qmaster will accept max. 99 dynamic event clients
> 11/16/2007 19:33:34|qmaster|amd64|I|starting up GE 6.1u2 (lx24-amd64)
> 11/16/2007 19:35:21|qmaster|amd64|E|commlib error: got read error (closing "amd64/qstat/2")
> 11/16/2007 19:58:38|qmaster|amd64|W|job 7.1 failed on host amd64 general changing into working directory because: 11/16/2007 19:58:37 [1000:7482]: error: can't chdir to /tmp/fsl-selftest.Ti5743/results/fmri.feat: No such file or directory
> 11/16/2007 19:58:38|qmaster|amd64|W|rescheduling job 7.1
>
> I guess the handover of instructions to SGE fails. How can I find out
> where and why it goes wrong?
>
> Many thanks,
> Georg
---------------------------------------------------------------------------
Stephen M. Smith, Professor of Biomedical Engineering
Associate Director, Oxford University FMRIB Centre
FMRIB, JR Hospital, Headington, Oxford OX3 9DU, UK
+44 (0) 1865 222726 (fax 222717)
http://www.fmrib.ox.ac.uk/~steve
---------------------------------------------------------------------------