Hi Lokke,

On 4 Oct 2007, at 22:11, Lokke Highstein wrote:

> one thing is that the paths to the data we are working on are  
> slightly different on the head nodes and the cluster nodes, due to  
> the head node having the data mounted off of an xraid (and showing  
> up in /Volumes, which i then made symbolic links to the /  
> directory) and the nodes mounting the same directories at /nfs/  
> with symbolic links to /

Could you confirm:

On the head node, /data_directory is a link to /Volumes/data_directory
On the compute nodes, /data_directory is a link to /nfs/data_directory
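
One quick way to check this is to print the link target on each machine. The helper below is only a sketch: show_link is a made-up name, and /data_directory stands in for whichever link you actually created.

```shell
# Hypothetical helper: report where a symbolic link points.
show_link() {
    if [ -L "$1" ]; then
        # readlink prints the target stored in the link itself
        printf '%s -> %s\n' "$1" "$(readlink "$1")"
    else
        printf '%s is not a symbolic link\n' "$1"
    fi
}

# Usage (run once on the head node and once on a compute node):
#   show_link /data_directory
```

If the two machines report /Volumes/data_directory and /nfs/data_directory respectively, the layout matches what I described above.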

> i have set up sge_aliases which are supposed to solve this, but  
> still it seems that when we submit a bedpost job through the GUI  
> (which forces the full path to be used - even if we type in the
> /data_directory path it then resolves the actual full path) it ends  
> up in an error state in the queue.

Could you send me the output of:
'cat ${SGE_ROOT}/default/common/sge_aliases'
Also an example of the full error message (try
'qstat -j <jobid> -xml | grep -i QIM_message')

Assuming the links are as above, then I think sge_aliases should
contain "/Volumes/data_directory * * /data_directory", however, we
re-engineered our setup to avoid these problems, i.e., our /Volumes/Data
is mounted as /Volumes/Data everywhere, so I'm not entirely certain
of this.
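
For reference, here is how that entry would sit in the file, with the four columns labelled. Treat it as an untested sketch: the paths are the placeholder names from this thread, and the comment line is only my annotation.

```text
# src-path                 sub-host   exec-host   replacement
/Volumes/data_directory    *          *           /data_directory
```

The two "*" columns mean the rule applies to any submit host and any execution host, so a job submitted with the /Volumes path should have its working directory rewritten to /data_directory, which resolves correctly on both the head node and the compute nodes.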

> when we submit the job via command line with the full path it  
> fails.  when we submit the job with the /data_directory path it works.
>
> i also want to run FEEDS on this to get more info, but it doesn't  
> seem to submit the job to the cluster and only runs on the head  
> node.  is there a way to submit the FEEDS job to the cluster?

The FEEDS script unsets SGE_ROOT at the start. Just comment this
line out and replace it with 'exec $FSLTCLSH "$0" "$@"'. It should
then run the SGE-aware subsections using available SGE slots. Note
that some scripts will then exit immediately, leaving just their
processing tasks on the queues. To work out how long everything takes
you'll need to manually add up all the runtimes (use qacct) and/or
observe the wall-clock time data.
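
Adding up the runtimes can be scripted roughly as follows. This is a sketch only: sum_wallclock is a made-up name, and it assumes the qacct output contains one "ru_wallclock <seconds>" line per finished job or task.

```shell
# Hypothetical helper: total the ru_wallclock values found on stdin.
sum_wallclock() {
    # Adding 0 forces numeric output even when no lines matched.
    awk '$1 == "ru_wallclock" { total += $2 } END { print total + 0 }'
}

# Usage (replace <jobid> with a real job id or job name):
#   qacct -j <jobid> | sum_wallclock
```

Bear in mind the total is summed runtime across all tasks, so when tasks ran in parallel it will be larger than the elapsed wall-clock time.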

--
Cheers, Dave

Dave Flitney, IT Manager
Oxford Centre for Functional MRI of the Brain
E:[log in to unmask] W:+44-1865-222713 F:+44-1865-222717
URL: http://www.fmrib.ox.ac.uk/~flitney