I just completed a port of fsl_sub for XGrid. Make sure you back up your current fsl_sub
before using this. This fsl_sub won't work locally, nor will it work with SGE. I simply
don't have the time to implement all three and make all the different options work
without bugs. You could rename this to fsl_sub_xgrid and symlink it, or manage it
however you want.
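
One way to do the backup-and-symlink step is sketched below. This is only an illustration: the function name install_xgrid_fsl_sub, the fsl_sub.orig backup name and the assumption that fsl_sub lives in a single bin directory (typically $FSLDIR/bin) are mine, not part of the script.

```shell
#!/bin/sh
# Hypothetical helper: keep the stock fsl_sub as a backup and point
# fsl_sub at the XGrid port through a symlink.
#   $1 = directory holding fsl_sub (assumed to be "$FSLDIR/bin")
#   $2 = path to the XGrid version of the script
install_xgrid_fsl_sub() {
    bin_dir="$1"
    xgrid_script="$2"

    # Back up the original once; never overwrite an existing backup.
    if [ -f "$bin_dir/fsl_sub" ] && [ ! -L "$bin_dir/fsl_sub" ] \
        && [ ! -e "$bin_dir/fsl_sub.orig" ]; then
        mv "$bin_dir/fsl_sub" "$bin_dir/fsl_sub.orig" || return 1
    fi

    # Install the port under its own name.
    cp "$xgrid_script" "$bin_dir/fsl_sub_xgrid" || return 1
    chmod +x "$bin_dir/fsl_sub_xgrid" || return 1

    # Relative symlink, so it stays valid if the directory is mounted elsewhere.
    ln -sf fsl_sub_xgrid "$bin_dir/fsl_sub"
}
```

To go back to the stock behaviour, remove the symlink and move fsl_sub.orig back to fsl_sub.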
You'll also notice that I use Kerberos for submitting jobs. This is done so that the job
doesn't run as nobody but as the user who submits the job (there are other ways to do
this; read the XGrid manual for details on password-based authentication and ssh). This
requires OpenDirectory and Kerberos to be set up and working, and the user might have
to re-authenticate with kinit (which gives you a Kerberos ticket) before submitting
jobs.
In my setup, this script runs across 38 processors for a combined 145.24 GHz of
processing power. Job submission works and job waiting works. You can implement mail
as well (I had to disable it because it sends a lot of e-mails to the submitter).
The major hurdle this modified script overcomes is the FSL environment variables. In SGE
you can specify environment variables to be loaded, while XGrid doesn't let you do that
(easily). I have solved it by using a temporary wrapper script that loads the FSL
variable script from a central location, and submitting that wrapper script to the
cluster. XGrid by default doesn't allow /etc and /var to be accessed by the XGrid Agent
(for obvious reasons), and if you're not using Kerberos, the job will run as "nobody",
which can't access NFS (in my environment), so make sure you adapt your configuration
accordingly if you decide to use it.
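
To give an idea of the wrapper approach, here is a minimal sketch. It is not the code from the script itself: the function name make_wrapper, the FSL_ENV_SCRIPT variable and its default path are placeholders I made up for illustration, and the wrapper directory is assumed to be somewhere the agents can read (a shared mount, for example).

```shell
#!/bin/sh
# Assumed location of the FSL variable script; point this at your own
# central copy. The default path here is only a placeholder.
FSL_ENV_SCRIPT="${FSL_ENV_SCRIPT:-/shared/fsl/etc/fslconf/fsl.sh}"

# Hypothetical helper: write a temporary wrapper that sources the FSL
# environment and then runs the real command.
#   $1      = directory to create the wrapper in
#   $2 ...  = command and its arguments
# Prints the path of the generated wrapper.
make_wrapper() {
    wrapper_dir="$1"
    shift
    wrapper="$(mktemp "$wrapper_dir/fsl_wrapper.XXXXXX")" || return 1
    {
        echo '#!/bin/sh'
        # Load the FSL variables on the agent before anything else runs.
        echo ". \"$FSL_ENV_SCRIPT\""
        printf 'exec'
        for arg in "$@"; do
            # Single-quote each argument, escaping embedded single quotes.
            printf " '%s'" "$(printf '%s' "$arg" | sed "s/'/'\\\\''/g")"
        done
        printf '\n'
    } > "$wrapper"
    chmod +x "$wrapper" || return 1
    echo "$wrapper"
}
```

The generated wrapper is what gets handed to the cluster instead of the original command, with something along the lines of xgrid -h <controller> -auth Kerberos -job submit "$wrapper" (check man xgrid for the exact options on your system), and it can be deleted once the job has finished.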
As always, use this script at your own risk. I cannot guarantee that it works at all,
nor that it works well. It might not fit your environment, and it might give wrong
results. Please let me know if you find any bugs or have any extensions.