Hi,
On Fri, May 13, 2011 at 10:24:13AM -0700, Chuck Theobald wrote:
> Speaking of alternate grid technology, is there a plan to
> incorporate a Torque option in fsl_sub for the official release of
> FSL? I know that this had been added in the past on this list, but
> this option did not make it into the NeuroDebian repository or the
> FMRIB download. Given the fate of SGE, I think supporting Torque
> would be a good move.
I started investigating another option. IMHO it would make sense to have
FSL use an abstraction layer for parallel workflows, instead of tying it
to a particular system. One thing that complicates the matter is that
FSL needs to submit jobs with inter-dependencies.
I recently packaged cctools (http://nd.edu/~ccl/software/) for Debian
that offers "Makeflow". This is a tool that allows you to write a
simplified Makefile for your jobs and specify job and data dependencies
in it -- no need to wait for SGE to return a job id to be able to
specify a dependency. The best thing is that Makeflow can already submit
these workflows to SGE and Condor, but it can also parallelize them
locally WITHOUT any batch system -- just like Make.
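To illustrate the idea, here is a minimal sketch in Python of how
Make-style rules with data dependencies determine what can run in
parallel. This is not Makeflow's implementation or its file syntax; the
rule list, file names, and commands are hypothetical placeholders, and
nothing is actually executed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules in Make style: (output files, input files, command).
# The commands are placeholder strings; this sketch never runs them.
RULES = [
    (["brain.img"], ["struct.img"], "extract_brain struct.img brain.img"),
    (["func_mc.img"], ["func.img"], "motion_correct func.img func_mc.img"),
    (["func2struct.mat"], ["func_mc.img", "brain.img"],
     "register func_mc.img brain.img func2struct.mat"),
]


def parallel_batches(rules):
    """Group commands into batches; each batch only needs earlier batches."""
    # Map every output file to the index of the rule that produces it.
    producer = {out: i for i, (outs, _, _) in enumerate(rules) for out in outs}
    # A rule depends on whichever rules produce its input files.
    # Inputs that no rule produces are assumed to exist already.
    graph = {
        i: {producer[src] for src in srcs if src in producer}
        for i, (_, srcs, _) in enumerate(rules)
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # rules whose inputs are all available
        batches.append([rules[i][2] for i in ready])
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(RULES), start=1):
        print(f"batch {number}: {batch}")
```

Because the dependencies follow from the file names alone, the whole
graph is known before anything is submitted, so there is no job id to
wait for.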
I haven't had the time for a detailed look at how easy it would be to link
it into FSL, but I didn't see a showstopper at first glance.
Enhancing Makeflow to support Torque and other qsub-likes should be
fairly easy.
It might be worth mentioning that a system that knows about job AND data
dependencies is quite useful. One could use it to dispatch jobs on
systems where a particular target software or dataset is not available.
It can grab binaries and data and send them to the compute machine -- that
is essentially fine-grained cloud computing.
Michael
PS: Makeflow is available in the coop-computing-tools package in
NeuroDebian and Debian sid right now. And it has other cool tools as
well.
--
Michael Hanke
http://mih.voxindeserto.de