I have FSL set up on a mixed-architecture (OS X and Linux) SGE cluster. We've pushed a good number of jobs through the cluster (e.g., with bedpostx) but are having issues with MELODIC. If the submit host and the execution host are the same architecture, life is good. If they differ, though, we hit issues.
For example, if we submit from an OS X host (running 5.0.4) we'll get a log file like this:
cat feat1a_init.e10855
/usr/local/fsl/bin/feat: 74: exec: /usr/local/fsl/bin/fsltclsh: not found
Now, that job ran on a Linux execution host and, in the Linux FSL install, $FSLTCLSH isn't /usr/local/fsl/bin/fsltclsh but rather just /usr/bin/tclsh. The problem seems to be tied to feat somehow getting its environment variables from the submission host rather than from the execution host. Simply symlinking /usr/local/fsl/bin/fsltclsh to /usr/bin/tclsh brought up another error (a library not being found; again, likely due to the environment variable differences).
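To make the suspected failure mode concrete, here is a rough sketch of a check that could run on the execution host. It assumes the job inherits the submit host's environment (qsub's -V option does this), so $FSLTCLSH may point at a path that only exists on the other platform. The function name resolve_tclsh is made up for illustration and is not part of FSL.

```shell
# Hypothetical helper, not part of FSL: print a usable interpreter path.
#   $1 = path inherited from the submit host (e.g. "$FSLTCLSH")
#   $2 = command name to look up on the execution host (default: tclsh)
resolve_tclsh() {
    inherited="$1"
    fallback="${2:-tclsh}"
    if [ -n "$inherited" ] && [ -x "$inherited" ]; then
        # The inherited path exists and is executable here, so keep it.
        printf '%s\n' "$inherited"
    elif command -v "$fallback" >/dev/null 2>&1; then
        # Otherwise fall back to whatever the execution host has on PATH.
        command -v "$fallback"
    else
        echo "no usable interpreter: '$inherited' missing, '$fallback' not on PATH" >&2
        return 1
    fi
}

# Possible use at the top of a job script (assumption, untested with feat):
#   FSLTCLSH=$(resolve_tclsh "$FSLTCLSH") && export FSLTCLSH
```

This only addresses the interpreter path. The follow-on library error suggests other inherited variables would need the same treatment, so it is a diagnostic aid rather than a fix.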
Again, though, submitting the same script from a Linux host works just fine. Also, if I alter fsl_sub on the OS X machines to pass "-l arch=darwin-x86" into the qsub call and submit from a Mac, it runs just fine as well. So, it's not that either platform can't run MELODIC via SGE. It just can't run if the submit host's platform (or perhaps its install details) doesn't match the execution host's.
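The arch-pinning workaround can be sketched roughly as follows. This is an illustration, not the actual fsl_sub edit: the function name arch_request and the SGE_LINUX_ARCH variable are invented here, and the Linux arch string (lx26-amd64) is an assumption that may differ on a given cluster (the ARCH column of qhost shows the real values).

```shell
# Hypothetical helper: print a qsub resource request that pins the job
# to execution hosts matching the submit host's platform.
#   $1 = OS name (defaults to the output of `uname -s`)
arch_request() {
    os="${1:-$(uname -s)}"
    case "$os" in
        Darwin)
            # Value taken from the workaround described above.
            printf '%s\n' "-l arch=darwin-x86"
            ;;
        Linux)
            # Assumed default; override with SGE_LINUX_ARCH if it differs.
            printf '%s\n' "-l arch=${SGE_LINUX_ARCH:-lx26-amd64}"
            ;;
        *)
            echo "unrecognised platform: $os" >&2
            return 1
            ;;
    esac
}

# Possible use (assumption): qsub $(arch_request) my_job.sh
```

Pinning like this sidesteps the mismatch but also means Mac submissions can never use the Linux nodes, which is why it feels like a workaround rather than a solution.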
Any ideas on how to proceed?
Craig