Hello,
I installed the scripts, but they don't seem to help (I submit the jobs
using the MPICH jobtype). I added some lines to the pbs.pm script, and
the directory looks OK (/home/betest001//gram_scratch_VThCRN5Ajx). I saw
that the PBS jobtype used for this kind of job is 'single', so it
starts a single executable with a number of CPUs reserved for it.
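To illustrate what I mean by 'single', here is a minimal sketch using the
Globus::GRAM::JobDescription accessors (the exact logic in the stock pbs.pm
may differ; this is just the shape of the check):

my $description = $self->{JobDescription};
my $jobtype = $description->jobtype();      # 'single' for these MPI jobs
my $count   = $description->count() || 1;   # number of CPUs requested

if ($jobtype eq 'single' && $count > 1) {
    # one executable, started once on the first node, with $count
    # CPUs reserved for it by the PBS nodes request
}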
If this is correct, then the jobwrapper created at the resource broker
is run on the first node. This jobwrapper has some lines like the
following included (almost at the start of the script, after the
function definitions):
if [ ! -z "$EDG_WL_SCRATCH" ]; then
    cd $EDG_WL_SCRATCH
    cleanupDir
fi

newdir="https_3a_2f_2fgridrb.atlantis.ugent.be_3a9000_2frpFW4r8NnYsI4VjjIG41NQ"
mkdir -p ".mpi/"${newdir}
So the directory my MPI job ends up in is not ~/... but /scratch/.mpi/...
Regards,
Stijn
Antun Balaz wrote:
> Hi Stijn,
>
> You need to replace /opt/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm with
> the attached pbs.pm (of course, do the diff first!). To avoid reverting to
> the old version of pbs.pm after a CE reconfiguration, you may also want to
> replace /opt/globus/setup/globus/pbs.in with the attached one. This pbs.pm
> will put all jobs that use no more than 1 CPU into $EDG_WL_SCRATCH, so be
> sure to define it on all WNs! MPI jobs will stay in the home directories of
> the pool accounts.
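> To make the intended behaviour concrete, the decision looks roughly like
> this (a sketch of the intent, not the attached pbs.pm verbatim; variable
> names are illustrative):
>
> my $count = $description->count() || 1;
> my $workdir;
> if ($count > 1) {
>     # MPI / multi-CPU job: stay in the pool account's shared home
>     $workdir = $description->directory();
> } else {
>     # serial job: node-local scratch, if the WN defines it
>     $workdir = '${EDG_WL_SCRATCH:-' . $description->directory() . '}';
> }
> # $workdir is then emitted as the "cd ..." line of the generated
> # PBS submit script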
>
> I would also suggest that you replace
> /opt/globus/lib/perl/Globus/GRAM/JobManager.pm with the attached
> JobManager.pm, so that you avoid possible problems with proxy renewal (if
> /home is shared over NFS, it sometimes happens that the renewed proxy,
> although available, is not seen on the WNs).
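> For what it's worth, the proxy fix boils down to something like the
> following (again a sketch under my assumptions, not the attached
> JobManager.pm verbatim):
>
> use File::Basename qw(dirname);
>
> # Assumption: re-open the proxy's directory and re-stat the file so
> # NFS close-to-open consistency picks up the renewed copy instead of
> # a stale attribute-cache entry.
> my $proxy = $ENV{X509_USER_PROXY};
> if (defined $proxy && $proxy ne '') {
>     if (opendir(my $dh, dirname($proxy))) {
>         closedir($dh);
>     }
>     stat($proxy);   # force a fresh lookup of the proxy file
> }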
>
> Hope this helps,
> Antun
>
>
> -----
> Antun Balaz
> Research Assistant
> E-mail: [log in to unmask]
> Web: http://scl.phy.bg.ac.yu/
>
> Phone: +381 11 3160260, Ext. 152
> Fax: +381 11 3162190
>
> Scientific Computing Laboratory
> Institute of Physics, Belgrade, Serbia
> -----
>
> ---------- Original Message -----------
> From: Stijn De Smet <[log in to unmask]>
> To: [log in to unmask]
> Sent: Fri, 20 Apr 2007 08:45:58 +0200
> Subject: [LCG-ROLLOUT] MPI and EDG_WL_SCRATCH
>
>> Hello,
>>
>> I recently configured MPI support on my nodes, but when I try to use
>> it, it always fails because even MPI jobs get started in the
>> EDG_WL_SCRATCH directory, which isn't shared, while my home dirs are.
>> My nodes are configured using YAIM, but for the moment I don't use
>> the YAIM MPI configuration. Is there an easy solution for disabling
>> the SCRATCH directory for MPI jobs, or do I just have to disable
>> scratch space completely?
>>
>> Regards,
>> Stijn
> ------- End of Original Message -------