On Thu, Oct 27, 2005 at 04:45:26PM +0200 or thereabouts, David Groep wrote:
> Hi Stijn,
>
> Stijn De Weirdt wrote:
> >thanks for the patch, but is this sufficient to make the job write its
> >output to that directory? and the cleaning afterwards?
> >and how do you make the $TMPDIR value unique? (and how do you actually
> >create this unique directory on the fly?)
>
> This patch goes together with the "transient-tmpdir-patch" for Torque
> (it is finally being integrated in the mainstream Torque as well, thanks
> to SteveT).
>
> The Torque RPMs from SteveT are documented at:
> http://www.gridpp.ac.uk/tb-support/faq/torque.html
> and set "$tmpdir" in the mom config to "/tmp" or another local disk.
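For reference, that setting is a single line in the pbs_mom config file; the path below is only an example (the mom_priv/config location varies per install):

```
# pbs_mom config file, e.g. /var/spool/pbs/mom_priv/config (path varies)
$tmpdir /tmp
```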
>
> Steve, others: is the transient-tmpdir patch also integrated in the default
> LCG Torque RPMs? (just not sure, I'm still running with my home-grown RPMs)
Yes, the patches are in both the RPMs that LCG provides and the newer ones
at http://hepunx.rl.ac.uk/~traylens/rpms/torque .../maui. The newer ones
have had less than a day of testing, so be warned.
I really must go through those FAQs, delete the old junk, and move the
valid material to better homes. For example, there is another page with
up-to-date information and pointers to the same package:
http://wiki.gridpp.ac.uk/wiki/Torque
Most of what is in the FAQ is now the default anyway, so it is no longer
needed.
Steve
>
> Cheers,
> DavidG.
>
> >
> >(i probably missed something...)
> >
> >stijn
> >
> >David Groep wrote:
> >
> >>Hi Stijn,
> >>
> >>Using the fact that TMPDIR points to a local (per-node) scratch
> >>directory,
> >>we have been applying a patch to the original pbs job manager to select,
> >>based on the #nodes requested by the job and the job type, the
> >>current working directory of the job. The patch is attached to this mail
> >>(it's just a few lines).
> >>
> >>What it does:
> >>* if the job is of type "mpi", or if the type is "multiple" and the
> >>  number of requested nodes > 1, the behaviour of the pbs job manager
> >>  is unaltered.
> >>* if the job type is "single", or the type is "multiple" and the
> >> job requests 0 or 1 nodes, the following statement is inserted
> >> in the PBS job script, just before the user job is started:
> >>
> >> [ x"$TMPDIR" != x"" ] && cd $TMPDIR
> >>
> >>This patch is applied to the template for the pbs.pm job manager script
> >>in /opt/globus/setup/globus/pbs.in, which then gets translated on
> >>startup into /opt/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm
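A minimal sketch of the guard in action; the mktemp directory here merely stands in for the per-node scratch area (in production, TMPDIR is set by the transient-tmpdir patch, not like this):

```shell
#!/bin/sh
# Stand-in for the per-node scratch dir; in production pbs_mom sets TMPDIR.
TMPDIR=$(mktemp -d)

# The guard from the patch: only cd when TMPDIR is set and non-empty.
[ x"$TMPDIR" != x"" ] && cd "$TMPDIR"

# The job now runs from local scratch instead of the shared home dir.
pwd
```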
> >>
> >>It has worked fine so far for all LCG jobs that, at NIKHEF, also go
> >>through the "old" pbs JM. The jobs don't notice the difference, and
> >>we can use shared home dirs for all VOs, provided we also have a
> >>per-node $TMPDIR location on local disk.
> >>
> >> Cheers,
> >> DavidG.
> >>
> >>Stijn De Weirdt wrote:
> >>
> >>>i am looking for a way to set the SCRATCH_DIRECTORY value for jobs
> >>>running on our cluster based on the vo.
> >>>(we have pbs queues with nfs mounted home directories for mpi, but we
> >>>also have jobs that might perform better when they can write directly
> >>>to local disks.)
> >>>
> >>>there's an option for the globus-job-manager called
> >>>-scratch-dir-base, but i don't know if that's the good way to change
> >>>this. (and i also don't know how to make it vo dependent)
> >>>
> >>>or if someone has successfully mixed pbs (nfs) and lcgpbs (local disk)
> >>>queues on their site, that could also be a solution.
> >>>
> >>>many thanks
> >>>
> >>>stijn
> >>
> >>
> >>
> >>
> >>The patch
> >>
> >>*** pbs.in.orig 2005-05-20 12:56:32.000000000 +0200
> >>--- pbs.in 2005-05-20 12:52:05.000000000 +0200
> >>***************
> >>*** 321,327 ****
> >> }
> >> print JOB "wait\n";
> >> }
> >>! elsif($description->jobtype() eq 'multiple')
> >> {
> >> my $count = $description->count;
> >> my $cmd_script_url ;
> >>--- 321,327 ----
> >> }
> >> print JOB "wait\n";
> >> }
> >>! elsif( ($description->jobtype() eq 'multiple') and ($description->count > 1 ) )
> >> {
> >> my $count = $description->count;
> >> my $cmd_script_url ;
> >>***************
> >>*** 374,379 ****
> >>--- 374,393 ----
> >> }
> >> else
> >> {
> >>+ # this is a simple single-node job that can use $TMPDIR
> >>+ # unless the user has given one explicitly
> >>+ # refer back to JobManager.pm, but currently it seems that
> >>+ # $self->make_scratchdir uses "gram_scratch_" as a component
> >>+ if ( ( $description->directory() =~ /.*gram_scratch_.*/ ) and
> >>+ ( $description->host_count() <= 1 ) and
> >>+ ( $description->count <= 1 )
> >>+ ) {
> >>+ print JOB '# user ended in a scratch directory, reset to TMPDIR'."\n";
> >>+ print JOB '[ x"$TMPDIR" != x"" ] && cd $TMPDIR'."\n";
> >>+ } else {
> >>+ print JOB '# user requested this specific directory'."\n";
> >>+ }
> >>+
> >> print JOB $description->executable(), " $args <",
> >> $description->stdin(), "\n";
> >> }
> >>
> >>
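The condition the patch tests can be mirrored in shell for clarity; the function name and sample paths below are invented for illustration, but the logic follows the Perl above (emit the cd only for effectively single-node jobs whose working directory is a GRAM-generated scratch dir):

```shell
#!/bin/sh
# emit_tmpdir_cd is a hypothetical helper mirroring the patch's Perl test:
# print "yes" when the cd-to-TMPDIR line would be emitted, "no" otherwise.
emit_tmpdir_cd() {
    jobdir=$1; host_count=$2; count=$3
    case "$jobdir" in
        *gram_scratch_*)
            if [ "$host_count" -le 1 ] && [ "$count" -le 1 ]; then
                echo yes
                return
            fi ;;
    esac
    echo no
}

emit_tmpdir_cd /home/user/.globus/gram_scratch_Xa91 1 1   # yes
emit_tmpdir_cd /home/user/rundir 1 1                      # no: user-chosen dir
emit_tmpdir_cd /home/user/.globus/gram_scratch_Xa91 2 2   # no: multi-node
```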
>
>
> --
> David Groep
>
> ** National Institute for Nuclear and High Energy Physics, PDP/Grid group **
> ** Room: H1.56 Phone: +31 20 5922179, PObox 41882, NL-1009DB Amsterdam NL **
--
Steve Traylen
[log in to unmask]
http://www.gridpp.ac.uk/