hi davis,
thanks for the patch, but is this sufficient to make the job write its
output to that directory? and does the cleanup happen afterwards?
and how do you make the $TMPDIR value unique? (and how do you actually
create this unique directory on the fly?)
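(do you create and remove it with a pbs prologue/epilogue pair? i am
guessing at something like the sketch below -- the /scratch path and
the scripts themselves are pure speculation on my part:

   #!/bin/sh
   # prologue: pbs passes the job id as $1 and the user as $2
   mkdir -p /scratch/$1 && chown $2 /scratch/$1

   #!/bin/sh
   # epilogue: remove the per-job scratch dir again
   rm -rf /scratch/$1

and if so, how does $TMPDIR get pointed at that directory for the job?)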
(i probably missed something...)
stijn
David Groep wrote:
> Hi Stijn,
>
> Using the fact that TMPDIR points to a local (per-node) scratch directory,
> we have been applying a patch to the original pbs job manager that selects
> the working directory of the job based on the number of nodes requested
> and the job type. The patch is attached to this mail (it's just a few
> lines).
>
> What it does:
> * if the job is of type "mpi", or if the type is "multiple" and the
> number of requested nodes > 1, the behaviour of the pbs job manager
> is unaltered.
> * if the job type is "single", or the type is "multiple" and the
> job requests 0 or 1 nodes, the following statement is inserted
> in the PBS job script, just before the user job is started:
>
> [ x"$TMPDIR" != x"" ] && cd $TMPDIR
>
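> For a single-node job the tail of the generated PBS script then ends
> up looking roughly like this (a sketch; the executable, argument and
> stdin values are made up):
>
>   # user ended in a scratch directory, reset to TMPDIR
>   [ x"$TMPDIR" != x"" ] && cd $TMPDIR
>   /home/someuser/myjob arg1 < /dev/null
>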
> This patch is applied to the template for the pbs.pm job manager script
> in /opt/globus/setup/globus/pbs.in, which is then translated into
> /opt/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm on startup.
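>
> Applying it boils down to something like this (a sketch; the patch
> file name is made up and the paths are from our own install):
>
>   cd /opt/globus/setup/globus
>   patch pbs.in < pbs-tmpdir.patch
>
> If the generated pbs.pm does not get refreshed on its own, re-running
> the job manager setup script (setup-globus-job-manager-pbs, if I
> remember the name correctly) should regenerate it.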
>
> So far it has worked fine for all LCG jobs at NIKHEF that also go
> through the "old" pbs JM. The jobs don't notice the difference, and we
> can use shared home dirs for all VOs, provided we also have a per-node
> $TMPDIR location on local disk.
>
> Cheers,
> DavidG.
>
> Stijn De Weirdt wrote:
>
>> i am looking for a way to set the SCRATCH_DIRECTORY value for jobs
>> running on our cluster based on the vo.
>> (we have pbs queues with nfs mounted home directories for mpi, but we
>> also have jobs that might perform better when they can write directly
>> to local disks.)
>>
>> there's an option for the globus-job-manager called -scratch-dir-base,
>> but i don't know if that's the right way to change this. (and i also
>> don't know how to make it vo dependent)
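>>
>> (i assume it would go into something like
>> /opt/globus/etc/globus-job-manager.conf as
>>
>>    -scratch-dir-base /local/scratch
>>
>> but that gives one value for everybody, not one per vo.)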
>>
>> or if someone has successfully mixed pbs (nfs) and lcgpbs (local disk)
>> queues on their site, that could also be a solution.
>>
>> many thanks
>>
>> stijn
>
>
>
> The patch
>
> *** pbs.in.orig 2005-05-20 12:56:32.000000000 +0200
> --- pbs.in 2005-05-20 12:52:05.000000000 +0200
> ***************
> *** 321,327 ****
> }
> print JOB "wait\n";
> }
> ! elsif($description->jobtype() eq 'multiple')
> {
> my $count = $description->count;
> my $cmd_script_url ;
> --- 321,327 ----
> }
> print JOB "wait\n";
> }
> ! elsif( ($description->jobtype() eq 'multiple') and
> ($description->count > 1 ) )
> {
> my $count = $description->count;
> my $cmd_script_url ;
> ***************
> *** 374,379 ****
> --- 374,393 ----
> }
> else
> {
> + # this is a simple single-node job that can use $TMPDIR
> + # unless the user has given one explicitly
> + # refer back to JobManager.pm, but currently it seems that
> + # $self->make_scratchdir uses "gram_scratch_" as a component
> + if ( ( $description->directory() =~ /.*gram_scratch_.*/ ) and
> + ( $description->host_count() <= 1 ) and
> + ( $description->count <= 1 )
> + ) {
> + print JOB '# user ended in a scratch directory, reset to TMPDIR'."\n";
> + print JOB '[ x"$TMPDIR" != x"" ] && cd $TMPDIR'."\n";
> + } else {
> + print JOB '# user requested this specific directory'."\n";
> + }
> +
> print JOB $description->executable(), " $args <",
> $description->stdin(), "\n";
> }
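>
> To verify the effect, something like this should print a path on the
> node's local disk rather than the shared home directory (replace
> <ce-host> with your gatekeeper host):
>
>   globus-job-run <ce-host>/jobmanager-pbs /bin/pwd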