Hi John,
It's not just an issue for SGE. We use Torque, but have it create a temporary directory for each job, which it then starts the job in. That directory would be owned by the pilot user.
We don't have NFS home areas, but the home disk on our worker nodes is small compared to the scratch area jobs usually start in.
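For what it's worth, the workaround Simon mentions further down the thread (the payload making its own working directory once it is running as the target user) could be sketched roughly like this. It is only an illustration under assumptions: the SCRATCH_BASE variable and the make_payload_workdir name are made up for the example and are not something glexec, Torque or SGE provides, and the real scratch location is site-specific.

```python
import os
import tempfile


def make_payload_workdir(scratch_base=None):
    """Create a private working directory owned by the current user.

    scratch_base is a hypothetical, site-specific scratch location. If it is
    not given, the (assumed) SCRATCH_BASE environment variable is tried, then
    the first writable system temporary directory.
    """
    base = scratch_base or os.environ.get("SCRATCH_BASE") or tempfile.gettempdir()

    # mkdtemp creates the directory with mode 0700, owned by whoever runs
    # this code, so after the identity switch it belongs to the target user.
    workdir = tempfile.mkdtemp(prefix="payload-", dir=base)

    # Point the usual temp variables and the cwd at the new directory so the
    # payload stops using the pilot-owned job directory.
    os.environ["TMPDIR"] = workdir
    os.environ["TMP"] = workdir
    os.chdir(workdir)
    return workdir
```

The payload would call make_payload_workdir() as its first step after the switch and would also have to remove the directory itself when it finishes, since the batch system's own cleanup only knows about the directory it created for the pilot.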
Yours,
Chris
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of John Gordon
> Sent: 05 April 2011 06:31
> To: [log in to unmask]
> Subject: Re: glexec working directories
>
> Simon/Andy/anyone, could I please have a slide(s) summarising the SGE
> glexec issues for the GDB on Wednesday? I'll put it on the agenda
> during the MUPJ discussion and I or the UK T2 person attending can
> speak to it.
>
> Thanks,
>
> John
>
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Andrew Washbrook
> Sent: 04 April 2011 11:27
> To: [log in to unmask]
> Subject: Re: glexec working directories
>
> Hi Simon,
>
> On 1 Apr 2011, at 11:48, Simon Fayer wrote:
>
> > Hi all,
> >
> > I've been doing further tests with glexec and found another potential
> > problem... When a job arrives, SGE makes a directory for it on the
> > local disk, which is set as $TMP and the cwd at job-startup. After a
> > glexec switch, the job can no longer write into this directory (as
> > it's owned by the pilot account) and cwd is set to the target user's
> > home directory (which in our case is on NFS and not large enough to
> > stage data files to).
> >
>
> Glad you performed this test. We have exactly the same setup and I
> thought this may be a potential issue from the outset.
>
> > A job could potentially make its own working directory in the right
> > place. Ideally this would be called in some kind of prologue script
> > or similar immediately after a glexec switch, although I don't think
> > a feature to do this exists.
> >
>
> This could be a workaround; however, it is one we could not use. We do
> not have access to the SGE prolog and epilog scripts, and after
> discussions with our system admins it was considered that putting
> grid-only hacks/tweaks on a shared cluster queue was a bad thing to do.
> We can patch around this for CREAM CE requirements, but for glexec this
> would be very hard to implement.
>
> Thanks,
> Andy.
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.