> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Simon Fayer
> Sent: 17 March 2011 15:16
> To: [log in to unmask]
> Subject: glexec batch system interoperability
>
> Hi everyone,
>
> While doing some tests with the glexec "suexec" test program (
> http://www.nikhef.nl/grid/lcaslcmaps/glexec/osinterop ) I've noticed that
> it provokes some strange behaviour with SGE... Normally after a job
> terminates, all child processes are also killed (no matter how much a user
> tries to disown them). When using suexec, SGE seems to fail to kill the
> child process, leaving the process running on the node indefinitely.
>
I have no direct SGE experience at all. However, according to
http://www.sysadmin.hep.ac.uk/wiki/ProcessesOnBatchNodes
if you're using the ENABLE_ADDGRP_KILL parameter, SGE adds a per-job
supplementary group ID to keep track of even daemonised child processes,
so anything that doesn't preserve that group ID (like glexec-ing) would
defeat it. You could probably get the same effect without glexec by
having your test script explicitly drop any supplementary group IDs
before forking.
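
Something along these lines might do as a test, though it's only a rough
sketch and I haven't tried it under SGE. The function names are made up
for illustration. One caveat: os.setgroups() normally needs root
(CAP_SETGID on Linux), so run as an ordinary user the call fails and the
child just exits with status 1. That is also why a setuid tool like
glexec can lose the group when a plain user script cannot.

```python
import os
import time


def drop_supplementary_groups():
    """Try to clear all supplementary group IDs.

    Returns True on success, False if the process lacks the privilege.
    """
    try:
        os.setgroups([])
        return True
    except PermissionError:
        return False


def spawn_child_without_groups(seconds):
    """Fork a child that drops its supplementary groups and then sleeps.

    Hypothetical helper: the child exits with status 1 if it could not
    drop the groups, otherwise it sleeps for `seconds` and exits with 0.
    Returns the child's PID to the parent.
    """
    pid = os.fork()
    if pid == 0:
        # Child: lose the per-job tracking group, if we are allowed to.
        if not drop_supplementary_groups():
            os._exit(1)
        os.setsid()  # detach from the job's session as well
        time.sleep(seconds)
        os._exit(0)
    return pid
```

If the child survives the end of the job when run with enough privilege
to drop the groups, that would point at the supplementary group ID being
what SGE relies on to find it.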
Ewan