Hi,
Here are a few of the reasons why we are unlikely to support glexec.
I'll try not to mention the security problems, since I know the response
by now: "we'll have a security review".
* Since suid programs ignore LD_LIBRARY_PATH, tarball installations
are now useless. We really do not want to go back to RPMs, and in some
clusters it is impossible anyway.
* One of the reasons that we run SGE is that the batch system will
automatically kill any leftover processes after a job finishes.
Obviously, with glexec in the picture changing uids, this functionality
is lost. Not nice.
* Our batch system creates a temporary directory for each job (and deletes
it automatically after the job finishes) in a scratch area. Obviously,
since the pilot jobs will be running as a different user, no scratch area
will be available to them.
* In our cluster the grid environment is loaded by the batch system for grid
jobs only (and we aren't alone there), since we do not want to pollute
normal jobs with the LCG environment, and in some clusters it is impossible.
This means that glexec either has to allow the driver job to pass environment
variables to the pilot job (with the obvious security implications) or that
the job will have no grid environment available.
* How on earth can we stop a user submitting a job to the CE as a member of
VO1 and then using glexec to run the job (still using his own certificate)
as a member of VO2 (assuming he is a member of both)? Who is billed in
this case, VO1 or VO2? Was the priority used by the batch system the
right one for the job that ran?
* What are the memory/cpu requirements that the pilot job "driver" is
using? Now that we are finally about to get the job requirements sent to
the CE, we are moving to a model that ends up with each job asking for the
maximum. One step forward, two steps back.
* Since the pool account mapping is now done at the WN level, you'll
have to share /etc/grid-security/gridmapdir at least across all WNs
(exported writable and with no_root_squash). Has anyone thought
about how that affects security? Of course not.
* I am still waiting to find out what the magic incantation to send the
proxies to the pilot job "driver" securely is going to be. If you have
DNs A, B, C running pilot jobs for X, Y, Z and a job running under DN Z is
caught doing something illegal, how on earth are you going to prove that
it wasn't A, B or C that stole the proxy?
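
To make the VO1/VO2 question above concrete, here is a minimal Python
sketch of the mismatch. The record types, field names and the example DN
are all made up for illustration; nothing here is a real glexec, CE or
batch system interface. It only shows that the VO seen at submission time
and the VO presented on the WN are two independent values that nothing
forces to agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """What the CE and batch system saw when the job was submitted (hypothetical record)."""
    dn: str
    vo_at_ce: str


@dataclass(frozen=True)
class Payload:
    """What was presented on the worker node when the payload was started (hypothetical record)."""
    dn: str
    vo_on_wn: str


def accounting_mismatch(sub: Submission, payload: Payload) -> bool:
    """True when the same DN was scheduled under one VO but ran its payload under another."""
    return sub.dn == payload.dn and sub.vo_at_ce != payload.vo_on_wn


# Same certificate, member of both VOs: scheduled and billed as VO1, runs as VO2.
sub = Submission(dn="/DC=example/CN=Some User", vo_at_ce="VO1")
payload = Payload(dn="/DC=example/CN=Some User", vo_on_wn="VO2")
print(accounting_mismatch(sub, payload))
```

The batch system only ever sees the first record, so priority and billing
follow vo_at_ce regardless of what runs afterwards.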
Cheers,
Kostas