For anyone not reading the Rollout list ...
-----Original Message-----
From: LHC Computer Grid - Rollout
[mailto:[log in to unmask]] On Behalf Of Markus Schulz
Sent: 14 June 2006 13:36
To: [log in to unmask]
Subject: [LCG-ROLLOUT] Pilot jobs site survey
Dear Site and ROC managers,
Pilot jobs have been around for quite a while, and sites and users
have expressed various levels of
discomfort regarding accounting, traceability, and security with the
way these jobs are currently handled.
On the other hand they have proven to be an extremely useful concept.
In the TCG an active discussion is taking place that should lead to a
more satisfying solution.
To get a better understanding of the views of the sites, which are by
nature diverse, we would like to ask you to respond to the
following survey.
Assessment of the situation we are in:
==============================
Pilot jobs are sent to sites and start pulling other jobs from VO
operated queues when they run on the worker nodes. This has several
advantages for the VO. At any given time the VO knows how many job
slots are available to their users and they can manage priorities
within their VO independently of the sites.
This comes at a price. Either the pilots are submitted as one user,
so that all the jobs run on a site appear to be this user's jobs while
in fact other individuals' workloads are run, or, in a slight variation
of the model, pilots are submitted for several users using the correct
identity, which requires a very large number of pilot jobs, not all of
which will be used. While this does not waste CPU, it puts some load
on components such as the CE, RB, and the local batch systems.
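
As a rough illustration of the single-user model described above, here
is a minimal sketch of a pilot that pulls payloads from a VO-operated
queue. The names (Payload, run_pilot) and the example DNs are made up
for illustration and do not correspond to any real pilot framework. It
only shows why the site sees one identity for every workload.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Payload:
    owner_dn: str  # DN of the user who really owns the workload
    command: str   # what the payload would execute (hypothetical)


def run_pilot(pilot_dn, vo_queue, max_payloads):
    """Pull payloads from the VO queue and return what the site sees."""
    site_view = []
    for _ in range(max_payloads):
        if not vo_queue:
            break  # nothing left to pull; the pilot would simply exit
        payload = vo_queue.popleft()  # ordering is decided by the VO
        # The local batch system only knows the pilot's identity,
        # not payload.owner_dn.
        site_view.append((pilot_dn, payload.command))
    return site_view


queue = deque([
    Payload("/DC=example/CN=alice", "reco.sh"),
    Payload("/DC=example/CN=bob", "sim.sh"),
])
seen = run_pilot("/DC=example/CN=production", queue, max_payloads=5)
```

In this sketch both payloads show up under the production DN, which is
the traceability concern raised above.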
For us it is important to understand the following:
----------------------------------------------------------------------
1) As a site, do you see a problem with pilot jobs being submitted by
a production user and then running workloads for members of that VO,
as long as the VO accounting is correct?
2) As a site, is it sufficient for you that the pilot job framework
registers the DNs of the users for whom payloads are run with a service
on your site (this could be a log file, a service on the CE, etc.)?
This would mean that the unix user stays identical, but assuming
that you trust a VO's pilot framework you would be able to correlate
users' activities on your site with their DNs, including the option to
block individual users. Accounting would be by VO, but could
later be disentangled to the user level.
3) As a site, is it required that in addition to the scenario
described in 2) the local user id changes whenever the pilot
framework launches a program of another user? This would ease using
process logs, but still requires trust in the VO pilot.
4) As a site, would you have a problem achieving 3) with setuid or
sudo code being run on your WNs by the VO pilots?
5) As a site, do you worry, based on first principles of good system
administration, about pilot jobs and consider not accepting them?
(It would be helpful if you could briefly describe under which
conditions they would become acceptable.)
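
To make the scenario in question 2) more concrete, the following is a
minimal sketch of a site-side registry with which a pilot framework
could register the DN of each payload owner. The class and method names
are hypothetical, not an existing service or API. It assumes the site
trusts the pilot to report DNs honestly.

```python
import time


class PayloadRegistry:
    """Hypothetical site service recording which DN ran under which unix user."""

    def __init__(self):
        self.records = []     # (timestamp, unix_user, user_dn)
        self.blocked = set()  # DNs the site refuses to run

    def block(self, user_dn):
        self.blocked.add(user_dn)

    def register(self, unix_user, user_dn, timestamp=None):
        """Called by the pilot before it starts a payload.

        Returns False if the DN is blocked, so the pilot must not run it.
        """
        if user_dn in self.blocked:
            return False
        when = time.time() if timestamp is None else timestamp
        self.records.append((when, unix_user, user_dn))
        return True

    def usage_by_dn(self):
        """Disentangle VO-level usage into per-user payload counts."""
        counts = {}
        for _, _, user_dn in self.records:
            counts[user_dn] = counts.get(user_dn, 0) + 1
        return counts


registry = PayloadRegistry()
registry.block("/DC=example/CN=bob")
ok_alice = registry.register("vo_pilot", "/DC=example/CN=alice")
ok_bob = registry.register("vo_pilot", "/DC=example/CN=bob")
```

The unix user ("vo_pilot" here) stays the same for every payload; only
the registry ties activity back to individual DNs.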
Please send your answer to: [log in to unmask]
---------------------------------------
Thanks for your cooperation
Alessandra and Markus
p.s. please, please, please do not start another round of discussions
(before the survey is over)
p.p.s. DO NOT REPLY TO THE MAIL BUT SEND THE ANSWERS TO
ALESSANDRA!!!!!!!