> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Daniela Bauer
> Sent: 06 September 2011 10:30
> To: [log in to unmask]
> Subject: Re: GridPP operations meeting at 11am today
>
>
> I've attached the file. It basically contains a statement that Atlas does
> not want to use gLexec plus their reasoning.
>
Well, we do want them to use it, and here's my reasoning:
- Contrary to what ATLAS claim, the panda mechanism does not
provide useful traceability, as we recently found in an incident
at Oxford in which a set of badly written user analysis jobs
filled a worker node filesystem. The resulting files were owned
by the generic pilot account, with no straightforward means to
tie them to a particular user. Furthermore, even when we've
managed to trace trouble to a specific batch job, PANDA has
given us mappings to several different users' analysis payloads.
To find out who's responsible for any misbehavior requires an
essentially statistical approach of querying a lot of suspect
jobs and seeing whose name comes up most often.
- Panda does not provide useful user banning. In the above incident
I could have recovered the site by banning the badly behaved user, but
given that site admins don't have access to panda's user banning
feature, and given that ATLAS have proven unwilling to use it on
a site's behalf in the past, we have no way to deal with such
problems. The ATLAS paper says:
"Any Grid site should ban the ATLAS pilot DN or if necessary
the entire ATLAS VO in case there is a suspicion of compromised
credentials or illegal usage of resources at the site."
However, we have in fact done this in the past under similar
circumstances and the response from ATLAS was not positive, to say
the least. Besides which, we don't want to do this - we're trying
to run an ATLAS service here, after all. In practice this proposal
is simply unrealistic - banning all analysis isn't going to happen
for anything but the most dramatic security incidents, and ATLAS
know this.
ATLAS' inability to make their software run with gLExec is their
problem, and they need to fix it, or provide a fully equivalent
mechanism. Not bothering because it's too much like hard work is
not an acceptable option.
Ewan