Markus Schulz wrote:
> Dear Henry,
> I have a few comments and a few questions.
>
> As you stated correctly each and every sgm can move code to any of her
> VO's vo-boxes and
> run the code as a mapped user.
> I don't see a fundamental difference between this and the ability to
> run long running jobs on a batch system through the grid.
> In both cases the sysadmin has no effective control over what code the
> users run and a 48h job with external network connectivity is
> not so different from what the users can do on a VO box.
I think there's a clear expectation on the part of sites of what users
will do with an experiment's software stack. Running a job that
consisted of "apt-get install mysql" would rightly cause most site
admins to have a fit.
There's a brave attempt being made to say that VO boxes are no different
to long running jobs, but it's just not really true - Steve makes the
point about listening services, which is a fundamental difference. And
the "startup on boot" is seriously different too.
It's fundamentally a significant loss of control over the services
which sites run and, more importantly, control. Yes, in LCG site admins
generally want to be helpful and want their sites to be used by LCG VOs
(that's why we install LCG middleware at all); but we do have to abide
by local security regulations (this is in the model, isn't it?). Allowing
external users to install services that listen and restart on boot
oversteps the mark for many sites - a boundary which is not theirs to control.
> Since the
> farms give by one way external network access the VOs could implement
> with a bit of
> additional complication their service like programs as a series of long
> running jobs that use the local SE for keeping the state.
> From the security point of view I can see no difference between giving
> access to a WN or the VO box. In both cases the users can be traced and
> are mapped to a local user. In both cases the user can bring non
> security reviewed software to the site.
>
> The question that I have are of practical matter:
> You mention that you have to make sure that you are responsible that
> people don't misuse the service (as an example you
> mention the storage of ripped movies).
> How do you ensure this? Are you in control of what the users store on
> your site and what software they run?
But if you find a ripped movie you can identify the compromised
certificate used (you run the gridftp server) to upload it and take
appropriate action. How can VOs offer us the same guarantees of
accountability if their box was used for such a nefarious purpose? The
implications of a VO software manager's certificate being compromised
are quite horrendous. To quote from
https://uimon.cern.ch/twiki/bin/view/Atlas/DDMSc3:
"The ATLAS Distributed Data Management will install the following set of
components in the site VO box.
[...]
* Claims service and Space Management service - The claims service
runs on an apache server [...] It will be contacted from within the VO
box and outside via http(s) requests (currently ports 443 and 80).
[...]
Connectivity
* Login as root via ssh/gsi"
Really, I echo Steven's call earlier in this thread: we need to identify
the missing middleware components which the VOs require and implement
them in LCG.
If we don't do this, and the VO boxes become standard, then middleware
development might well grind to a halt as experiments know they can take
the path of least resistance and just start up another service on their
VO box. That's a pretty impoverished view of a grid in anyone's eyes.
Cheers
Graeme
--
--------------------------------------------------------------------
Dr Graeme Stewart http://www.physics.gla.ac.uk/~graeme/
GridPP DM Wiki http://wiki.gridpp.ac.uk/wiki/Data_Management