It's not just the UK being frustrated - perhaps the ROC managers could discuss
this and assess the "damage".
Some of the expressions of frustration I have heard from security people are also
about the EGEE management who authorised this. Maybe someone - perhaps the
GDB? - needs to send a message on behalf of all (frustrated) ROCs?
One question I have in this connection is whether an estimate of the required
CPU and storage resources was made and whether senior management had seen that? Did
they have any idea how much CPU time an RSA768 factorisation would require?
I have also recommended to the storage group that sites check which SEs have
heinzian data in them, block access, but do not delete data. Not much feedback
yet though.
Apart from agreeing expressions of displeasure, personally I think we should
learn the lessons and move on. Incidents happen, and this one shows we are not
really sufficiently well prepared for a more serious one.
Cheers
--jens
-----Original Message-----
From: Testbed Support for GridPP member institutes on behalf of Coles, J (Jeremy)
Sent: Thu 08/11/2007 01:19
To: [log in to unmask]
Subject: Re: Heinz still submitting
Dear All
There are several areas being followed up in response to the recent
biomed events. There is now a confirmation that the jobs being
resubmitted by rb01.egee-see.org have been killed, so you should not see
any more activity.
I did not receive any definitive evidence from any UKI site that Heinz
was still intentionally submitting. He indicated that he tried to cancel
all jobs using edg-job-cancel --all but apparently this command does not
function correctly! As has been indicated previously each site needs to
take whatever action it sees as appropriate and this may include
blocking access to the data stored on the site SE. However, at this
stage please *do not* delete Heinz's output data.
I would also appreciate it if we could work together in expressing the
frustration/annoyance felt in UKI in this case and not resort to
individual messages to Heinz. While it is clear the work was not
appropriate to biomed, Heinz did seek EGEE approval to carry out the
work and he did not try to hide it (he used clear job naming to be
transparent in what he was doing). Furthermore, biomed as a VO may not
now be aware of discussions to use the VO because the representative who
did know moved on. Given these (and other) additional factors, while
Heinz is at fault for not working within the VO AUP, there are wider
things to consider in our overall response. The jobs have stopped and the
matter has been sufficiently escalated, so I at least think we should
move on. Since the situation was not entirely Heinz's fault I would be
inclined to give him the output of completed jobs so that the CPU time
used was not a complete waste.
Regards,
Jeremy
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Ma, M (Mingchao)
> Sent: 07 November 2007 15:34
> To: [log in to unmask]
> Subject: Re: Heinz still submitting
>
> Hi all,
>
> Just received direct confirmation from Heinz. Please refer to the
> copied email below:
>
> > Hi Mingchao,
> >
> > Initially, I just stopped job submission but instructed the
> > Task Server not to assign any further tasks to jobs. A job
> > picks up a new assignment typically after about 90 mins. If
> > no assignment is found, the job terminates gracefully.
> >
> > I tried to cancel potentially existing jobs with edg-job-cancel
> > --all but this gives timeout connection errors with the RB
> > rb01.egee-see.org. Can somebody please help clear the jobs
> > from the RB in case there are still some around?
> >
> > Thanks,
> > Heinz
> >
>
> So the user cannot cancel his jobs, and the RB therefore still tries to
> submit them to other sites. OSCT-DC and OSCT SEE ROC will follow it up
> and contact the admins of rb01.egee-see.org (which belongs to the
> HG-06-EKT site in Greece). For now, it seems that the user did stop job
> submission once he was asked to do so.
>
> Mingchao
>
>
> > -----Original Message-----
> > From: Testbed Support for GridPP member institutes
> > [mailto:[log in to unmask]] On Behalf Of David Robson
> > Sent: 07 November 2007 14:58
> > To: [log in to unmask]
> > Subject: Heinz still submitting
> >
> > One of Heinz's gnfs-lasieve jobs started 45 minutes ago
> > (14:13) on EFDA-JET.
> > The connection was from rb01.egee-see.org We are now banning
> > this user.
> >
> > --
> > David Robson
> >
> > CODAS, Machine Operations, UKAEA Culham Division Culham
> > Science Centre, Abingdon, OXON, OX14 3DB, UK
> > Voice: +44(0)1235-46-4569, Fax: 4404
> > Work email: [log in to unmask]
> > Home email: [log in to unmask]
> >