On Wed, Nov 07, 2007 at 02:19:46PM -0000, Jensen, J (Jens) wrote:
> Since this user ought to have stopped this activity over 7 days ago, I've
> traced all interactions of him with our SE to their originating hosts,
> since 20071105030430.985562Z, so just my last two log files. There
> are still, till 2150 UTC on November 6th 543 TYPE=STOR operations,
> in about 42 hours, involving 75 hosts spread all over the grid. Indeed
> most of the UK ones have stopped, except for:
>
> dgc-grid-40.brunel.ac.uk     2
> dgc-grid-44.brunel.ac.uk     4
> fal-pygrid-19.lancs.ac.uk   16
> lcg.shef.ac.uk               8
> wd44.hep.ph.ic.ac.uk        20

For wd44, the node was out of the batch system (before the incident), but
some jobs were left running. Since the batch system was not there to
enforce wallclock/CPU time limits, and since the biomed job was a pilot job,
none of the individual processes hit the CPU time limit :(

Lesson learned: if the batch system is not there to kill jobs, don't expect
them to ever end....

Kostas