Here are a few at Lancaster, from our torque cluster. (Bad jobs running on
our LSF cluster were killed and did not seem to be in panda.)
2655059.fal-pygrid-44.lancs.ac.uk
2654764.fal-pygrid-44.lancs.ac.uk
2654623.fal-pygrid-44.lancs.ac.uk
2654425.fal-pygrid-44.lancs.ac.uk
All point to the same user, whose initials may or may not be T.Y.
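
For anyone who wants to check these against the monitor, here is a minimal Python sketch that turns batch IDs into panda monitor query URLs. It assumes the URL pattern shown in Alastair's message quoted below; the helper name `pandamon_url` is made up for illustration and is not part of any panda tool.

```python
from urllib.parse import quote

# URL pattern copied from the example links quoted in this thread.
PANDAMON_QUERY = "http://panda.cern.ch/server/pandamon/query"


def pandamon_url(batch_id: str) -> str:
    """Build the monitor query URL for one batch system ID (hypothetical helper)."""
    # quote() leaves letters, digits, '.', '-', '_' and '~' untouched,
    # so a plain batch ID passes through unchanged.
    return f"{PANDAMON_QUERY}?batchID={quote(batch_id, safe='')}&job=*"


if __name__ == "__main__":
    batch_ids = [
        "2655059.fal-pygrid-44.lancs.ac.uk",
        "2654764.fal-pygrid-44.lancs.ac.uk",
        "2654623.fal-pygrid-44.lancs.ac.uk",
        "2654425.fal-pygrid-44.lancs.ac.uk",
    ]
    for batch_id in batch_ids:
        print(pandamon_url(batch_id))
```
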
Cheers,
Matt
On 11/29/2012 11:40 AM, Stephen Jones wrote:
> Here's a few more for you, at Liverpool.
>
> 348880.mace.ph.liv.ac.uk
> 349088.mace.ph.liv.ac.uk
> 349445.mace.ph.liv.ac.uk
> 349102.mace.ph.liv.ac.uk
> 349498.mace.ph.liv.ac.uk
>
> Steve
>
>
> On 11/29/2012 11:35 AM, Peter Gronbech wrote:
>>
>> All our jobs are the same user:
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=499759.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=499016.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=498354.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=498817.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=495630.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=496070.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=499270.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=499837.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> --
>>
>> ----------------------------------------------------------------------
>>
>> Peter Gronbech
>> GridPP Project Manager
>> Department of Particle Physics,
>> University of Oxford,
>> Keble Road, Oxford OX1 3RH, UK
>> Tel No. : 01865 273389
>> Fax No. : 01865 273418
>> E-mail : [log in to unmask]
>>
>> ----------------------------------------------------------------------
>>
>> *From:*Testbed Support for GridPP member institutes
>> [mailto:[log in to unmask]] *On Behalf Of *Alastair Dewhurst
>> *Sent:* 29 November 2012 11:10
>> *To:* [log in to unmask]
>> *Subject:* Multiprocess Pilot Jobs
>>
>> Hi All
>>
>> A few sites have contacted ATLAS cloud support in the last few hours
>> to complain that ATLAS are running multi-process pilot jobs. From the
>> few examples we have been given, it looks like it might be just one
>> naughty user. However, if you have noticed multi-process ATLAS jobs
>> running, can you let us know? If you can provide the batch system ID,
>> for example:
>>
>> 499570.t2torque03.physics.ox.ac.uk
>>
>> we can put it into the panda monitor and identify the type of job:
>>
>> http://panda.cern.ch/server/pandamon/query?batchID=499570.t2torque03.physics.ox.ac.uk&job=*
>>
>>
>> Thanks
>>
>> Alastair
>>
>> Begin forwarded message:
>>
>>
>>
>> *From: *<[log in to unmask]>
>>
>> *Date: *29 November 2012 10:12:22 GMT
>>
>> *To: *<[log in to unmask]>
>>
>> *Subject: Multiprocess Pilot Jobs*
>>
>> Hi,
>>
>> I've just found a few pilot jobs on our cluster running multiple
>> processes with CPU efficiencies in the 300-1300% range.
>>
>> For example this one[1] which has used more than 25hrs CPU time in
>> less than 2hrs wall time.
>>
>> What info does ATLAS need to find the user and persuade them not to
>> do this (before I decide to ban ATLAS pilot jobs, since you're not
>> using glexec and so I cannot ban the user)?
>>
>> Yours,
>> Chris.
>>
>> [1]
>> 2499 ? 00:00:00 18633876.heplnx
>> 2504 ? 00:00:00 CREAM476771383_
>> 2585 ? 00:00:00 perl
>> 2586 ? 00:00:00 sh
>> 2589 ? 00:00:00 runpilot3-wrapp
>> 2628 ? 00:00:03 python
>> 25834 ? 00:00:01 python
>> 29031 ? 00:00:00 sh
>> 30517 ? 00:00:00 python
>> 30692 ? 00:00:00 sh
>> 30757 ? 00:00:00 python
>> 30758 ? 00:00:04 python
>> 30839 ? 00:00:03 athena.py
>> 1370 ? 00:00:01 python
>> 18954 ? 00:00:00 ajob19
>> 7271 ? 00:06:39 madevent
>> 19047 ? 00:00:00 ajob12
>> 1619 ? 00:11:44 madevent
>> 19773 ? 00:00:00 ajob2
>> 19778 ? 00:17:07 madevent
>> 23134 ? 00:00:00 ajob13
>> 23144 ? 00:15:06 madevent
>> 25535 ? 00:00:00 ajob15
>> 25540 ? 00:16:03 madevent
>> 25901 ? 00:00:00 ajob17
>> 25906 ? 00:14:49 madevent
>> 28014 ? 00:00:00 ajob19
>> 28022 ? 00:14:06 madevent
>> 28046 ? 00:00:00 ajob14
>> 28051 ? 00:13:29 madevent
>> 28662 ? 00:00:00 ajob6
>> 28675 ? 00:12:31 madevent
>> 31446 ? 00:00:00 ajob12
>> 31472 ? 00:12:54 madevent
>> 32416 ? 00:00:00 ajob18
>> 1654 ? 00:11:00 madevent
>> 1634 ? 00:00:00 ajob7
>> 16793 ? 00:01:43 madevent
>> 15731 ? 00:00:00 ajob3
>> 19225 ? 00:00:08 madevent
>> 15887 ? 00:00:00 ajob2
>> 15892 ? 00:02:25 madevent
>> 15904 ? 00:00:00 ajob3
>> 15909 ? 00:02:24 madevent
>> 16130 ? 00:00:00 ajob2
>> 19254 ? 00:00:03 madevent
>> 17165 ? 00:00:00 ajob2
>> 17170 ? 00:01:22 madevent
>> 18578 ? 00:00:00 ajob2
>> 18586 ? 00:00:47 madevent
>> 18600 ? 00:00:00 ajob3
>> 18605 ? 00:01:12 madevent
>> 18893 ? 00:00:00 ajob2
>> 18898 ? 00:00:32 madevent
>> 18917 ? 00:00:00 ajob3
>> 18922 ? 00:00:42 madevent
>> 19165 ? 00:00:00 ajob2
>> 19170 ? 00:00:22 madevent
>> 19179 ? 00:00:00 ajob3
>> 19184 ? 00:00:22 madevent
>> 19233 ? 00:00:00 ajob1
>> 19246 ? 00:00:05 madevent
>> 2587 ? 00:00:00 perl
>>
>> --
>> Chris Brew
>> PPD Scientific Computing Manager
>> STFC RAL
>> Harwell, Oxford
>> Didcot, Oxfordshire, UK
>> OX11 0QX
>> Tel: 01235 446326
>>
>
>
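
The CPU efficiency figures in Chris's message are CPU time divided by wall time, so 25hrs of CPU in 2hrs of wall time is 1250%. Below is a minimal Python sketch of that arithmetic, plus a helper that sums the TIME column of a `ps`-style listing like the one in [1]. It assumes four whitespace-separated columns (PID, TTY, TIME, CMD) and a plain HH:MM:SS time; the function names are made up for illustration, and the sample lines are copied from the listing above.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to seconds (day prefixes are not handled)."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def total_cpu_seconds(ps_lines, command=None):
    """Sum the TIME column, optionally only for lines whose CMD matches."""
    total = 0
    for line in ps_lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip anything that is not "PID TTY TIME CMD"
        _pid, _tty, cpu_time, cmd = fields
        if command is None or cmd == command:
            total += hms_to_seconds(cpu_time)
    return total


def cpu_efficiency(cpu_seconds: float, wall_seconds: float) -> float:
    """CPU efficiency in percent: CPU time used per unit of wall time."""
    return 100.0 * cpu_seconds / wall_seconds


if __name__ == "__main__":
    sample = [
        "18954 ? 00:00:00 ajob19",
        "7271 ? 00:06:39 madevent",
        "19047 ? 00:00:00 ajob12",
        "1619 ? 00:11:44 madevent",
    ]
    print(total_cpu_seconds(sample, command="madevent"))
    print(cpu_efficiency(25 * 3600, 2 * 3600))
```

A single-core job slot should stay at or below roughly 100%, so values in the hundreds point to several processes running at once, as with the many `madevent` entries in the listing.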