Hi Alastair,
I checked a few of our worker nodes; it is the same user (Takashi Yamanaka)
who has a lot of madevent processes.
618858.master02.cm.cluster
618602.master02.cm.cluster
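
In case it saves anyone some typing, here is a minimal sketch that turns batch IDs like the two above into panda monitor queries. It assumes the URL pattern from Alastair's mail below (batchID=<id>&job=*) applies unchanged to IDs from any site; the function name pandamon_url is just my own illustration, not part of any ATLAS tool.

```python
from urllib.parse import quote

# Base URL taken from the example in Alastair's mail; assumed to be the
# same for every site's batch IDs.
PANDAMON_QUERY = "http://panda.cern.ch/server/pandamon/query"


def pandamon_url(batch_id):
    """Build a panda monitor query URL for one batch system ID (hypothetical helper)."""
    # Percent-encode anything unusual; dots, digits and letters pass through as-is.
    encoded = quote(batch_id.strip(), safe="")
    return f"{PANDAMON_QUERY}?batchID={encoded}&job=*"


if __name__ == "__main__":
    for batch_id in ["618858.master02.cm.cluster", "618602.master02.cm.cluster"]:
        print(pandamon_url(batch_id))
```

Pasting the printed URLs into a browser should then show the job type, provided the monitor recognises the batch ID.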
Cheers
Govind
On Thu, Nov 29, 2012 at 11:09 AM, Alastair Dewhurst
<[log in to unmask]> wrote:
> Hi All
>
> A few sites have got in contact with ATLAS cloud support in the last few
> hours complaining that ATLAS are running multi-process pilot jobs. From
> looking at the few examples we have been given, it looks like it might be
> just one naughty user; however, if you have noticed multi-process ATLAS jobs
> running, can you let us know? If you can provide the batch system ID, for
> example:
> 499570.t2torque03.physics.ox.ac.uk
> we can put it into the panda monitor and identify the type of job:
> http://panda.cern.ch/server/pandamon/query?batchID=499570.t2torque03.physics.ox.ac.uk&job=*
>
> Thanks
>
> Alastair
>
>
>
> Begin forwarded message:
>
> From: <[log in to unmask]>
> Date: 29 November 2012 10:12:22 GMT
> To: <[log in to unmask]>
> Subject: Multiprocess Pilot Jobs
>
> Hi,
>
> I've just found a few pilot jobs on our cluster running multiple processes
> with CPU efficiencies in the 300-1300% range.
>
> For example this one[1] which has used more than 25hrs CPU time in less than
> 2hrs wall time.
>
> What info does ATLAS need to find the user and persuade them not to do this
> (before I decide to ban ATLAS pilot jobs, since you're not using glexec and
> so I cannot ban the user)?
>
> Yours,
> Chris.
>
> [1]
> 2499 ? 00:00:00 18633876.heplnx
> 2504 ? 00:00:00 CREAM476771383_
> 2585 ? 00:00:00 perl
> 2586 ? 00:00:00 sh
> 2589 ? 00:00:00 runpilot3-wrapp
> 2628 ? 00:00:03 python
> 25834 ? 00:00:01 python
> 29031 ? 00:00:00 sh
> 30517 ? 00:00:00 python
> 30692 ? 00:00:00 sh
> 30757 ? 00:00:00 python
> 30758 ? 00:00:04 python
> 30839 ? 00:00:03 athena.py
> 1370 ? 00:00:01 python
> 18954 ? 00:00:00 ajob19
> 7271 ? 00:06:39 madevent
> 19047 ? 00:00:00 ajob12
> 1619 ? 00:11:44 madevent
> 19773 ? 00:00:00 ajob2
> 19778 ? 00:17:07 madevent
> 23134 ? 00:00:00 ajob13
> 23144 ? 00:15:06 madevent
> 25535 ? 00:00:00 ajob15
> 25540 ? 00:16:03 madevent
> 25901 ? 00:00:00 ajob17
> 25906 ? 00:14:49 madevent
> 28014 ? 00:00:00 ajob19
> 28022 ? 00:14:06 madevent
> 28046 ? 00:00:00 ajob14
> 28051 ? 00:13:29 madevent
> 28662 ? 00:00:00 ajob6
> 28675 ? 00:12:31 madevent
> 31446 ? 00:00:00 ajob12
> 31472 ? 00:12:54 madevent
> 32416 ? 00:00:00 ajob18
> 1654 ? 00:11:00 madevent
> 1634 ? 00:00:00 ajob7
> 16793 ? 00:01:43 madevent
> 15731 ? 00:00:00 ajob3
> 19225 ? 00:00:08 madevent
> 15887 ? 00:00:00 ajob2
> 15892 ? 00:02:25 madevent
> 15904 ? 00:00:00 ajob3
> 15909 ? 00:02:24 madevent
> 16130 ? 00:00:00 ajob2
> 19254 ? 00:00:03 madevent
> 17165 ? 00:00:00 ajob2
> 17170 ? 00:01:22 madevent
> 18578 ? 00:00:00 ajob2
> 18586 ? 00:00:47 madevent
> 18600 ? 00:00:00 ajob3
> 18605 ? 00:01:12 madevent
> 18893 ? 00:00:00 ajob2
> 18898 ? 00:00:32 madevent
> 18917 ? 00:00:00 ajob3
> 18922 ? 00:00:42 madevent
> 19165 ? 00:00:00 ajob2
> 19170 ? 00:00:22 madevent
> 19179 ? 00:00:00 ajob3
> 19184 ? 00:00:22 madevent
> 19233 ? 00:00:00 ajob1
> 19246 ? 00:00:05 madevent
> 2587 ? 00:00:00 perl
>
> --
> Chris Brew
> PPD Scientific Computing Manager
> STFC RAL
> Harwell, Oxford
> Didcot, Oxfordshire, UK
> OX11 0QX
> Tel: 01235 446326
>
>
>
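
For anyone wanting to check their own nodes: the CPU efficiency Chris quotes is simply CPU time divided by wall time, so more than 25hrs of CPU in under 2hrs of wall time comes out at roughly 1250%, inside the 300-1300% range he mentions. Below is a minimal sketch of that arithmetic, assuming ps output in the same "PID TTY TIME CMD" layout as the listing in [1]. The helper names are my own, and the wall time has to come from the batch system since ps does not report it here.

```python
def parse_ps_time(field):
    """Convert a ps TIME field ([DD-]HH:MM:SS) into seconds."""
    days = 0
    if "-" in field:
        day_part, field = field.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(x) for x in field.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def cpu_seconds(ps_lines, command=None):
    """Sum the TIME column, optionally only for processes named `command`."""
    total = 0
    for line in ps_lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        time_field, cmd = parts[2], parts[3]
        if command is None or cmd == command:
            total += parse_ps_time(time_field)
    return total


def cpu_efficiency(cpu_s, wall_s):
    """CPU efficiency in percent; values above 100 mean more than one busy core."""
    return 100.0 * cpu_s / wall_s


if __name__ == "__main__":
    # Three lines copied from the listing in [1].
    sample = [
        "18954 ? 00:00:00 ajob19",
        "7271 ? 00:06:39 madevent",
        "1619 ? 00:11:44 madevent",
    ]
    print(cpu_seconds(sample, "madevent"), "CPU seconds in madevent")
    # The figures from Chris's mail: 25hrs CPU in 2hrs wall time.
    print(cpu_efficiency(25 * 3600, 2 * 3600), "% efficiency")
```

Note that a single ps snapshot only shows the CPU accumulated by processes that are still alive, so it will undercount a job whose earlier children have already exited.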