Hi,
rb02.lip.pt is experiencing some problems. A user is flooding the
machine with continuous job submissions, and the average load of the
machine is around 20.
The edg-wl-workload daemon is consuming a lot of CPU and making other
users' (and our) lives difficult.
[root@rb02 root]# /opt/condor/bin/condor_q
(...)
1406 jobs; 292 idle, 997 running, 117 held
[root@rb02 root]# /opt/condor/bin/condor_q -long | grep -i UserSubjectName | grep Carrillo | wc -l
1145
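To see how the queue splits across all users rather than a single one,
the same pipeline can be generalised. This is only a minimal sketch: it
assumes condor_q -long prints one UserSubjectName line per job (as the
grep above suggests), and count_jobs_per_subject is a hypothetical
helper name, not part of Condor or the middleware.

```shell
# Hypothetical helper: reads "condor_q -long" output on stdin and prints
# one line per distinct UserSubjectName value, prefixed by its job count,
# with the busiest submitter first.
count_jobs_per_subject() {
    grep -i UserSubjectName | sort | uniq -c | sort -rn
}
```

It would be used as
/opt/condor/bin/condor_q -long | count_jobs_per_subject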
The user is not doing anything wrong... He is just submitting jobs from
his UI, where our RB is configured... So I cannot simply ask him not to
do it... The middleware should handle such situations by distributing
the load across several RBs instead of just one... Nevertheless,
being practical, is there something I can do from the RB side,
some optimization or something else?
Thanks in advance
Cheers
Goncalo