Hi all,
Some of our worker nodes (WNs) appear to be running agents for the Condor
system. These are doing work on our WNs outside the Torque/Maui
batch/scheduling system and without its knowledge:
atlassgm 16125 1 0 Oct20 ? 00:01:08 condor_master -f
atlassgm 2652 16125 0 Oct22 ? 00:04:25 condor_startd -f
atlassgm 20839 2652 0 Oct25 ? 00:00:48 condor_starter -f higgs05.cs.wisc.edu
atlassgm 20845 20839 0 Oct25 ? 00:00:00 /bin/sh --login /pool/4006441.csflnx353.rl.ac.uk/execute.130.246.180.112-16125/dir_20839/condor_exec.ex
atlassgm 21442 20845 92 Oct25 ? 22:09:41 ./2Qgen
In the above case, job ID 4006441 has already come and gone in the batch
system, yet its processes are still running. The big problem appears to be
that this is upsetting Maui, which is refusing to schedule any legitimate
work, thus draining the whole farm.
Has anyone else seen this?
It is causing us a lot of hassle: we are terminating all such processes in
order to get our capacity back online.
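For anyone wanting to check their own nodes, here is a minimal sketch that parses `ps -ef`-style output and lists every `condor_master` together with all of its descendant processes. It is an illustration only, not a tested site tool: the names `parse_ps`, `condor_trees` and `SAMPLE` are hypothetical, the sample text is based on the listing above, and the standard eight-column `ps -ef` layout is assumed.

```python
# Hypothetical helpers for spotting Condor process trees in `ps -ef` output.

# Sample input, based on the process listing earlier in this message.
SAMPLE = """\
atlassgm 16125 1 0 Oct20 ? 00:01:08 condor_master -f
atlassgm 2652 16125 0 Oct22 ? 00:04:25 condor_startd -f
atlassgm 20839 2652 0 Oct25 ? 00:00:48 condor_starter -f higgs05.cs.wisc.edu
atlassgm 20845 20839 0 Oct25 ? 00:00:00 /bin/sh --login /pool/4006441.csflnx353.rl.ac.uk/execute.130.246.180.112-16125/dir_20839/condor_exec.ex
atlassgm 21442 20845 92 Oct25 ? 22:09:41 ./2Qgen
"""


def parse_ps(text):
    """Parse `ps -ef`-style lines (UID PID PPID C STIME TTY TIME CMD)."""
    procs = {}
    for line in text.splitlines():
        parts = line.split(None, 7)
        # Skip the header line and anything that is not a full process row.
        if len(parts) < 8 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        uid, pid, ppid, _c, _stime, _tty, _time, cmd = parts
        procs[int(pid)] = {"uid": uid, "ppid": int(ppid), "cmd": cmd}
    return procs


def condor_trees(procs):
    """Return the sorted PIDs of each condor_master and all its descendants."""
    flagged = {pid for pid, p in procs.items()
               if p["cmd"].startswith("condor_master")}
    changed = True
    while changed:  # repeat until no new children are found
        changed = False
        for pid, p in procs.items():
            if pid not in flagged and p["ppid"] in flagged:
                flagged.add(pid)
                changed = True
    return sorted(flagged)


if __name__ == "__main__":
    procs = parse_ps(SAMPLE)
    for pid in condor_trees(procs):
        print(pid, procs[pid]["uid"], procs[pid]["cmd"])
```

On a real node the text would come from running `ps -ef` rather than from `SAMPLE`. The sketch only reports PIDs; deciding whether and how to terminate them is left to the admin.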
Martin
Tier1 Systems.
--
---------------------------------------------------------------------------
Martin Bly | Tier 1/A Systems Admin | Rutherford Appleton Laboratory
Tel: +44|0 1235 446981 Fax: +44|0 1235 446626
---------------------------------------------------------------------------