Steve Traylen wrote:
> Hi,
>
> I've not seen this for a while where all jobs submitted to RB
> remain in the "Waiting" state for ever.
>
> It had apparently gone away with recent versions of the resource broker
> code.
>
> I've restarted every service there making sure they really are dead but
> have had no joy.
Jobs stay in "Waiting" while the Workload Manager has not got to them yet,
meaning they are still sitting in /var/edgwl/workload_manager/input.fl.
In the past this could happen when there was a deadlock on the file,
which is also used by the Network Server; check with:
cat /proc/locks
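
Since /proc/locks identifies files only by a MAJOR:MINOR:INODE field, it can
help to filter it down to the entries for input.fl. The following is just a
rough sketch: the function names locks_for_inode and locks_for_file are made
up for illustration, and it matches on the inode number alone, so an
unrelated file with the same inode number on another filesystem would also
show up. Lines marked "->" should be requests that are blocked waiting for a
lock, which is what you would expect to see in the deadlock case.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any EDG/LCG tool) for filtering /proc/locks.

# locks_for_inode INODE [LOCKS_FILE]
# Print every lock entry whose MAJOR:MINOR:INODE field ends in INODE.
locks_for_inode() {
    inode=$1
    locks_file=${2:-/proc/locks}
    # Typical line: "1: POSIX  ADVISORY  WRITE 1234 08:01:123456 0 EOF"
    # Blocked requests carry an extra "->" token, so the position of the
    # device field varies; scan all fields instead of using a fixed column.
    awk -v ino="$inode" '
        {
            for (i = 1; i <= NF; i++) {
                n = split($i, parts, ":")
                if (n == 3 && parts[3] == ino) { print; next }
            }
        }' "$locks_file"
}

# locks_for_file PATH
# Look up the inode of PATH and print the matching lock entries.
locks_for_file() {
    # "ls -i" prints the inode number as the first field.
    inode=$(ls -i "$1" | awk '{print $1}')
    locks_for_inode "$inode"
}
```

After sourcing the above, something like
    locks_for_file /var/edgwl/workload_manager/input.fl
should list who holds or is waiting for a lock on the file; the fifth
column of a holder line is the PID to look up with ps.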
Recently we have seen that the matchmaking can become very slow,
due to the BDII having a slapd cache size that is too small
(fixed in LCG-2_3_1): can you check that setting?
Does /var/edgwl/workload_manager/log/events.log show activity?