Yannick Patois wrote:
> This is not the first time this has happened to my LCG-RB.
> At some point it just stops working properly: submitted jobs now stay
> indefinitely in Waiting status, and nothing happens.
>
> What should I do to better understand the problem and hopefully solve it?
Apparently the workload-manager (WM) is not processing its input.fl,
or the jobs do not end up there for some reason.
Please send the output of these commands:
--------------------------------------------------
sh chk-wl
lsof | gzip > /tmp/lsof.gz
grep -c ' g$' /var/edgwl/workload_manager/input.fl
--------------------------------------------------
The "chk-wl" script is attached.
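#!/bin/sh
# Show the file locks currently held on the system.
cat /proc/locks
# List one line per distinct command run by edguser and flag the
# edg-wl-* daemons with a trailing " <".
ps -u edguser | sort -k 4 | uniq -f 3 | sed 's/ edg-wl-....[cmal].../& </'
# Print the last three events logged by each service.
echo === WM ===
tail -3 /var/edgwl/workload_manager/log/events.log
echo === NS ===
tail -3 /var/edgwl/networkserver/log/events.log
echo === JC ===
tail -3 /var/edgwl/jobcontrol/log/events.log
echo === LM ===
tail -3 /var/edgwl/logmonitor/log/events.log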
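
If you want to repeat the input.fl check a few times, the grep command above can be wrapped in a small helper. This is only a sketch: the function name count_g_lines is made up for illustration, and it assumes the same file path and the same ' g$' pattern as the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the EDG tools): print the number of
# lines ending in " g" in the given file, or 0 if the file is unreadable.
# The default path is the one used in the grep command above.
count_g_lines() {
    file="${1:-/var/edgwl/workload_manager/input.fl}"
    if [ -r "$file" ]; then
        # grep -c prints 0 but exits non-zero when nothing matches,
        # so the exit status is deliberately ignored here.
        grep -c ' g$' "$file" || true
    else
        echo 0
    fi
}
```

Calling count_g_lines with no argument a minute or so apart shows whether the number changes. If it stays the same while new jobs are being submitted, that would be consistent with the WM not processing its input.fl, but it does not prove it.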