Hi *,
We have run into a strange problem with our ResourceBroker.
Last night all submitted jobs got stuck in the state
Current Status: Ready
Status Reason: unavailable
We saw that the JobController was only partly running; it seemed
that CondorG was down. Restarting the service via
/etc/init.d/edg-wl-jc restart
led to error messages in the log file:
/var/edgwl/jobcontrol/log/events.log
------------------------------------
...
12 Nov, 11:58:03 -F- ControllerLoop::run(): Got an unhandled standard exception !!!
12 Nov, 11:58:03 -F- ControllerLoop::run(): Namely: "Syntax error on file "/var/edgwl/jobcontrol/queue.fl" (_file_sequence_t::erasePointer(...)[44])"
...
The only cure we found for this problem was to move away
/var/edgwl/jobcontrol/queue.fl
Now the JobController comes up without error messages.
Nevertheless, we are not sure whether everything is okay again.
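For reference, moving the queue file aside can be sketched roughly as below. This is only an illustration: the function name and the timestamped ".broken" suffix are assumptions, not the exact commands typed on the broker. Keeping the file under a new name instead of deleting it means the broken content can still be inspected later.

```shell
# Sketch only: move a queue file aside under a timestamped name
# instead of deleting it. The function name and the backup suffix
# are illustrative assumptions, not part of the EDG tools.
move_queue_aside() {
    queue_file="$1"
    # Refuse to do anything if the file is not there.
    [ -f "$queue_file" ] || { echo "no such file: $queue_file" >&2; return 1; }
    backup="${queue_file}.broken.$(date +%Y%m%d%H%M%S)"
    # Print the backup path on success so it can be noted down.
    mv -- "$queue_file" "$backup" && echo "$backup"
}

# Usage on the broker (as root):
#   move_queue_aside /var/edgwl/jobcontrol/queue.fl
#   /etc/init.d/edg-wl-jc restart
```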
Any ideas?
Thanks
Andreas
++++++++++++++++++
Andreas Gellrich
DESY IT
++++++++++++++++++