Hi *,
We got a strange problem with our ResourceBroker.

Last night, all submitted jobs got stuck in the state

  Current Status:     Ready
  Status Reason:      unavailable

We saw that the JobController was only partly running; CondorG
seemed to be down. Restarting the service via

  /etc/init.d/edg-wl-jc restart

led to error messages in the log file:

/var/edgwl/jobcontrol/log/events.log
------------------------------------
...
12 Nov, 11:58:03 -F- ControllerLoop::run(): Got an unhandled standard exception !!!
12 Nov, 11:58:03 -F- ControllerLoop::run(): Namely: "Syntax error on file "/var/edgwl/jobcontrol/queue.fl" (_file_sequence_t::erasePointer(...)[44])"
...

The only cure for this problem was to move the following file out of the way:

  /var/edgwl/jobcontrol/queue.fl

Now the JobController comes up without error messages.

Nevertheless, we are not sure whether everything is okay again.


Any ideas?

Thanx
Andreas

++++++++++++++++++
 Andreas Gellrich
 DESY IT
++++++++++++++++++