Hello,
When trying to submit a job to our RB, I get:
Selected Virtual Organisation name (from proxy certificate extension):
dteam
Connecting to host rb.isabella.grnet.gr, port 7772
Logging to host rb.isabella.grnet.gr, port 9002
**** Error: API_NATIVE_ERROR ****
Error while calling the "edg_wll_RegisterJobSync" native api
Unable to Register the Job:
https://rb.isabella.grnet.gr:9000/7R4nqOEcMRt-dPIHbPAI9Q
to the LB logger at: rb.isabella.grnet.gr
Resource temporarily unavailable (Resource temporarily unavailable -
edg_wll_log_proto_client: Error get answer, timeout expired;)
This error usually has to do with MySQL tables reaching 4 GB, but we
removed that limitation some time ago by enlarging the tables, and
long_fields and short_fields are currently about 5-6 GB each.
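As a rough illustration of where the 4 GB figure comes from: assuming the LB tables are MyISAM with dynamic-length rows (an assumption, the storage engine is not stated here), the data file can only grow as far as the row pointer can address. The sketch below just does that arithmetic; the function names are made up for illustration and nothing in it queries the actual database.

```python
# Sketch only: assumes MyISAM dynamic-row tables, where the largest data
# file is bounded by the width of the row pointer (in bytes).
GIB = 1024 ** 3


def max_data_length(pointer_bytes: int) -> int:
    """Largest offset addressable with a pointer of the given width."""
    return 2 ** (8 * pointer_bytes) - 1


def fits(table_bytes: int, pointer_bytes: int) -> bool:
    """True if a data file of table_bytes is addressable with this pointer."""
    return table_bytes <= max_data_length(pointer_bytes)


if __name__ == "__main__":
    table_size = 6 * GIB  # roughly the size quoted for long_fields/short_fields
    for width in (4, 5, 6):
        limit = max_data_length(width)
        print(f"{width}-byte pointer: limit {limit} bytes "
              f"({limit / GIB:.0f} GiB), 6 GiB table fits: {fits(table_size, width)}")
```

With a 4-byte pointer the limit is 4294967295 bytes, so tables of 5-6 GB only fit if the enlargement really raised the pointer width. Under the same assumption, the Max_data_length column of SHOW TABLE STATUS should show a value well above the current Data_length for both tables.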
The problem appeared after a period of very high RB load (~2000 jobs
reported by condor_q). condor_q now reports about 130 jobs.
I have restarted the edg-wl-* services many times, to no effect.
In /var/log/messages I see:
Sep 3 04:47:52 rb edg-wl-interlogd[7789]: error reading server rb.isabella.grnet.gr reply: get_reply (header)
Sep 3 04:47:52 rb edg-wl-interlogd[7789]: queue_thread: get_reply (header)
Sep 3 01:50:53 rb edg-wl-bkserverd[7797]: File exists (duplicate event)
Sep 3 05:04:50 rb edg-wl-interlogd[7789]: error reading server rb.isabella.grnet.gr reply: get_reply (header)
Sep 3 05:04:50 rb edg-wl-interlogd[7789]: queue_thread: get_reply (header)
Sep 3 02:07:51 rb edg-wl-bkserverd[7882]: File exists (duplicate event)
Sep 3 05:15:45 rb edg-wl-bkserverd[11785]: Connection timed out ((null))
Any ideas?
Thanks.
--
Kyriakos Ginis