On Mon, 3 Sep 2007, Kyriakos Ginis wrote:
> Hello,
>
> When trying to submit to our rb I get:
>
> Selected Virtual Organisation name (from proxy certificate extension):
> dteam
> Connecting to host rb.isabella.grnet.gr, port 7772
> Logging to host rb.isabella.grnet.gr, port 9002
> **** Error: API_NATIVE_ERROR ****
> Error while calling the "edg_wll_RegisterJobSync" native api
> Unable to Register the Job:
> https://rb.isabella.grnet.gr:9000/7R4nqOEcMRt-dPIHbPAI9Q
> to the LB logger at: rb.isabella.grnet.gr
> Resource temporarily unavailable (Resource temporarily unavailable -
> edg_wll_log_proto_client: Error get answer, timeout expired;)
>
>
> This error usually has to do with MySQL tables reaching 4 GB, but we
> removed that limit by enlarging the tables some time ago, and
> long_fields and short_fields are currently about 5-6 GB each.
>
> The problem appeared after a period of very high RB load (~2000 jobs
> reported by condor_q). Condor_q now reports about 130 jobs.
>
> I have restarted the edg-wl-* services many times to no effect.
>
> In /var/log/messages I see:
>
> Sep 3 04:47:52 rb edg-wl-interlogd[7789]: error reading server
> rb.isabella.grnet.gr reply: get_reply (header)
> Sep 3 04:47:52 rb edg-wl-interlogd[7789]: queue_thread: get_reply
> (header)
> Sep 3 01:50:53 rb edg-wl-bkserverd[7797]: File exists (duplicate event)
> Sep 3 05:04:50 rb edg-wl-interlogd[7789]: error reading server
> rb.isabella.grnet.gr reply: get_reply (header)
> Sep 3 05:04:50 rb edg-wl-interlogd[7789]: queue_thread: get_reply
> (header)
> Sep 3 02:07:51 rb edg-wl-bkserverd[7882]: File exists (duplicate event)
> Sep 3 05:15:45 rb edg-wl-bkserverd[11785]: Connection timed out
> ((null))
Did you look into cleaning up /var/tmp as suggested here:
http://goc.grid.sinica.edu.tw/gocwiki/Resource_temporarily_unavailable_-_from_locallogger
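
Before deleting anything, it may help to see what is actually piling up in /var/tmp. Below is a minimal read-only sketch in Python. It makes no assumption about which file names the locallogger uses: it simply groups the regular files in a directory by the part of the name before the first dot (a grouping chosen only for illustration) and reports the file count and total size per group. The helper name summarize_dir is hypothetical and is not part of any edg-wl tool.

```python
import os
from collections import defaultdict


def summarize_dir(path="/var/tmp"):
    """Return {prefix: (file_count, total_bytes)} for regular files directly in `path`.

    The prefix is the file name up to the first dot. Subdirectories and
    symlinks are skipped, and nothing is modified or deleted.
    """
    totals = defaultdict(lambda: [0, 0])
    with os.scandir(path) as entries:
        for entry in entries:
            # Only count plain files; do not follow symlinks.
            if not entry.is_file(follow_symlinks=False):
                continue
            prefix = entry.name.split(".", 1)[0]
            size = entry.stat(follow_symlinks=False).st_size
            totals[prefix][0] += 1
            totals[prefix][1] += size
    return {prefix: (count, size) for prefix, (count, size) in totals.items()}


def largest_groups(summary, limit=10):
    """Return up to `limit` (prefix, count, bytes) tuples, biggest total size first."""
    rows = [(prefix, count, size) for prefix, (count, size) in summary.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

Calling largest_groups(summarize_dir("/var/tmp")) on the RB would list the file groups that take the most space, which should show whether there is a backlog worth cleaning up as the page above describes.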