Some more information: the service causing the problem is
lcg-mon-job-status. If I restart it, it works again, but only for a while.
Here is the log for the service:
[root@rb01 root]# tail /opt/lcg/var/lcg-mon-job-status.log
2005-08-05 10:30:00,835: [ERROR] Error creating primary producer
2005-08-05 10:30:05,832: [ERROR] Producer died. Trying to start a new one...
2005-08-05 10:30:05,834: [ERROR] Failed to insert tuples! Retrying in 5 seconds...
2005-08-05 10:30:05,835: [ERROR] Error creating primary producer
2005-08-05 10:30:10,832: [ERROR] Producer died. Trying to start a new one...
2005-08-05 10:30:10,834: [ERROR] Failed to insert tuples! Retrying in 5 seconds...
2005-08-05 10:30:10,835: [ERROR] Error creating primary producer
2005-08-05 10:30:15,832: [ERROR] Producer died. Trying to start a new one...
2005-08-05 10:30:15,834: [ERROR] Failed to insert tuples! Retrying in 5 seconds...
2005-08-05 10:30:15,835: [ERROR] Error creating primary producer
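
To see how long the service survives after a restart, and to confirm that
the failures repeat on the 5-second retry cycle, the log could be summarised
with a short script. This is only a sketch: the function names are made up
for illustration, and it assumes every entry uses the
"timestamp: [LEVEL] message" layout seen in the excerpt above.

```python
import re
from datetime import datetime

# Assumed layout: "2005-08-05 10:30:00,835: [ERROR] message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}): \[(\w+)\] (.*)$"
)


def parse_errors(lines):
    """Return a list of (timestamp, message) for every [ERROR] entry.

    Lines that do not start with a timestamp are treated as the wrapped
    tail of the previous entry.
    """
    entries = []
    last_was_error = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            # Continuation line: glue it onto the previous ERROR entry.
            if last_was_error and entries:
                ts, msg = entries[-1]
                entries[-1] = (ts, msg + " " + line)
            continue
        stamp, millis, level, message = match.groups()
        last_was_error = level == "ERROR"
        if not last_was_error:
            continue
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        ts = ts.replace(microsecond=int(millis) * 1000)
        entries.append((ts, message))
    return entries


def retry_intervals(entries, needle):
    """Seconds between consecutive ERROR entries whose message contains needle."""
    times = [ts for ts, msg in entries if needle in msg]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# A few lines copied from the log excerpt, including wrapped entries.
sample = """\
2005-08-05 10:30:00,835: [ERROR] Error creating primary producer
2005-08-05 10:30:05,832: [ERROR] Producer died. Trying to start a new
one...
2005-08-05 10:30:05,834: [ERROR] Failed to insert tuples! Retrying in 5
seconds...
2005-08-05 10:30:05,835: [ERROR] Error creating primary producer
""".splitlines()

entries = parse_errors(sample)
first_error_time = entries[0][0]
intervals = retry_intervals(entries, "Error creating primary producer")
print("first error at:", first_error_time)
print("seconds between producer-creation failures:", intervals)
```

Run against the full /opt/lcg/var/lcg-mon-job-status.log instead of the
sample, the first error timestamp can be compared with the time of the last
restart to estimate how long the service stays healthy.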
Any ideas?
Thanks
Carlos
==========================================================================
Carlos Borrego Iglesias PIC (Port d'Informació Científica)
tel: +34 93 581 3308 Campus UAB - Edifici D
e-mail: [log in to unmask] E-08193 Bellaterra
==========================================================================
On Fri, 5 Aug 2005, Carlos Borrego Iglesias wrote:
> Hi all,
> After updating our RB to 2.6, all services seemed to work fine, but after some
> time jobs can't be registered. If I submit a job, I get this error:
>
> [[log in to unmask]]#edg-job-submit --resource
> ifaece01.pic.es:2119/jobmanager-lcgpbs-dteam --vo dteam testJob.jdl
>
> Selected Virtual Organisation name (from --vo option): dteam
> Connecting to host rb01.pic.es, port 7772
> Logging to host rb01.pic.es, port 9002
> **** Error: API_NATIVE_ERROR ****
> Error while calling the "edg_wll_RegisterJobSync" native api
> Unable to Register the Job:
> https://rb01.pic.es:9000/kRqop9BwDW5a0v-pzdiTXQ
> to the LB logger at: rb01.pic.es:9002
> Resource temporarily unavailable (Resource temporarily unavailable -
> edg_wll_log_proto_client: Error get answer, timeout expired;)
>
> If I reconfigure the RB things seem to work again, but after some time they
> fail.
>
> Has anyone seen this before?
> Thanks!
> Carlos