Jose del Peso wrote:
> Dear all,
>
> I have tried to submit a job this morning and it failed. The following
> output is obtained:
>
> _________________________________________________________
>
> [delpeso@grid010 atlas-simula]$ edg-job-submit --vo atlas testJob_SW.jdl
>
> Selected Virtual Organisation name (from --vo option): atlas
> Connecting to host lxn1188.cern.ch, port 7772
> Logging to host lxn1188.cern.ch, port 9002
> **** Error: API_NATIVE_ERROR ****
> Error while calling the "edg_wll_RegisterJobSync" native api
> Unable to Register the Job:
> https://lxn1188.cern.ch:9000/XN2EtYRRsFLZ-Tsan94D7w
> to the LB logger at: lxn1188.cern.ch:9002
> SSL Error (sslv3 alert handshake failure)
Indeed, this error is new to us: edg-wl-logd somehow ended up with
an invalid proxy when its proxy file was renewed at 08:26.
Exactly 6 hours later the problem went away when the proxy was renewed again.
Unfortunately we did not save the actual proxy used during those 6 hours,
but we have since changed two things:
- the renewal job now runs every 5 minutes;
- it copies each proxy to an area for later analysis.
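
For illustration only, the archiving step could look roughly like the
Python sketch below. This is not the script that runs on the RB: the
function name, the timestamped naming scheme and the paths are all
assumptions made up for this example. The idea is simply to keep a
timestamped copy of every renewed proxy, plus a checksum, so that a bad
one can be inspected afterwards.

```python
import hashlib
import shutil
import time
from pathlib import Path


def archive_proxy(proxy_path, archive_dir, now=None):
    """Copy a proxy file into archive_dir under a timestamped name.

    Hypothetical helper: returns (path_of_copy, sha256_hex_digest).
    `now` is seconds since the epoch; None means the current time.
    """
    src = Path(proxy_path)
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # UTC timestamp in the file name, so copies sort chronologically.
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime(now))
    dest = dest_dir / f"{src.name}.{stamp}"

    shutil.copy2(src, dest)  # copy2 also preserves the modification time
    dest.chmod(0o600)        # a proxy contains a private key

    # The checksum makes it easy to tell whether two renewals differ.
    digest = hashlib.sha256(dest.read_bytes()).hexdigest()
    return dest, digest
```

Such a helper would be called once per renewal, for example
archive_proxy("/path/to/service/proxy", "/path/to/archive"), where both
paths are placeholders.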
The RB uses a disk server to hold most of the WP1 state information,
which means the proxy happens to sit on NFS; we have had problems
with a few other state files when they were on NFS, but not with
any of the service proxies. We have looked into the code and do not yet
see how it might fail in this respect.