JISCMail - LCG-ROLLOUT Archives

Hi Jean-Michel,

> > This morning our CEs are in UNKNOWN state on the NGI-FR's nagiosbox:
> > https://ccnagboxli01.in2p3.fr/nagios/cgi-bin/status.cgi?host=nanlcg04.in2p3.fr&style=detail
> > 
> > I cannot understand what is wrong, other CEs in the NGI do not have this
> > problem.
> > 
> > It is true that early this morning the CRLs were not up-to-date on the
> > worker nodes but I corrected it, hoping that it would fix this issue
> > but nope.
> 
> Are you sure the CRLs are OK now?  The error message is consistent with
> them still being bad:
> 
> > [...]
> > Couldn't write to given LFC prod-lfc-shared-central.cern.ch.
> > Reason:
> > send2nsd: NS002 - send error : Bad credentials

I submitted a test job that could not even start:

******  JobID=[https://nanlcg04.in2p3.fr:8443/CREAM538564428]
        Status        = [DONE-FAILED]
        ExitCode      = [W]
        FailureReason = [Cannot move ISB (retry_copy ${globus_transfer_cmd}
[...]): error: globus_ftp_client: the server responded with an error500
500-Command failed. : callback failed.500-an end-of-file was reached500-
globus_xio: The GSI XIO driver failed to establish a secure connection.
The failure occured during a handshake read.500-globus_xio:
An end of file occurred500 End.; pbs_reason=1]

Since there is no complaint about a CRL, I rather suspect the WNs have an
issue with their system time.  Jobs at another site were failing with
that same error when the clock was wrong on the WNs.
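
To check the clock hypothesis quickly, something like the following could
be run on a WN.  It is only a minimal sketch: it sends a single SNTP query
and estimates the difference between the server clock and the local one.
The server name (pool.ntp.org) and the 300-second tolerance are my
assumptions, not values taken from the site or middleware configuration;
your own NTP server would be the better choice.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def parse_transmit_time(packet: bytes) -> float:
    """Return the server transmit timestamp of an NTP packet as Unix time."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    # The transmit timestamp occupies bytes 40-47: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server: str = "pool.ntp.org", timeout: float = 5.0) -> float:
    """Approximate (server time - local time) in seconds from one SNTP query.

    The default server name is an assumption; use the site's NTP server.
    """
    request = b"\x1b" + 47 * b"\0"  # LI=0, VN=3, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(request, (server, 123))
        packet, _ = sock.recvfrom(512)
        received = time.time()
    # Compare the server time with the midpoint of the local round trip.
    return parse_transmit_time(packet) - (sent + received) / 2


def skew_is_acceptable(offset: float, tolerance: float = 300.0) -> bool:
    """True if the clock offset is within the (assumed) tolerance in seconds."""
    return abs(offset) <= tolerance
```

Calling skew_is_acceptable(clock_offset()) on each WN would then show which
nodes, if any, have drifted; a large offset would support the system time
explanation.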