On 9 December 2013 15:28, Winnie Lacesso wrote:
> Dear All,
>
> In hopes that someone can help! (not much feedback from lcg-rollout)
> Our site has 3 CEs - 1 SL6 ARC-CE which (Lukas says) uses our argus
> server (he set that up) & is ok; 2 CREAM-CEs which I am not sure use argus
> or not; it looks like from last yaim run that they both do....
>
> Is it easily found in some config file "This CE is using argus" or "This
> CE is not using argus"?
>
> One suddenly started failing 99.999%, in /var/log/cream/glite-ce-cream.log
> are alarming errors:
>
> 08 Dec 2013 12:24:56,452 org.glite.voms.PKIVerifier - Faulty certificate: DC=ch,DC=cern,OU=computers,CN=voms.cern.ch
> 08 Dec 2013 12:24:56,452 org.glite.voms.PKIVerifier - Cannot verify issuer certificate chain for AC
> 08 Dec 2013 12:24:56,467 org.glite.voms.PKIVerifier - CRL for CA 'DC=ch,DC=cern,CN=CERN Trusted Certification Authority' has expired on Sat Dec 07 16:46:00 GMT 2013.
> 08 Dec 2013 12:24:56,482 org.glite.voms.PKIVerifier - No temporally valid CRL for CA 'DC=ch,DC=cern,CN=CERN Trusted Certification Authority' was found. Considering the certificate 'DC=ch,DC=cern,OU=computers,CN=voms.cern.ch' revoked.
>
Right, so the fundamental problem is that the failing CREAM CE no longer
trusts the voms.cern.ch VOMS server, because it can't get up-to-date
CRLs for its signing CA (the CERN CA, in this case).
Does it complain about any other CA certs having invalid CRLs?
Is the time correctly set on the CE (I assume you checked ntp)?
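One way to answer the first question is to pull every CA named in an "expired CRL" message out of the CREAM log. This is only a minimal sketch: it assumes the log lines have exactly the PKIVerifier format quoted above, and the function name expired_crl_cas is made up for illustration.

```shell
# List each CA for which the VOMS PKIVerifier reported an expired CRL,
# together with how many times it was reported.
# Reads the named log files, or stdin if none are given.
expired_crl_cas() {
    sed -n "s/.*PKIVerifier - CRL for CA '\(.*\)' has expired on .*/\1/p" "$@" \
        | sort | uniq -c
}
```

Something like `expired_crl_cas /var/log/cream/glite-ce-cream.log` should then show whether only the CERN Trusted CA is affected or several CAs are.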
> The host cream-ce cert is still good
> root@lcgce04> openssl x509 -text -in /etc/grid-security/hostcert.pem | grep -i not
> Not Before: Jan 7 07:09:40 2013 GMT
> Not After : Feb 6 07:09:40 2014 GMT
>
> I can't submit test jobs (the other lcgce03 is fine)
> phpwl@lcgui02> for i in `seq 1 2`; do glite-ce-job-submit -o /tmp/ce04 -a -r lcgce04.phy.bris.ac.uk:8443/cream-pbs-express ci-short.jdl; done
> 2013-12-08 13:13:42,049 FATAL - CN=winnie lacesso,L=IS,OU=Bristol,O=eScience,C=UK not authorized for {http://www.gridsite.org/namespaces/delegation-2}getProxyReq
> 2013-12-08 13:13:42,193 FATAL - CN=winnie lacesso,L=IS,OU=Bristol,O=eScience,C=UK not authorized for {http://www.gridsite.org/namespaces/delegation-2}getProxyReq
>
> root@lcgce04> ls -lt /etc/grid*/cert* | head
> -rw-r--r-- 1 root root 27365 Dec 9 12:27 9dd23746.r0
> -rw-r--r-- 1 root root 784 Dec 9 12:27 da213f5b.r0
> -rw-r--r-- 1 root root 670 Dec 9 12:27 10718cba.r0
> -rw-r--r-- 1 root root 1003 Dec 9 12:27 fc1898ec.r0
> -rw-r--r-- 1 root root 13357 Dec 9 12:27 9ec3a561.r0
>
> That looks pretty current. On 8 Dec I reran fetch-crl --verbose which said
> (grep just for CERN in output)
>
> fetch-crl[1492]: 20131208T135047+0000 processing '/etc/grid-security/certificates/CERN-GridCA.crl_url'
> fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Grid Certification Authority (4339b4bc)'
> fetch-crl[1492]: 20131208T135048+0000 processing '/etc/grid-security/certificates/CERN-Root-2.crl_url'
> fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Root Certification Authority 2 (b4278411)'
> fetch-crl[1492]: 20131208T135048+0000 processing '/etc/grid-security/certificates/CERN-Root.crl_url'
> fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Root CA (d254cc30)'
> fetch-crl[1492]: 20131208T135048+0000 processing '/etc/grid-security/certificates/CERN-TCA.crl_url'
> fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Trusted Certification Authority (1d879c6c)'
>
See, this is the vexing thing: that should have fixed a pure CRL
problem with those certs, since it apparently updated them...
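To rule out a stale file on disk, it may be worth checking the nextUpdate field of the CRL that fetch-crl claims to have refreshed. A rough sketch follows; it assumes GNU date (for `date -d`), and the helper name crl_status is hypothetical.

```shell
# Read the output of `openssl crl -noout -nextupdate` on stdin and report
# whether the CRL is still within its validity period.
# Returns 0 if valid, 1 if expired, 2 if the input could not be parsed.
crl_status() {
    next=$(sed -n 's/^nextUpdate=//p')
    [ -n "$next" ] || { echo "no nextUpdate found"; return 2; }
    next_epoch=$(date -u -d "$next" +%s) || return 2
    now_epoch=$(date -u +%s)
    if [ "$next_epoch" -gt "$now_epoch" ]; then
        echo "valid until $next"
    else
        echo "EXPIRED at $next"
        return 1
    fi
}
```

Usage would be along the lines of `openssl crl -in /etc/grid-security/certificates/1d879c6c.r0 -noout -nextupdate | crl_status`. The file name here is a guess built from the hash in the fetch-crl output; the actual .r0 file for the CERN Trusted CA may be named differently on your CE.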
> but that hasn't helped.
>
> Might this be an argus error? I think our other cream-ce (lcgce03) does
> not use argus; at any rate it is ok. (It's very handy having 2 - when 1
> goes wonky you know it's that one, not the cream-ce in general!)
> Someone else knows/manages the argus server so far, I'm very new to it.
>
> I say 99.999% failing because the cream logfile shows that 5 jobs did
> successfully arrive (all vo.southgrid.ac.uk) & ran yesterday & 10 today so
> far (a mix of vo.southgrid.ac.uk & ilc). So some jobs do make it thru!
Sure, I am betting that none of those VOs are on the voms.cern.ch VOMS
server, so they're fine!
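If you want to check that bet, the per-VO .lsc files under the vomsdir show which VOMS hosts each VO trusts. A small sketch, assuming the usual /etc/grid-security/vomsdir/<vo>/<voms-host>.lsc layout is in use on the CE; the function name vos_using_voms_host is invented for this example.

```shell
# Print the name of every VO whose vomsdir entry contains an .lsc file
# for the given VOMS host.
# $1 = VOMS host name, $2 = vomsdir (optional, defaults to the usual path)
vos_using_voms_host() {
    host=$1
    dir=${2:-/etc/grid-security/vomsdir}
    for lsc in "$dir"/*/"$host".lsc; do
        [ -e "$lsc" ] || continue      # glob matched nothing
        basename "$(dirname "$lsc")"
    done
}
```

Running `vos_using_voms_host voms.cern.ch` should list the VOs affected by the CERN CRL problem; if vo.southgrid.ac.uk and ilc are not in that list, that would explain why their jobs still get through.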
Sam
> Curiouser & curiouser.
>
> Any advice most welcome!