Dear All,
One of our two CREAM CEs (lcgce04) has gone red for job submission and,
very strangely, has 0 jobs.
/var/log/cream/glite-ce-cream.log contains alarming errors:
08 Dec 2013 12:24:56,452 org.glite.voms.PKIVerifier - Faulty certificate: DC=ch,DC=cern,OU=computers,CN=voms.cern.ch
08 Dec 2013 12:24:56,452 org.glite.voms.PKIVerifier - Cannot verify issuer certificate chain for AC
08 Dec 2013 12:24:56,467 org.glite.voms.PKIVerifier - CRL for CA 'DC=ch,DC=cern,CN=CERN Trusted Certification Authority' has expired on Sat Dec 07 16:46:00 GMT 2013.
08 Dec 2013 12:24:56,482 org.glite.voms.PKIVerifier - No temporally valid CRL for CA 'DC=ch,DC=cern,CN=CERN Trusted Certification Authority' was found. Considering the certificate 'DC=ch,DC=cern,OU=computers,CN=voms.cern.ch' revoked.
The CREAM CE host cert is still valid:
root@lcgce04> openssl x509 -text -in /etc/grid-security/hostcert.pem | grep -i not
Not Before: Jan 7 07:09:40 2013 GMT
Not After : Feb 6 07:09:40 2014 GMT
I can't submit test jobs to lcgce04 (the other CE, lcgce03, is fine):
phpwl@lcgui02> for i in `seq 1 2`; do glite-ce-job-submit -o /tmp/ce04 -a -r lcgce04.phy.bris.ac.uk:8443/cream-pbs-express ci-short.jdl; done
2013-12-08 13:13:42,049 FATAL - CN=winnie lacesso,L=IS,OU=Bristol,O=eScience,C=UK not authorized for {http://www.gridsite.org/namespaces/delegation-2}getProxyReq
2013-12-08 13:13:42,193 FATAL - CN=winnie lacesso,L=IS,OU=Bristol,O=eScience,C=UK not authorized for {http://www.gridsite.org/namespaces/delegation-2}getProxyReq
root@lcgce04> ls -lt /etc/grid*/cert* | head
total 2968
drwxr-xr-x 2 root root 57344 Dec 8 12:28 ./
-rw-r--r-- 1 root root 670 Dec 8 12:28 10718cba.r0
-rw-r--r-- 1 root root 27365 Dec 8 12:28 9dd23746.r0
-rw-r--r-- 1 root root 784 Dec 8 12:28 da213f5b.r0
-rw-r--r-- 1 root root 13357 Dec 8 12:28 9ec3a561.r0
That looks pretty current. I've rerun fetch-crl --verbose, which said
(output grepped for CERN only):
fetch-crl[1492]: 20131208T135047+0000 processing '/etc/grid-security/certificates/CERN-GridCA.crl_url'
fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Grid Certification Authority (4339b4bc)'
fetch-crl[1492]: 20131208T135048+0000 processing '/etc/grid-security/certificates/CERN-Root-2.crl_url'
fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Root Certification Authority 2 (b4278411)'
fetch-crl[1492]: 20131208T135048+0000 processing '/etc/grid-security/certificates/CERN-Root.crl_url'
fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Root CA (d254cc30)'
fetch-crl[1492]: 20131208T135048+0000 processing '/etc/grid-security/certificates/CERN-TCA.crl_url'
fetch-crl[1492]: 20131208T135048+0000 updating CRL 'CERN Trusted Certification Authority (1d879c6c)'
but that hasn't helped:
phpwl@lcgui02> glite-ce-allowed-submission lcgce04.phy.bris.ac.uk:8443
2013-12-08 13:52:50,320 ERROR - MethodName=[getServiceInfo] ErrorCode=[0]
Description=[CN=winnie lacesso,L=IS,OU=Bristol,O=eScience,C=UK not authorized
for {http://glite.org/2007/11/ce/cream/types}getServiceInfo]
FaultCause=[Authorization error] Timestamp=[Sun 08 Dec 2013 13:52:50]
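One way to confirm that the refreshed CRL on disk really is valid is to ask openssl for its update times. This is only a sketch: it assumes the CERN Trusted CA CRL is stored as 1d879c6c.r0 (the hash fetch-crl printed above) and that the file is PEM-encoded; crl_next_update is just a helper name made up for this example.

```shell
# Hypothetical helper: print the lastUpdate/nextUpdate fields of a CRL file.
crl_next_update() {
    crl="$1"
    if [ ! -f "$crl" ]; then
        echo "missing: $crl"
        return 1
    fi
    # -noout suppresses the encoded CRL; the two flags print its validity window.
    openssl crl -in "$crl" -noout -lastupdate -nextupdate
}

# Assumed path: the hash 1d879c6c is taken from the fetch-crl output above.
crl_next_update /etc/grid-security/certificates/1d879c6c.r0 || true
```

If nextUpdate is in the future, the file on disk is fine and the problem probably lies elsewhere (for example a service still holding the old CRL, though that is only a guess).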
Might this be an Argus error? I think our other CREAM CE (lcgce03) does
not use Argus; at any rate it is OK. (It's very handy having 2: when 1
goes wonky you know it's that one, not the CREAM CE in general!)
Someone else has managed the Argus server so far; I'm very new to it.
Advice welcome!