Hi Matt
It's only the comp**.private.dns.zone WNs that are affected.
There is no problem with the wn*** WNs.
Perhaps you can compare settings for these two clusters.
http://panda.cern.ch/server/pandamon/query?jobsummary=site&site=UKI-NORTHGRID-LANCS-HEP
Also the problem is seen for prod jobs only.
Perhaps it's a problem with the user DN on one of the CEs?
Cheers,
Elena
On 30 Aug 2012, at 17:08, Matt Doidge wrote:
> Hello,
> First up sorry for the cross post, my apologies to all who end up getting this message twice. Desperate times create desperate admin.
>
> Lancaster's having a bunch of ATLAS production jobs failing with the unhelpful error message:
> Get error: Failed to get LFC replicas: -1 (lfc_getreplicas failed with: 2702, Bad credentials)
>
> By a bunch I mean 95% of all jobs that run on one of our clusters (the infamous HEC, abaddon.hec.lancs.ac.uk). Our other cluster is working fine.
>
> A link to one of the failures:
> http://panda.cern.ch/server/pandamon/query?job=1589012289
>
> Various sources from Google suggested a few fixes, such as checking for clock skew (there wasn't any), checking the CA certificates on the workers (they seem okay, I redistributed them just in case) and checking the load on the NAT (which is fine, nothing odd going on there that I can see, in fact as we're in test mode things are very quiet). Other cases of this error message suggested problems at the LFC end, but as things are working for our other cluster and everyone else I don't think this is the case.
>
> Has anyone been plagued by these or similar errors?
>
> Thanks in advance all,
> Matt
__________________________________________________
Dr Elena Korolkova
Tel.: +44 (0)114 2223553
Fax: +44 (0)114 2223555
Department of Physics and Astronomy
University of Sheffield
Sheffield, S3 7RH, United Kingdom