Hello!
A few days ago we at Lancaster reinstalled the shared cluster, and part
of that process involved moving to a new queue name.
And this little change has caused a lot of woe. Despite my being able to
submit jobs directly to the CE without problems, we're failing Ops jobs
("no compatible resources") and ATLAS jobs don't seem to be making their
way to us.
Alessandra wisely suggested that this is a top BDII caching problem,
which looks right. An lcg-infosites[1] against UKI-NORTHGRID-LANCS-HEP
shows the new queue ("grid") as well as the old queue (with the obscure
historical name "hex", which was one of the reasons we wanted to change
it). The new "grid" queue has free slots published, the dead hex queue
doesn't.
AFAICS our own bdii[2] is free of references to the old queue. I'd like
to get on top of this today rather than just wait and see if the problem
goes away. With it being Friday, I'd rather not lose out on a weekend of
jobs. Any suggestions welcome - I'm open to the possibility that something
might still be broken with our CE or its publishing, but I have no idea
what!
Thanks in advance,
Matt
[1] I know lcg-infosites is kinda defunct, but it's so much easier than
figuring out the correct ldap query!
[2] fal-pygrid-14.lancs.ac.uk
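
For anyone who wants to repeat the top BDII versus site BDII comparison from raw LDAP output instead of lcg-infosites, here is a minimal sketch. It assumes both BDIIs publish the GLUE 1.3 attribute GlueCEName for each queue, and that the LDIF was saved from something along the lines of `ldapsearch -x -LLL -H ldap://<bdii-host>:2170 -b o=grid '(objectClass=GlueCE)' GlueCEName` (the host, port and base are assumptions to adapt). The function names, the CE hostname ce.example.org and the sample LDIF are made up for illustration; they are not real output from our site.

```python
def queue_names(ldif_text):
    """Collect GlueCEName values from LDIF text.

    Assumes plain (not base64) values, i.e. lines of the form 'GlueCEName: grid'.
    """
    names = set()
    for line in ldif_text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon and base64 values ('attr:: ...').
        if not sep or value.startswith(":"):
            continue
        if key.strip().lower() == "gluecename":
            names.add(value.strip())
    return names


def stale_queues(top_ldif, site_ldif):
    """Queues the top BDII still lists but the site BDII no longer publishes."""
    return queue_names(top_ldif) - queue_names(site_ldif)


# Illustrative sample data only (hypothetical CE host, hand-written LDIF).
TOP_SAMPLE = """\
dn: GlueCEUniqueID=ce.example.org:8443/cream-sge-grid,Mds-Vo-name=UKI-NORTHGRID-LANCS-HEP,Mds-Vo-name=local,o=grid
GlueCEName: grid

dn: GlueCEUniqueID=ce.example.org:8443/cream-sge-hex,Mds-Vo-name=UKI-NORTHGRID-LANCS-HEP,Mds-Vo-name=local,o=grid
GlueCEName: hex
"""

SITE_SAMPLE = """\
dn: GlueCEUniqueID=ce.example.org:8443/cream-sge-grid,Mds-Vo-name=UKI-NORTHGRID-LANCS-HEP,o=grid
GlueCEName: grid
"""

if __name__ == "__main__":
    print(sorted(stale_queues(TOP_SAMPLE, SITE_SAMPLE)))
```

If the set that comes back is non-empty (here it would contain only the old "hex" queue), the top BDII is holding entries that the site BDII has already dropped, which would be consistent with the caching explanation.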