Hi,
For the last week or two we have had intermittent problems with (ATLAS)
jobs failing with what seem to be LFC connection problems.
Some logfile extracts follow:
"""
Failed to get LFC replicas: -1 (lfc_getreplicas failed with: 2704, Bad
magic number)
"""
and
"""
13 Oct 07:27:03| lcgcpSiteMov| !!WARNING!!2990!! LFC setup and mkdir
failed. Status=256 Output=LFC_HOST=atlas-lfc-fzk.gridka.de
send2nsd: NS002 - send error : _Csec_recv_token: Received magic: 30e1301
expecting ca03
send2nsd: NS002 - send error : _Csec_recv_token: Received magic: 30e1301
expecting ca03
"""
We've tested locally and so far I cannot recreate the problem.
I've done lfc-ls and lfc-mkdir, and I've run lfc-ls more or less
simultaneously on all of our nodes without seeing the problem.
I have just set CSEC_TRACE=1 on a bunch of our nodes to see if we can
catch the problem and get more info.
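To see how often the error shows up once logs have been collected, something
like the following could be used. It is only a minimal sketch: it assumes the
job logs or trace output are plain-text files, and the function name
count_magic_errors is made up for illustration (it is not part of any LFC
tool). It only looks for the "Received magic: ... expecting ..." message
quoted above.

```python
import re
import sys
from collections import Counter

# Matches the Csec token error quoted in the log extracts,
# e.g. "Received magic: 30e1301 expecting ca03".
MAGIC_RE = re.compile(
    r"Received magic:\s*([0-9a-fA-F]+)\s+expecting\s+([0-9a-fA-F]+)"
)


def count_magic_errors(text):
    """Count (received, expected) magic pairs found in a log text.

    Hypothetical helper: whitespace is collapsed first because the
    message may be wrapped across lines, as in the extracts above.
    """
    flat = " ".join(text.split())
    return Counter(MAGIC_RE.findall(flat))


if __name__ == "__main__":
    # Usage (assumed): python count_magic.py logfile1 logfile2 ...
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            counts = count_magic_errors(handle.read())
        for (received, expected), n in counts.items():
            print(f"{path}: received {received}, expected {expected}: {n}")
```

Running it over the logs from each node would show whether the failures
cluster on particular nodes or always report the same received value.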
Google turned up some logfiles where the same problem popped up, but
nothing resembling a fix.
Has anyone seen this or does anyone have a hint for us?
Cheers,
John
--
+------------------------------------------------------------+
|Dr. John Alan Kennedy Rechenzentrum Garching (RZG) |
|Mail: [log in to unmask] Boltzmannstrasse 2 |
|Phone: +49 89 3299 2694 85748 Garching |
|Fax: +49 89 3299 1301 |
+------------------------------------------------------------+