Hi,
Following up on the cvmfs discussion from the meeting, here are some of the
errors we saw when our squid went offline.
LHCb saw errors propagated up to their software, a snippet of which is in
https://ggus.eu/ws/ticket_info.php?ticket=73590, but the key one is this:
ERROR: OSError: [Errno 11] Resource temporarily unavailable: '/cvmfs/lhcb.cern.ch/lib/lhcb/GAUSS'
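For reference, errno 11 on Linux is EAGAIN, which matches the "Resource temporarily unavailable" text. A quick way to confirm the mapping on a worker node, assuming Python is available there (the describe() helper is just an illustrative name, not anything from cvmfs or the LHCb software):

```python
import errno
import os


def describe(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# 11 is the value reported in the LHCb error; the mapping is platform-specific.
print(describe(11))
```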
I saw similar errors simply running 'ls' on the files at the command line,
and the errors were returned essentially instantly, not after a timeout.
However, the failures weren't consistent - an identical access a few
minutes later might succeed, and then fail again a few minutes after that.
While all this was going on, messages like the following were appearing in
the system logs. I think the key bit is probably:
cvmfs2: unable to load checksum from /.cvmfspublished (7), going to offline mode
but I've included a fuller section here:
messages.1.gz:Aug 17 11:39:30 t2wn50 cvmfs2: Checksum does not match for /lib/install_project.py (SHA1: dd8d245be80906e2a2e2270fb43e1ca5631b0aa6). I'll retry download with no-cache
messages.1.gz:Aug 17 11:39:30 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/data/dd/8d245be80906e2a2e2270fb43e1ca5631b0aa6
messages.1.gz:Aug 17 11:39:30 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/data/dd/8d245be80906e2a2e2270fb43e1ca5631b0aa6
messages.1.gz:Aug 17 11:39:30 t2wn50 cvmfs2: failed to fetch /lib/install_project.py (SHA1: dd8d245be80906e2a2e2270fb43e1ca5631b0aa6)
messages.1.gz:Aug 17 11:39:30 t2wn50 cvmfs2: failed to open /lib/install_project.py, CAS key dd8d245be80906e2a2e2270fb43e1ca5631b0aa6, error code 115
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: unable to load catalog from /data/d4/91213099a78364431f09b64516b5e524ebbf59C, going to offline mode
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: possible data corruption while trying to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/lib/lhcb/GAUSS, trying with no-cache
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: unable to load catalog from /data/d4/91213099a78364431f09b64516b5e524ebbf59C, going to offline mode
messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: data corruption while trying to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/lib/lhcb/GAUSS
messages.1.gz:Aug 17 11:43:15 t2wn50 cvmfs2: CernVM-FS: unmounted /cvmfs/lhcb.cern.ch (http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb)
messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: unable to load checksum from /.cvmfspublished (7), going to offline mode
messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: catalog load failure while try to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: CernVM-FS: linking /cvmfs/lhcb.cern.ch to remote directoy http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
messages.1.gz:Aug 17 19:59:31 t2wn50 cvmfs2: CernVM-FS: unmounted /cvmfs/lhcb.cern.ch (http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb)
messages.1.gz:Aug 17 20:10:16 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 17 20:10:17 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 17 20:10:17 t2wn50 cvmfs2: unable to load checksum from /.cvmfspublished (7), going to offline mode
messages.1.gz:Aug 17 20:10:17 t2wn50 cvmfs2: catalog load failure while try to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
messages.1.gz:Aug 17 20:10:17 t2wn50 cvmfs2: CernVM-FS: linking /cvmfs/lhcb.cern.ch to remote directoy http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
messages.1.gz:Aug 17 20:15:46 t2wn50 cvmfs2: CernVM-FS: unmounted /cvmfs/lhcb.cern.ch (http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb)
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: unable to load checksum from /.cvmfspublished (7), going to offline mode
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: catalog load failure while try to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: CernVM-FS: linking /cvmfs/lhcb.cern.ch to remote directoy http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: unable to load catalog from /data/d4/91213099a78364431f09b64516b5e524ebbf59C, going to offline mode
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: possible data corruption while trying to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/lib/lhcb/GAUSS, trying with no-cache
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/data/d4/91213099a78364431f09b64516b5e524ebbf59C
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: unable to load catalog from /data/d4/91213099a78364431f09b64516b5e524ebbf59C, going to offline mode
messages.1.gz:Aug 17 21:42:23 t2wn50 cvmfs2: data corruption while trying to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/lib/lhcb/GAUSS
messages.1.gz:Aug 17 21:48:16 t2wn50 cvmfs2: CernVM-FS: unmounted /cvmfs/lhcb.cern.ch (http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb)
messages.1.gz:Aug 18 14:35:59 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 18 14:35:59 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/.cvmfspublished
messages.1.gz:Aug 18 14:35:59 t2wn50 cvmfs2: unable to load checksum from /.cvmfspublished (7), going to offline mode
messages.1.gz:Aug 18 14:35:59 t2wn50 cvmfs2: catalog load failure while try to retrieve catalog from http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
messages.1.gz:Aug 18 14:35:59 t2wn50 cvmfs2: CernVM-FS: linking /cvmfs/lhcb.cern.ch to remote directoy http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb
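In case anyone wants to check their own nodes for the same pattern, here is a rough sketch of how lines like these could be tallied by message type. The category labels and the summarize() helper are my own invention for illustration, and the regular expression just assumes the syslog layout shown above (optionally prefixed with a file name, as zgrep prints it):

```python
import re
from collections import Counter

# Matches e.g. "messages.1.gz:Aug 17 11:39:47 t2wn50 cvmfs2: <message>"
# The leading "file:" prefix is optional.
LINE_RE = re.compile(
    r"^(?:\S+:)?(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"
    r"(?P<host>\S+)\s+cvmfs2:\s+(?P<msg>.*)$"
)

# Illustrative labels mapped to substrings seen in the log excerpt.
# Order matters: the first matching substring wins.
PATTERNS = {
    "offline": "going to offline mode",
    "retry": "switch proxy / retry",
    "unmounted": "unmounted",
    "corruption": "data corruption",
    "fetch_failed": "failed to",
}


def summarize(lines):
    """Count cvmfs2 messages per category; unmatched messages go to 'other'."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a cvmfs2 syslog line
        msg = m.group("msg")
        for label, needle in PATTERNS.items():
            if needle in msg:
                counts[label] += 1
                break
        else:
            counts["other"] += 1
    return counts


sample = [
    "messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: switch proxy / retry on http://cernvmfs.gridpp.rl.ac.uk/opt/lhcb/.cvmfspublished",
    "messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: switch proxy / retry on http://cvmfs-stratum-one.cern.ch/opt/lhcb/.cvmfspublished",
    "messages.1.gz:Aug 17 19:54:30 t2wn50 cvmfs2: unable to load checksum from /.cvmfspublished (7), going to offline mode",
]
print(summarize(sample))
```

Feeding it the output of a zgrep for cvmfs2 over the rotated messages files should give a quick per-node count of how often the client went into offline mode.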
Ewan