Heya guys, and Happy 2009 to all,
We're failing the SRM SAM tests at around 06:13 and 18:13 every day with
the error message pasted below. Failures this regular set off the
obvious alarm bells, and I immediately checked the cron jobs. Both
edg-mkgridmap and our MySQL backup run at the times of these failures,
but as these are six-hourly cron jobs I would also expect them to
interfere with the midnight and midday tests. Also, the error message
doesn't quite fit what I'd expect (the last time we saw a similar error
message it was caused by network problems between the worker nodes/CE
and the SE).
I'd appreciate any wisdom on this matter.
cheers,
Matt
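
The cron reasoning in the message can be sanity-checked with a small sketch. Everything beyond the two failure times is an assumption made for illustration: that the six-hourly jobs fire on the hour at 00, 06, 12 and 18, that the passing tests run at 00:13 and 12:13, and that a 30-minute window counts as "at the same time". The helper names are hypothetical.

```python
# Sketch only: cron times, passing-test times and the window are assumed,
# not taken from the site's actual crontab or SAM schedule.
CRON_TIMES = ["00:00", "06:00", "12:00", "18:00"]   # assumed six-hourly runs
TESTS = {
    "00:13": "ok",       # assumed time of the midnight test
    "06:13": "failed",   # from the message
    "12:13": "ok",       # assumed time of the midday test
    "18:13": "failed",   # from the message
}

def minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)

def overlaps_cron(test_time, cron_times, window_min=30):
    """True if test_time is within window_min minutes of any cron run."""
    day = 24 * 60
    t = minutes(test_time)
    for cron_time in cron_times:
        diff = abs(t - minutes(cron_time))
        # Take the shorter way round the clock so 23:50 is close to 00:00.
        if min(diff, day - diff) <= window_min:
            return True
    return False

for test_time, outcome in TESTS.items():
    print(test_time, outcome, overlaps_cron(test_time, CRON_TIMES))
```

Under these assumptions every test overlaps a cron run, passing or failing, which is why the cron jobs alone do not explain failures at only two of the four slots.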
+ lcg-cr --version
lcg_util-1.6.15
GFAL-client-1.10.17
+ set +x
+ lcg-cr -t 120 -v --vo ops file:/home/samops/.same/SE/testFile.txt -l lfn:SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507 -d fal-pygrid-30.lancs.ac.uk
Using grid catalog type: lfc
Using grid catalog : prod-lfc-shared-central.cern.ch
Using LFN : /grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
[BDII] sam-bdii.cern.ch:2170: Warning, no GlueVOInfo information found about tag '(null)' and SE 'fal-pygrid-30.lancs.ac.uk'
SE type: SRMv1
Using SURL : srm://fal-pygrid-30.lancs.ac.uk/dpm/lancs.ac.uk/home/ops/generated/2009-01-09/file33df0c61-861c-4f81-9efa-3c6999a6d6d1
Alias registered in Catalog: lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
Alias registered in Catalog: lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
Alias registered in Catalog: lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
Alias registered in Catalog: lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
Alias registered in Catalog: lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
[SE][put] httpg://fal-pygrid-30.lancs.ac.uk:8443/srm/managerv1: CGSI-gSOAP: Error reading token data header: Connection closed
lcg_cr: Operation now in progress
+ result=1
+ set +x
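
One way to pin down exactly when a given test ran is to decode the numeric suffix of the LFN in the trace above. This is a sketch that assumes the trailing ten digits are a Unix timestamp in seconds; nothing in the output states this outright, although the decoded date agrees with the 2009-01-09 directory in the SURL. The function name is hypothetical.

```python
import re
from datetime import datetime, timezone

# LFN copied from the lcg-cr command line in the trace.
LFN = "lfn:SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507"

def lfn_timestamp(lfn):
    """Return the trailing 10-digit suffix as a UTC datetime, or None.

    Assumption: the suffix is seconds since the Unix epoch.
    """
    match = re.search(r"-(\d{10})$", lfn)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)

print(lfn_timestamp(LFN).isoformat())  # 2009-01-09T06:11:47+00:00
```

If the assumption holds, this run started at 06:11:47 UTC, which lines up with the ~06:13 failure slot and gives a precise time to compare against the cron and SE logs.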