Christopher J. Walker wrote:
> QMUL seems to have problems with the CE-sft-lcg-rm-cr test
>
> The offending piece of the output seems to be:
>
> [SE][Mkdir] httpg://se03.esc.qmul.ac.uk:8444/srm/managerv2: CGSI-gSOAP:
> Error reading token data header: Connection closed
>
> We seem to have significantly more failures with ce01 than with ce03 -
> but both should send jobs to the same set of worker nodes.
>
> I've tried running the crl update by hand a few times - and still see
> the problems - and have now set it to update every hour.
>
> What I don't understand is why there should be more failures on ce01
> than ce03 - they should both be using the same set of worker nodes.
>
> Any ideas?
>

And I presume this is what is causing our job failures in hammercloud:

- pilotlog.txt -
18 Aug 2009 14:56:16| !!WARNING!!2990!! Command failed: export X509_USER_PROXY=/tmp/globus-tmp.cn466.8212.0; which lcg-cr; lcg-cr --version; lcg-cr --verbose --vo atlas -T srmv2 -s ATLASSCRATCHDISK -b -l /grid/atlas/users/pathena/user09.JohannesElmsheuser/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a_sub02956913/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a.AANT._00303.root -g 54c4d213-f40d-4c49-a2d4-ded159ac4b72 -d srm://se03.esc.qmul.ac.uk:8444/srm/managerv2?SFN=/atlas/atlasscratchdisk/user09.JohannesElmsheuser/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a_sub02956913/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a.AANT._00303.root file:/data/scratch/tmp/condorg_bAYM8327/pilot3/Panda_Pilot_8350_1250605330/PandaJob_1019068164_1250605330/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a.AANT._00303.root
18 Aug 2009 14:56:16| !!WARNING!!5000!! Abnormal termination: ecode=256, ec=1, sig=-, len(etext)=1478
18 Aug 2009 14:56:16| !!WARNING!!5000!! Error message: /opt/grid/glite/3.1.19/lcg/bin/lcg-cr
lcg_util-1.6.15
GFAL-client-1.10.17
Using grid catalog type: lfc
Using grid catalog : lfc-atlas.gridpp.rl.ac.uk
Using LFN : /grid/atlas/users/pathena/user09.JohannesElmsheuser/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a_sub02956913/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a.AANT._00303.root
SE type: SRMv2
Using SURL : srm://se03.esc.qmul.ac.uk:8444/srm/managerv2?SFN=/atlas/atlasscratchdisk/user09.JohannesElmsheuser/user09.JohannesElmsheuser.ganga.sitetest.ANALY_QMUL.1250602342.702660fc-7717-4699-9f46-2e8a7ba9ee1a_sub02956913/user09.JohannesElmshe

- Walltime -
jobRetrival=1, StageIn=84, Execution=1867, StageOut=63, CleanUp=4

UHURA (37t)

Chris