Hello
Sheffield has been blacklisted by LHCb for production. I see that all of
you are green for LHCb in the new grid map.
I have attached the plot that was sent to us by someone from LHCb. The
problem occurs at the final stage, when the job output is copied from the
worker node to RAL.
As we are not failing the LHCb SAM tests and a small fraction of the jobs
finished successfully, I don't think it is a site configuration problem.
The error message from the pilot:
2009-06-20 11:11:42 UTC dirac-jobexec.py INFO: SRM2Storage.__putFile: Executing transfer of file:/home/prdlhb90/globus-tmp.wn074.487.0/https_3a_2f_2fwms203.cern.ch_3a9000_2fKN3KVWH941S4LscI-crZ-g/2858510/00004837_00279010_3.dst to srm://srm-lhcb.gridpp.rl.ac.uk:8443/srm/managerv2?SFN=/castor/ads.rl.ac.uk/prod/lhcb/MC/MC09/DST/00004837/0027/00004837_00279010_3.dst
2009-06-20 11:12:02 UTC dirac-jobexec.py ERROR: SRM2Storage.__putFile: Failed to put file to storage. globus_xio: System error in writev: Connection reset by peer
2009-06-20 11:12:02 UTC dirac-jobexec.py ERROR: globus_xio: A system call failed: Connection reset by peer
2009-06-20 11:12:02 UTC dirac-jobexec.py ERROR: ReplicaManager.putAndRegister: Failed to put file to Storage Element. /home/prdlhb90/globus-tmp.wn074.487.0/https_3a_2f_2fwms203.cern.ch_3a9000_2fKN3KVWH941S4LscI-crZ-g/2858510/00004837_00279010_3.dst: SRM2Storage.__putFile: Failed to put file to storage.
2009-06-20 11:12:02 UTC dirac-jobexec.py/UploadOutputData VERB: {'Message': 'ReplicaManager.putAndRegister: Failed to put file to Storage Element. SRM2Storage.__putFile: Failed to put file to storage.', 'OK': False}
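For going through a longer pilot log, a minimal Python sketch along the
following lines could pull out just the ERROR records. It is not part of
DIRAC: the function name error_records is hypothetical, and it assumes that
every record starts with a "YYYY-MM-DD HH:MM:SS UTC <component> <LEVEL>:"
header, as in the excerpt above. Lines without such a header are treated as
continuations of the previous record.

```python
import re

# Assumed record header, e.g. "2009-06-20 11:12:02 UTC dirac-jobexec.py ERROR:"
HEADER = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC (\S+) ([A-Z]+):\s*(.*)$"
)


def error_records(lines):
    """Return (timestamp, component, message) for each ERROR record."""
    records = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # A new record starts here.
            timestamp, component, level, text = match.groups()
            current = [timestamp, component, level, text]
            records.append(current)
        elif current is not None:
            # Continuation line: append it to the current record. Joining
            # with a space is a simplification; a path that was wrapped
            # mid-token will end up with a spurious space in it.
            current[3] = (current[3] + " " + line.strip()).strip()
    return [(t, c, msg) for t, c, level, msg in records if level == "ERROR"]
```

Calling error_records(open("pilot.log")) on a saved log (the file name is
only an example) returns one tuple per ERROR record, which makes it easier
to see whether all failures are the same "Connection reset by peer".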
Our network is not overloaded.
Any ideas about what could be wrong would be greatly appreciated.
Cheers
Elena
____________________________________________________________________________
Dr Elena Korolkova
Tel.: +44 (0)114 2223553
Fax: +44 (0)114 2223555
Department of Physics and Astronomy
University of Sheffield
Sheffield, S3 7RH, United Kingdom