Hi Alessandra
thanks for your reply. We have the latest version of gLite on the WNs. We
still need to upgrade DPM, but this is not related to the LHCb problem as
they do not use our storage.
I don't know what is wrong.
If you could tell me what you think could be obsolete, please let me
know, and also which versions you are using at your sites that are fine
for LHCb.
I can't easily check how many LHCb jobs were successful, nor can I
check job outputs as I do for ATLAS. This makes the task of finding the
error more complicated.
Cheers
Elena
On Tue, 23 Jun 2009, Alessandra Forti wrote:
> Hi Elena,
>
> Raja also reported the problem at the dteam meeting and forwarded me an email
> Vladimir wrote to you. I still have to look into it. Off the top of my head,
> since you are the only site that fails, it might be a software version
> problem at the site.
>
> cheers
> alessandra
>
> Elena Korolkova wrote:
>>
>> Hello
>>
>> Sheffield was blacklisted by LHCb for production. I saw that all of you
>> are green for LHCb in the new grid map.
>>
>> I have attached the plot that was sent to us by someone at LHCb. The
>> problem occurs at the final stage, when the job output should be copied
>> from the worker node to RAL.
>>
>> As we are not failing the LHCb SAM tests and a small fraction of jobs
>> finished successfully, I don't think it's a site configuration problem.
>>
>> The error message from pilot:
>>
>> 2009-06-20 11:11:42 UTC dirac-jobexec.py INFO: SRM2Storage.__putFile:
>> Executing transfer of
>> file:/home/prdlhb90/globus-tmp.wn074.487.0/https_3a_2f_2fwms203.cern.ch_
>> 3a9000_2fKN3KVWH941S4LscI-crZ-g/2858510/00004837_00279010_3.dst to
>> srm://srm-lhcb.gridpp.rl.ac.uk:8443/srm/managerv2?SFN=/castor/ads.rl.ac.
>> uk/prod/lhcb/MC/MC09/DST/00004837/0027/00004837_00279010_3.dst
>> 2009-06-20 11:12:02 UTC dirac-jobexec.py ERROR: SRM2Storage.__putFile:
>> Failed to put file to storage. globus_xio: System error in writev:
>> Connection reset by peer
>> 2009-06-20 11:12:02 UTC dirac-jobexec.py ERROR: globus_xio: A system
>> call failed: Connection reset by peer
>> 2009-06-20 11:12:02 UTC dirac-jobexec.py ERROR:
>> ReplicaManager.putAndRegister: Failed to put file to Storage Element.
>> /home/prdlhb90/globus-tmp.wn074.487.0/https_3a_2f_2fwms203.cern.ch_3a900
>> 0_2fKN3KVWH941S4LscI-crZ-g/2858510/00004837_00279010_3.dst:
>> SRM2Storage.__putFile: Failed to put file to storage.
>> 2009-06-20 11:12:02 UTC dirac-jobexec.py/UploadOutputData VERB:
>> {'Message': 'ReplicaManager.putAndRegister: Failed to put file to
>> Storage Element. SRM2Storage.__putFile: Failed to put file to
>> storage.', 'OK': False}
>>
>> Our network is not overloaded.
>>
>> Any ideas about what could be wrong would be greatly appreciated.
>>
>> Cheers
>> Elena
>>
>> ____________________________________________________________________________
>> Dr Elena Korolkova
>> Email: [log in to unmask]
>> Tel.: +44 (0)114 2223553
>> Fax: +44 (0)114 2223555
>> Department of Physics and Astronomy
>> University of Sheffield
>> Sheffield, S3 7RH, United Kingdom
>>
>> ------------------------------------------------------------------------
>>
>
____________________________________________________________________________
Dr Elena Korolkova
Email: [log in to unmask]
Tel.: +44 (0)114 2223553
Fax: +44 (0)114 2223555
Department of Physics and Astronomy
University of Sheffield
Sheffield, S3 7RH, United Kingdom