Greig Alan Cowan wrote:
> Hi all,
>
> I need to investigate this further. Not all dCache sites are failing the
> POSIX test at the moment, so it's not clear where the problem currently
> lies.
>
Thanks Greig. It seems restarting the gsidcap door
fixed the problem.
More info is on the storage mailing list.
Thanks,
Mona
> On 15/01/08 13:43, David Colling wrote:
>> Hi Mona,
>>
>> Are you saying that this is a problem that is causing failures at all
>> dcache sites? If so then this is quite a problem. Can anybody on the
>> TB-support list offer any advice?
>>
>> All the best,
>> david
>>
>> Mona Aggarwal wrote:
>>> Hi Stuart,
>>>
>>> I have informed the storage group about the
>>> problem, and it seems all dCache sites are
>>> seeing it. It could be related to a
>>> certificate upgrade or something similar.
>>>
>>> Cheers,
>>> Mona
>>>
>>> Stuart Wakefield wrote:
>>>> This is quite annoying. Until we fix it, could we open standard
>>>> non-authenticated dcap? (I agree this is non-ideal, but it's either this
>>>> or our site is broken as far as CMS is concerned, and we have
>>>> production to run.)
>>>>
>>>> Cheers
>>>> Stuart
>>>>
>>>> On Jan 14, 2008 2:00 PM, Mona Aggarwal <[log in to unmask]>
>>>> wrote:
>>>>> Hi Stuart,
>>>>>
>>>>> I am aware of the problem, and looking
>>>>> into it.
>>>>>
>>>>> Last time we had a similar problem, it
>>>>> was due to the certificate format and was
>>>>> fixed after upgrading to the new dCache
>>>>> release.
>>>>>
>>>>> Cheers,
>>>>> Mona
>>>>>
>>>>>
>>>>> Stuart Wakefield wrote:
>>>>>> Hi
>>>>>>
>>>>>> My jobs are still dying with this. Can someone check? (Also Matt's
>>>>>> problem.)
>>>>>>
>>>>>> dccp gsidcap://gfe02.hep.ph.ic.ac.uk:22128/pnfs/hep.ph.ic.ac.uk/data/cms/store/unmerged/test/2007/11/15/CSA07-ProdMgrTestLCG6_EWK_Zmumu_2-3582/GEN-SIM/0000/469D5F7E-5EC0-DC11-AFAB-003048898D90.root .
>>>>>> Dcap Version version-1-2-41 Oct 16 2006 16:09:04
>>>>>> Allocated message queues 0, used 0
>>>>>>
>>>>>> Allocated message queues 1, used 1
>>>>>>
>>>>>> Creating a new control connection to gfe02.hep.ph.ic.ac.uk:22128.
>>>>>> Activating IO tunnel. Provider: [libgsiTunnel.so].
>>>>>> Added IO tunneling plugin libgsiTunnel.so for
>>>>>> gfe02.hep.ph.ic.ac.uk:22128.
>>>>>> Sending control message: 0 0 client hello 0 0 2 41 -uid=30078
>>>>>> -pid=10885 -gid=6747
>>>>>> Error ( POLLIN) (with data) on control line [3]
>>>>>> Removing [3] form control lines list
>>>>>> Failed to connect to gfe02.hep.ph.ic.ac.uk:22128
>>>>>> Failed to create a control line
>>>>>> [-1] unpluging node
>>>>>> Removing unneeded queue [1]
>>>>>> [-1] destroing node
>>>>>> Using system native stat64 for ..
>>>>>> Allocated message queues 2, used 1
>>>>>>
>>>>>> Allocated message queues 2, used 2
>>>>>>
>>>>>> Creating a new control connection to gfe02.hep.ph.ic.ac.uk:22128.
>>>>>> Activating IO tunnel. Provider: [libgsiTunnel.so].
>>>>>> Added IO tunneling plugin libgsiTunnel.so for
>>>>>> gfe02.hep.ph.ic.ac.uk:22128.
>>>>>> Sending control message: 0 0 client hello 0 0 2 41 -uid=30078
>>>>>> -pid=10885 -gid=6747
>>>>>> Error ( POLLIN POLLERR POLLHUP) (with data) on control line [3]
>>>>>> Removing [3] form control lines list
>>>>>> Failed to connect to gfe02.hep.ph.ic.ac.uk:22128
>>>>>> Failed to create a control line
>>>>>> [-1] unpluging node
>>>>>> Removing unneeded queue [2]
>>>>>> [-1] destroing node
>>>>>> Failed open file in the dCache.
>>>>>> Can't open source file : Server rejected "hello"
>>>>>> System error: Input/output error
>>>>>> -bash-3.00$ voms-proxy-init -voms cms
>>>>>> Your identity: /C=UK/O=eScience/OU=Imperial/L=Physics/CN=stuart
>>>>>> wakefield
>>>>>> Enter GRID pass phrase:
>>>>>> -bash-3.00$ voms-proxy-info -all
>>>>>> subject : /C=UK/O=eScience/OU=Imperial/L=Physics/CN=stuart
>>>>>> wakefield/CN=proxy
>>>>>> issuer : /C=UK/O=eScience/OU=Imperial/L=Physics/CN=stuart
>>>>>> wakefield
>>>>>> identity : /C=UK/O=eScience/OU=Imperial/L=Physics/CN=stuart
>>>>>> wakefield
>>>>>> type : proxy
>>>>>> strength : 512 bits
>>>>>> path : /tmp/x509up_u30078
>>>>>> timeleft : 11:59:50
>>>>>> === VO cms extension information ===
>>>>>> VO : cms
>>>>>> subject : /C=UK/O=eScience/OU=Imperial/L=Physics/CN=stuart
>>>>>> wakefield
>>>>>> issuer : /DC=ch/DC=cern/OU=computers/CN=voms.cern.ch
>>>>>> attribute : /cms/Role=NULL/Capability=NULL
>>>>>> attribute : /cms/analysis/Role=NULL/Capability=NULL
>>>>>> attribute : /cms/Higgs/Role=NULL/Capability=NULL
>>>>>> timeleft : 11:59:50
>>>>>>
>>>>>> Cheers
>>>>>> Stuart
>>>>>>
>>>>>> On Jan 12, 2008 12:56 PM, Stuart Wakefield
>>>>>> <[log in to unmask]> wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> My jobs are now failing to access files via dccp but srm seems
>>>>>>> fine..
>>>>>>>
>>>>>>> [stuartw@gfe03 stuartw]$ dccp dcap://gfe02.hep.ph.ic.ac.uk:22128/pnfs/hep.ph.ic.ac.uk/data/cms/store/unmerged/test/2007/11/15/CSA07-ProdMgrTestLCG6_EWK_Zmumu_2-3582/GEN-SIM/0000/469D5F7E-5EC0-DC11-AFAB-003048898D90.root .
>>>>>>> Error ( POLLIN) (with data) on control line [3]
>>>>>>> Failed to create a control line
>>>>>>> Error ( POLLIN POLLERR POLLHUP) (with data) on control line [3]
>>>>>>> Failed to create a control line
>>>>>>> Failed open file in the dCache.
>>>>>>> Can't open source file : Server rejected "hello"
>>>>>>> System error: Input/output error
>>>>>>>
>>>>>>> Cheers
>>>>>>> Stuart
>>>>>>>
>>>>>>> On Jan 11, 2008 7:06 PM, Wingham, Matthew P
>>>>>>> <[log in to unmask]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Mona,
>>>>>>>>
>>>>>>>> Unfortunately I still get the same.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Mona Aggarwal [mailto:[log in to unmask]]
>>>>>>>> Sent: Fri 1/11/2008 6:08 PM
>>>>>>>> To: Wakefield, Stuart L
>>>>>>>> Cc: DGUSER; Wingham, Matthew P
>>>>>>>> Subject: Re: Fwd: [Hep-cms-computing] dcache from gfe03
>>>>>>>>
>>>>>>>> Stuart Wakefield wrote:
>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>> From: Wingham, Matthew P <[log in to unmask]>
>>>>>>>>> Date: Jan 11, 2008 5:19 PM
>>>>>>>>> Subject: [Hep-cms-computing] dcache from gfe03
>>>>>>>>> To: hep-cms-computing <[log in to unmask]>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm trying to remove some old files on dCache from gfe03. As per the
>>>>>>>>> twiki, I try:
>>>>>>>>>
>>>>>>>>> bash-2.05b$ uberftp cmsdsk00
>>>>>>>>> 220 GSI FTP Door ready
>>>>>>>>> 530 Authorization Service failed:
>>>>>>>>> diskCacheV111.services.authorization.AuthorizationServiceException:
>>>>>>>>>
>>>>>>>>> authRequestID 1056800682 delegation failed for authentification of
>>>>>>>>> /C=UK/O=eScience/OU=Imperial/L=Physics/CN=matthew wingham
>>>>>>>>> java.net.SocketException: Connection reset
>>>>>>>>>
>>>>>>>>> Or if I try:
>>>>>>>>>
>>>>>>>>> bash-2.05b$ srm-advisory-delete srm://gfe02.hep.ph.ic.ac.uk:8443/pnfs/hep.ph.ic.ac.uk/data/cms/local/users/pwing/tt-ee_1.root
>>>>>>>>> WARNING: SRM_PATH is defined, which might cause a wrong version of
>>>>>>>>> srm client to be executed
>>>>>>>>> WARNING: SRM_PATH=/opt/d-cache/srm
>>>>>>>>> srm client error: ; nested exception is:
>>>>>>>>> java.net.SocketException: Connection reset
>>>>>>>>>
>>>>>>>>> This is after voms-proxy-init....
>>>>>>>>>
>>>>>>>>> Is there a problem with my certificate? I've recently re-registered
>>>>>>>>> with the VO, though it seems to work properly elsewhere.
>>>>>>>> I have recently updated the VOMS certificate; could you please try again?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Mona
>>>>>>>>
>>>>>>>> --
>>>>>>>> Mona Aggarwal- Imperial College
>>>>>>>> Tel: +442075947809
>>>>>>>> Email: [log in to unmask]
>>>>>
>>>>> --
>>>>>
>>>>> Mona Aggarwal- Imperial College
>>>>> Tel: +442075947809
>>>>> Email: [log in to unmask]
>>>>>
>>>
>>>