Hi Greig

I've just fixed it. As usual: broken ownership (root rather than
atlas) on the directory. It always happens while dCache is backing up
the pnfs database, which takes a long time (about 40 min) and seems to
overload it.
We are using a script to detect when this happens and to change
the ownership, but it is not perfect. We hope to improve the situation
after finishing the upgrade.
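
A minimal sketch of the kind of ownership check such a script might
perform. This is an illustration only, not the script actually running
at Manchester: the function names, the use of a numeric uid, and the
dry-run default are all assumptions.

```python
import os


def ownership_is_wrong(path, expected_uid):
    """Return True if `path` is not owned by `expected_uid`."""
    return os.stat(path).st_uid != expected_uid


def check_and_fix(path, expected_uid, dry_run=True):
    """Report whether ownership was wrong; repair it unless dry_run is set.

    Hypothetical helper: `path` would be the affected pnfs directory and
    `expected_uid` the numeric uid of the atlas account.
    """
    if not ownership_is_wrong(path, expected_uid):
        return False
    if not dry_run:
        # -1 leaves the group unchanged; changing the owner needs root.
        os.chown(path, expected_uid, -1)
    return True
```

Run periodically (for example from cron) with dry_run=False, something
like this would correct the root-owned state described above, though it
cannot prevent failures in the window before the next run.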

Regards

Sergey

On 31/03/2008, Greig Alan Cowan wrote:
> Hi Sergey,
>
>  What's happening at Manchester? You were passing Steve's tests for a
>  while, but seem to have started failing them again.
>
>  Cheers,
>
> Greig
>
>
>  On 27/03/08 16:36, Sergey wrote:
>  > Hi Greig
>  >
>  > On 27/03/2008, Greig Alan Cowan wrote:
>  >> Hi Sergey,
>  >>
>  >>  srmLs is only supported by SRMv2.2 servers. You are running dCache
>  >>  1.7.0-35, which only has SRMv1.
>  >
>  > Thanks, it's clear
>  >
>  >>  For lcg-cp, which authorisation mechanisms are you using in the dCache,
>  >>  gPlazma? Does the user DN appear in the grid-mapfile or vorole-mapfile?
>  >
>  > We are using gPlazma and others:
>  >
>  > # Switches
>  > saml-vo-mapping="OFF"
>  > kpwd="ON"
>  > grid-mapfile="ON"
>  > gplazmalite-vorole-mapping="ON"
>  >
>  > # Priorities
>  > saml-vo-mapping-priority="4"
>  > kpwd-priority="1"
>  > grid-mapfile-priority="2"
>  > gplazmalite-vorole-mapping-priority="3"
>  >
>  > In /etc/grid-security/storage-authzdb:
>  > authorize ops001 read-write 9001 1000 / /
>  > authorize dteam001 read-write 30001 1005  / /
>  >
>  > In the grid-vorole
>  > "*" "/dteam" dteam001
>  > "*" "/ops" ops001
>  >
>  > But it successfully works with srmcp and lcg-cr  for dteam:
>  >
>  > =========================
>  > sergey@niels003:~$lcg-cr -v --vo dteam file:/home/sergey/h2.txt  -d
>  > dcache01.tier2.hep.manchester.ac.uk
>  > Using grid catalog type: lfc
>  > Using grid catalog : prod-lfc-shared-central.cern.ch
>  > Using LFN : /grid/dteam/generated/2008-03-27/file-71cc9e67-4357-4953-b1af-a184eca15e59
>  > Using SURL : srm://dcache01.tier2.hep.manchester.ac.uk/pnfs/tier2.hep.manchester.ac.uk/data/dteam/generated/2008-03-27/file8f9503b7-9ad9-4616-8f6c-8d4355d1a41d
>  > Source URL: file:/home/sergey/h2.txt
>  > File size: 1959
>  > VO name: dteam
>  > Destination specified: dcache01.tier2.hep.manchester.ac.uk
>  > Destination URL for copy:
>  > gsiftp://bohr3431.tier2.hep.manchester.ac.uk:2811//pnfs/tier2.hep.manchester.ac.uk/data/dteam/generated/2008-03-27/file8f9503b7-9ad9-4616-8f6c-8d4355d1a41d
>  > # streams: 1
>  > # set timeout to 0 seconds
>  > Alias registered in Catalog:
>  > lfn:/grid/dteam/generated/2008-03-27/file-71cc9e67-4357-4953-b1af-a184eca15e59
>  >          1959 bytes      2.47 KB/sec avg      2.47 KB/sec inst
>  > Transfer took 2080 ms
>  > Destination URL registered in Catalog:
>  > srm://dcache01.tier2.hep.manchester.ac.uk/pnfs/tier2.hep.manchester.ac.uk/data/dteam/generated/2008-03-27/file8f9503b7-9ad9-4616-8f6c-8d4355d1a41d
>  > guid:fece7962-e877-4d24-8efd-4e25b1bd89dd
>  > ======================================
>  >
>  > but not for "ops":
>  >
>  > + lcg-cr -v --vo ops file:/home/samops/.same/SRM/testFile.txt -l
>  > lfn:SRM-put-dcache01.tier2.hep.manchester.ac.uk-1206635107 -d
>  > dcache01.tier2.hep.manchester.ac.uk
>  > Using grid catalog type: lfc
>  > Using grid catalog : prod-lfc-shared-central.cern.ch
>  > Using LFN : /grid/ops/SAM/SRM-put-dcache01.tier2.hep.manchester.ac.uk-1206635107
>  > Using SURL : srm://dcache01.tier2.hep.manchester.ac.uk/pnfs/tier2.hep.manchester.ac.uk/data/ops/generated/2008-03-27/file2e3f9b08-96aa-4516-8955-b717718c0696
>  > httpg://dcache01.tier2.hep.manchester.ac.uk:8443/srm/managerv1:
>  > java.rmi.RemoteException: SRM Authorization failed; nested exception
>  > is:
>  >       org.dcache.srm.SRMAuthorizationException:
>  > diskCacheV111.services.authorization.AuthorizationServiceException:
>  > authR
>  > lcg_cr: Communication error on send
>  > + result=1
>  > + set +x
>  >
>  >
>  >>  On 27/03/08 14:22, Sergey wrote:
>  >>  > Hi
>  >>  > We have a problem with our dCache: we can srmcp in both
>  >>  > directions, but can't srm list the same directory.
>  >>  > On the server side we have:
>  >>  > glite-SE_dcache-3.0.7-0
>  >>  > lcg-info-dynamic-dcache-1.0.9-1_sl3
>  >>  > dcache-client-1.7.0-35
>  >>  > dcache-server-1.7.0-35
>  >>  >
>  >>  > On UI: dcache-client-1.7.0-35
>  >>  > srmcp read and write run without problems. However, when trying to
>  >>  > list the same directory we get:
>  >>  >
>  >>  > sergey@niels003:~$/opt/d-cache/srm/bin/srmls  -l -debug=true
>  >>  > srm://dcache01.tier2.hep.manchester.ac.uk:8443/pnfs/tier2.hep.manchester.ac.uk/data/dteam/test0325
>  >>  > WARNING: SRM_PATH is defined, which might cause a wrong version of srm
>  >>  > client to be executed
>  >>  > WARNING: SRM_PATH=/opt/d-cache/srm
>  >>  > Storage Resource Manager (SRM) CP Client version 1.23.1
>  >>  > Copyright (c) 2002-2006 Fermi National Accelerator Laboratory
>  >>  >
>  >>  > SRM Configuration:
>  >>  >        debug=true
>  >>  >        gsissl=true
>  >>  >        help=false
>  >>  >        pushmode=false
>  >>  >        userproxy=true
>  >>  >        buffer_size=131072
>  >>  >        tcp_buffer_size=0
>  >>  >        streams_num=10
>  >>  >        config_file=config.xml
>  >>  >        glue_mapfile=conf/SRMServerV1.map
>  >>  >        webservice_path=srm/managerv1
>  >>  >        webservice_protocol=https
>  >>  >        gsiftpclinet=globus-url-copy
>  >>  >        protocols_list=gsiftp,http
>  >>  >        save_config_file=null
>  >>  >        srmcphome=..
>  >>  >        urlcopy=sbin/urlcopy.sh
>  >>  >        x509_user_cert=/home/timur/k5-ca-proxy.pem
>  >>  >        x509_user_key=/home/timur/k5-ca-proxy.pem
>  >>  >        x509_user_proxy=/tmp/x509up_u508
>  >>  >        x509_user_trusted_certificates=/etc/grid-security/certificates
>  >>  >        globus_tcp_port_range=null
>  >>  >        gss_expected_name=null
>  >>  >        storagetype=permanent
>  >>  >        retry_num=20
>  >>  >        globus_tcp_port_range=null
>  >>  >        gss_expected_name=null
>  >>  >        storagetype=permanent
>  >>  >        retry_num=20
>  >>  >        retry_timeout=10000
>  >>  >        wsdl_url=null
>  >>  >        use_urlcopy_script=false
>  >>  >        connect_to_wsdl=false
>  >>  >        delegate=true
>  >>  >        full_delegation=true
>  >>  >        server_mode=passive
>  >>  >        srm_protocol_version=1
>  >>  >        request_lifetime=86400
>  >>  >        action is ls
>  >>  >        recursion depth=1
>  >>  >        is long listing mode=true
>  >>  >        surl[0]=srm://dcache01.tier2.hep.manchester.ac.uk:8443/pnfs/tier2.hep.manchester.ac.uk/data/dteam/test0325
>  >>  >        from=null
>  >>  >        to=null
>  >>  >
>  >>  > Tue Mar 25 15:56:01 GMT 2008: In SRMClient ExpectedName: host
>  >>  > Tue Mar 25 15:56:01 GMT 2008: SRMClient(https,srm/managerv1,true)
>  >>  > SRMClientV2 : user credentials are:
>  >>  > /C=UK/O=eScience/OU=Manchester/L=HEP/CN=sergey dolgobrodov
>  >>  > SRMClientV2 : connecting to srm at
>  >>  > httpg://dcache01.tier2.hep.manchester.ac.uk:8443/srm/managerv1
>  >>  > SRMClientV2 :  srmLs, contacting service
>  >>  > httpg://dcache01.tier2.hep.manchester.ac.uk:8443/srm/managerv1
>  >>  > SRMClientV2 : put: try # 0 failed with error
>  >>  > SRMClientV2 : org.xml.sax.SAXException: Deserializing parameter
>  >>  > 'srmLsRequest':  could not find deserializer for type
>  >>  > {http://srm.lbl.gov/StorageResourceManager}srmLsRequest
>  >>  > SRMClientV2 : put: try again
>  >>  > SRMClientV2 : sleeping for 10000 milliseconds before retrying
>  >>  > SRMClientV2 : put: try # 1 failed with error
>  >>  > SRMClientV2 : org.xml.sax.SAXException: Deserializing parameter
>  >>  > 'srmLsRequest':  could not find deserializer for type
>  >>  > {http://srm.lbl.gov/StorageResourceManager}srmLsRequest
>  >>  > SRMClientV2 : put: try again
>  >>  > SRMClientV2 : sleeping for 20000 milliseconds before retrying
>  >>  >
>  >>  > Yet another pain: lcg-cr/cp fails with an authorisation problem like this:
>  >>  >
>  >>  >  > lcg-cr --vo ilc -d dcache01.tier2.hep.manchester.ac.uk -n 4 -t 6000
>  >>  > -v -l lfn:/grid/ilc/test/dcache01.tier2.hep.manchester.ac.uk_1206625776
>  >>  > file:/afs/desy.de/user/g/gellrich/grid/ilc/ses/TESTFILE0
>  >>  > httpg://dcache01.tier2.hep.manchester.ac.uk:8443/srm/managerv1:
>  >>  > java.rmi.RemoteException: SRM Authorization failed; nested exception
>  >>  > is:
>  >>  > org.dcache.srm.SRMAuthorizationException:
>  >>  > diskCacheV111.services.authorization.AuthorizationServiceException:
>  >>  > authR
>  >>  > Using grid catalog type: lfc
>  >>  > Using grid catalog : grid-lfc.desy.de
>  >>  > Using LFN : /grid/ilc/test/dcache01.tier2.hep.manchester.ac.uk_1206625776
>  >>  > Using SURL : srm://dcache01.tier2.hep.manchester.ac.uk/pnfs/tier2.hep.manchester.ac.uk/data/ilc/generated/2008-03-27/filefdf54c8b-2012-4372-a9b6-b6b6d66da5ca
>  >>  > Alias registered in Catalog:
>  >>  > lfn:/grid/ilc/test/dcache01.tier2.hep.manchester.ac.uk_1206625776
>  >>  >
>  >>  > However pure srmcp works fine.
>  >>  >
>  >>  > Can somebody help to diagnose the problem?
>  >>  >
>  >>
>  >
>  >
>


-- 
--
Sergey Dolgobrodov
Department of Physics & Astronomy
University of Manchester
Manchester M13  9PL
Tel: +44 (0)161 6608472
Mobile: +44 (0)790 4587534
Skype: sergeygd