GRIDPP-STORAGE@JISCMAIL.AC.UK


Subject: Re: dcache 1.8.0, srm version mismatch and other animals
From: Sergey <[log in to unmask]>
Date: Tue, 19 Feb 2008 11:56:31 +0000

From Manchester:
....
SRMClientV2 : connecting to srm at
httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv2
SRMClientV2 : srmPing , contacting service
httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv2
Tue Feb 19 11:55:37 GMT 2008: received response
Tue Feb 19 11:55:37 GMT 2008: VersionInfo : v2.2
backend_type:dCache
backend_version:production-1-8-0-12p4

Sergey

On 19/02/2008, Greig Alan Cowan <[log in to unmask]> wrote:
> Hi John,
>
> It's definitely not working for me (see below). Certainly from your
> output it looks like it's working. As you say, all of the files look fine.
>
> I can ping the SRMv1 endpoint, it is only the v2.2 one that is complaining.
>
> Could someone else give this a go from outside Liverpool? You will need
> to use the latest dcache-srmclient rpm.
>
> Cheers,
> Greig
>
> $ opt/d-cache/srm/bin/srmping -2 -debug
> srm://hepgrid5.ph.liv.ac.uk:8443/srm/managerv2
> WARNING: SRM_PATH is defined, which might cause a wrong version of srm
> client to be executed
> WARNING: SRM_PATH=/home/gcowan/opt/d-cache/srm
> Storage Resource Manager (SRM) CP Client version 2.0
> Tue Feb 19 11:30:58 GMT 2008: In SRMClient ExpectedName: host
> Tue Feb 19 11:30:58 GMT 2008: SRMClient(https,srm/managerv2,true)
> SRMClientV2 : user credentials are:
> /C=UK/O=eScience/OU=Edinburgh/L=NeSC/CN=greig cowan
> SRMClientV2 : WEBSERVICE_PATH srm/managerv2
> SRMClientV2 : connecting to srm at
> httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv2
> SRMClientV2 : srmPing , contacting service
> httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv2
> SRMClientV2 : srmPing: try # 0 failed with error
> AxisFault
>   faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
>   faultSubcode:
>   faultString: java.rmi.RemoteException: SRMServerV2.srmPing()
> exception; nested exception is:
>          java.lang.NoSuchMethodException:
> org.dcache.srm.v2_2.SrmPingResponse.setStatusCode(org.dcache.srm.v2_2.TStatusCode)
>   faultActor:
>   faultNode:
>   faultDetail:
>          {http://xml.apache.org/axis/}hostname:hepgrid5.ph.liv.ac.uk
>
> java.rmi.RemoteException: SRMServerV2.srmPing() exception; nested
> exception is:
>          java.lang.NoSuchMethodException:
> org.dcache.srm.v2_2.SrmPingResponse.setStatusCode(org.dcache.srm.v2_2.TStatusCode)
>          at
> org.apache.axis.message.SOAPFaultBuilder.createFault(SOAPFaultBuilder.java:222)
>
>
> On 19/02/08 11:29, John Bland wrote:
> > Hi,
> >
> >
> > First of all, when did you try to ping (dcache was still restarting when
> >  I sent the last email)? Secondly I can ping the srm2 and srm1 endpoints
> > from a liverpool machine:
> >
> > Tue Feb 19 11:25:13 GMT 2008: In SRMClient ExpectedName: host
> > Tue Feb 19 11:25:13 GMT 2008: SRMClient(https,srm/managerv2,true)
> > SRMClientV2 : user credentials are:
> > /C=UK/O=eScience/OU=Liverpool/L=CSD/CN=john bland
> > SRMClientV2 : WEBSERVICE_PATH srm/managerv2
> > SRMClientV2 : connecting to srm at
> > httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv2
> > SRMClientV2 : srmPing , contacting service
> > httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv2
> > Tue Feb 19 11:25:18 GMT 2008: received response
> > Tue Feb 19 11:25:18 GMT 2008: VersionInfo : v2.2
> > backend_type:dCache
> > backend_version:production-1-8-0-12p4
> >
> > Tue Feb 19 11:25:38 GMT 2008: In SRMClient ExpectedName: host
> > Tue Feb 19 11:25:38 GMT 2008: SRMClient(https,srm/managerv1,true)
> > SRMClientV1 : user credentials are:
> > /C=UK/O=eScience/OU=Liverpool/L=CSD/CN=john bland
> > SRMClientV1 : SRMClientV1 calling
> > org.globus.axis.util.Util.registerTransport()
> > SRMClientV1 : connecting to srm at
> > httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv1
> > Tue Feb 19 11:25:40 GMT 2008: connected to server, obtaining proxy
> > Tue Feb 19 11:25:40 GMT 2008: got proxy of type class
> > org.dcache.srm.client.SRMClientV1
> > Tue Feb 19 11:25:40 GMT 2008:  srm ping returned = true
> >
> > Looks like the two endpoints are available to Liverpool addresses. Could
> > you try again, please?
> >
> > For reference I've diffed your files and our current setup, the
> > differences boil down to:
> >
> > srm_setup.env
> > =============
> >
> >> SRM_WEBAPP_DIR=${DCACHE_HOME}/libexec/apache-tomcat-5.5.20/webapps/srm
> > 16d18
> > < SRM_WEBAPP_DIR=${DCACHE_HOME}/srm-webapp
> >
> > dCacheSetup
> > ===========
> >
> > < #useGPlazmaAuthorizationModule=false
> > < useGPlazmaAuthorizationModule=true
> > < #useGPlazmaAuthorizationCell=true
> > < useGPlazmaAuthorizationCell=false
> > ---
> >> useGPlazmaAuthorizationModule=false
> >> useGPlazmaAuthorizationCell=true
> > 211c207
> > < # performanceMarkerPeriod=180
> > ---
> >> performanceMarkerPeriod=10
> > < # srmSpaceManagerEnabled=no
> > ---
> >> srmSpaceManagerEnabled=yes
> >
> > < # srmImplicitSpaceManagerEnabled=yes
> > ---
> >> srmImplicitSpaceManagerEnabled=yes
> >
> > < #parallelStreams=10
> > ---
> >> parallelStreams=1
> >
> > < srmCustomGetHostByAddr=true
> > ---
> >> # srmCustomGetHostByAddr=false
> >
> > < # SpaceManagerDefaultRetentionPolicy=CUSTODIAL
> > ---
> >> SpaceManagerDefaultRetentionPolicy=REPLICA
> > 667c658
> > < # SpaceManagerDefaultAccessLatency=NEARLINE
> > ---
> >> SpaceManagerDefaultAccessLatency=ONLINE
> > 672c663
> > < # SpaceManagerReserveSpaceForNonSRMTransfers=false
> > ---
> >> SpaceManagerReserveSpaceForNonSRMTransfers=true
> > < #billingToDb=no
> > ---
> >> billingToDb=yes
> >
> > srm.batch is identical.
> >
> > The only real difference I can see is that spacemanager isn't activated,
> > but this wasn't activated originally when you could ping our srm2.2
> > endpoint.
> >
> > Regards,
> >
> > John
> >
> > Greig Alan Cowan wrote:
> >> Hi John,
> >>
> >> Still not fixed. It appears that dCache thinks the srm/managerv2
> >> endpoint can only speak SRMv1. Can you compare your files with these:
> >>
> >> http://www.ph.ed.ac.uk/~gcowan1/srm.batch
> >> http://www.ph.ed.ac.uk/~gcowan1/dCacheSetup
> >> http://www.ph.ed.ac.uk/~gcowan1/srm_setup.env
> >>
> >> Thanks,
> >> Greig
> >>
> >>
> >> On 19/02/08 10:01, John Bland wrote:
> >>> Greig Alan Cowan wrote:
> >>>> Hi John,
> >>>>
> >>>> Things seem to be going well with the SAM tests, but I don't seem to be
> >>>> able to srmPing hepgrid5 on the SRM2.2 endpoint. Any ideas?
> >>> dCacheSetup still had srmVersion=1. I've set this to default (ie
> >>> commented it out) and restarted dcache. Hopefully that was the problem
> >>> and it won't break anything.
> >>>
> >>> John
> >>>
> >>>> Cheers,
> >>>> Greig
> >>>>
> >>>> On 18/02/08 15:45, John Bland wrote:
> >>>>> Hi,
> >>>>>
> >>>>> To follow myself up, we appear to have fixed the dual-homed problems
> >>>>> and can copy files in and out of the SE from internal and external
> >>>>> machines. Tests are starting to pass again (woohoo!).
> >>>>>
> >>>>> We're leaving it a while to see if any more problems were being masked
> >>>>> by anything we've fixed. If we're clean we'll probably push ahead with
> >>>>> migrating our pools and getting some space before attempting to break
> >>>>> it all again with the SRM2.2 spacemanager ;0).
> >>>>>
> >>>>> John
> >>>>>
> >>>>> John Bland wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> We are making progress, of sorts.
> >>>>>>
> >>>>>> We have fixed the GIIS problem (the static-file-Site.ldif hadn't been
> >>>>>> generated by yaim, usefully).
> >>>>>>
> >>>>>> I've also been naughty and set the srm1 endpoint as being srm_v1
> >>>>>> rather than SRM. Not sure which of the above fixed things as they
> >>>>>> were changed at the same time but then external SAM tests for SRM to
> >>>>>> hepgrid5 started passing.
> >>>>>>
> >>>>>> This didn't change the CE-* ops tests and Steve Lloyd analysis tests
> >>>>>> failing, which continued with the CGSI-gSOAP can't connect errors.
> >>>>>>
> >>>>>> We finally realised this morning that our new WN's were set to
> >>>>>> connect to the internal 192.168 interface on the SE, which had been
> >>>>>> disabled since then due to conflicts between the eth0 and eth1
> >>>>>> addresses causing dcache to fail.
> >>>>>>
> >>>>>> Adding the 192.168 address back to the SE stops the gSOAP errors but
> >>>>>> we still haven't fixed the underlying problem with dcache on
> >>>>>> dual-homed servers.
> >>>>>>
> >>>>>> We are trying to fix that as we don't want internal SE transfers
> >>>>>> battering our firewall/router all the time if possible but it is
> >>>>>> proving obstinate (par for the course it seems). We've set in
> >>>>>> dCacheSetup srmCustomGetHostByAddr=true and followed the instructions
> >>>>>> as on
> >>>>>> http://www.gridpp.ac.uk/wiki/DCache_FAQ#How_do_use_dCache_with_dual_homed_machines.3F
> >>>>>>
> >>>>>> but gridftp transfers just time out after opening BINARY data
> >>>>>> connection (but eg edg-gridftp-ls does list as expected).
> >>>>>>
> >>>>>> John
> >>>>>>
> >>>>>> Greig Alan Cowan wrote:
> >>>>>>> Hi John,
> >>>>>>>
> >>>>>>> Is everything OK with your dCache? I don't seem to be able to
> >>>>>>> srmPing it.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Greig
> >>>>>>>
> >>>>>>> On 15/02/08 13:54, John Bland wrote:
> >>>>>>>> Greig Alan Cowan wrote:
> >>>>>>>>> Hi John,
> >>>>>>>>>> Really? ... Ah, if you're looking at steve lloyd's srm tests
> >>>>>>>>>> they're still failing for hepgrid5, but are passing for segrid1
> >>>>>>>>>> (which I fixed earlier today). Still see the gSOAP error for
> >>>>>>>>>> ops/steve lloyd analysis tests.
> >>>>>>>>> No, it's definitely working now:
> >>>>>>>>>
> >>>>>>>>> http://hepwww.ph.qmul.ac.uk/~lloyd/gridpp/setest/UKI-NORTHGRID-LIV-HEP_2.html
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> The SE tests have been passing since we came online and sorted out
> >>>>>>>> a dcache.kpwd file and permissions.
> >>>>>>>>
> >>>>>>>> What are failing are analysis jobs, such as
> >>>>>>>>
> >>>>>>>> http://hepwww.ph.qmul.ac.uk/~lloyd/gridpp/rbtest/UKI-NORTHGRID-LIV-HEP_MyAnalPackage_6.html
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> with the error
> >>>>>>>>
> >>>>>>>> httpg://hepgrid5.ph.liv.ac.uk:8443/srm/managerv1: CGSI-gSOAP: Could
> >>>>>>>> not open connection !
> >>>>>>>> lcg_cp: Communication error on send
> >>>>>>>> Error in <TFile::TFile>: file aod.pool.root does not exist
> >>>>>>>> Could not open the file "aod.pool.root"
> >>>>>>>> Warning in <TClass::TClass>: no dictionary for class IProxyDict is
> >>>>>>>> available
> >>>>>>>> WARNING: $POOL_CATALOG is not defined
> >>>>>>>> using default `xmlcatalog_file:PoolFileCatalog.xml'
> >>>>>>>>
> >>>>>>>>  *** Break *** segmentation violation
> >>>>>>>>
> >>>>>>>> or ops SAM Replica Management tests, such as on
> >>>>>>>>
> >>>>>>>> https://lcg-sam.cern.ch:8443/sam/sam.py?funct=ShowHistory&sensors=CE&vo=ops&nodename=hepgrid2.ph.liv.ac.uk
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> although I can't pick out any specific errors as the SAM site seems
> >>>>>>>> to be very stodgy today.
> >>>>>>>>
> >>>>>>>>>> I've done this but while some of the changes have shown up in the
> >>>>>>>>>> bdii there still isn't an /srm/managerv2 entry. I've attached our
> >>>>>>>>>> static-file-SE.ldif file.
> >>>>>>>>> What about the dSE.ldif file? You need to make sure that it
> >>>>>>>>> contains something like:
> >>>>>>>> [snip]
> >>>>>>>>
> >>>>>>>> I've updated that file as well and it's showing up managerv2 in our
> >>>>>>>> site BDII now, maybe that might help things.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> John
> >>>>>>>>
> >>>
> >
> >
>
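
The two srmping outcomes quoted in this thread (a "received response" with a VersionInfo line, versus an AxisFault wrapping a java.lang exception) can be told apart mechanically when collecting results from several sites. What follows is only a rough sketch in Python: the function name classify_srmping and the labels "ok", "fault" and "unknown" are made up for illustration, and the patterns are taken from the debug output pasted above, not from any documented srmclient output format.

```python
import re


def classify_srmping(output):
    """Classify captured `srmping -2 -debug` text.

    Returns a (label, detail) tuple. The labels are hypothetical and the
    patterns are based only on the output quoted in this thread.
    """
    # A successful ping prints "received response" and a VersionInfo line.
    version = re.search(r"VersionInfo\s*:\s*(\S+)", output)
    if "received response" in output and version:
        return ("ok", version.group(1))

    # A failed ping prints an AxisFault with a nested java.lang exception.
    nested = re.search(r"java\.lang\.(\w+Exception)", output)
    if "AxisFault" in output and nested:
        return ("fault", nested.group(1))

    return ("unknown", None)


# Excerpts copied from the messages above.
GOOD = """\
Tue Feb 19 11:55:37 GMT 2008: received response
Tue Feb 19 11:55:37 GMT 2008: VersionInfo : v2.2
backend_type:dCache
"""

BAD = """\
SRMClientV2 : srmPing: try # 0 failed with error
AxisFault
  faultString: java.rmi.RemoteException: SRMServerV2.srmPing()
exception; nested exception is:
         java.lang.NoSuchMethodException:
"""

if __name__ == "__main__":
    print(classify_srmping(GOOD))
    print(classify_srmping(BAD))
```

The sketch only looks at text that has already been captured; it does not run srmping itself, so it makes no assumptions about where the client is installed.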


-- 
Sergey Dolgobrodov
Department of Physics & Astronomy
University of Manchester
Manchester M13  9PL
Tel: +44 (0)161 6608472
Mobile: +44 (0)790 4587534
Skype: sergeygd
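
A plain diff of two dCacheSetup files, like the one quoted above, mixes commented-out lines with active ones, which makes it hard to see which settings really differ. A minimal Python sketch of comparing only the active settings is given below. It assumes the files are simple key=value lines with '#' comments, which matches the excerpts in this thread but is not a full parser for the real file format. The names effective_settings, compare_settings, LEFT and RIGHT are hypothetical; the sample values are copied from the '<' and '>' sides of the quoted diff.

```python
def effective_settings(text):
    """Return {key: value} for active key=value lines.

    Blank lines, lines starting with '#', and lines without '=' are skipped.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def compare_settings(left, right):
    """Return {key: (left_value, right_value)} where the active values differ.

    None means the key is not actively set on that side (absent or
    commented out), so the built-in default would apply there.
    """
    a = effective_settings(left)
    b = effective_settings(right)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Lines taken from the '<' side of the diff quoted in the thread.
LEFT = """\
# srmSpaceManagerEnabled=no
#parallelStreams=10
srmCustomGetHostByAddr=true
"""

# Lines taken from the '>' side of the same diff.
RIGHT = """\
srmSpaceManagerEnabled=yes
parallelStreams=1
# srmCustomGetHostByAddr=false
"""

if __name__ == "__main__":
    for key, (left, right) in compare_settings(LEFT, RIGHT).items():
        print(f"{key}: left={left} right={right}")
```

The sketch does not know dCache's built-in defaults, so a commented-out key is reported as None even where the default happens to match the value set on the other side.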
