Greig wins my vote for King of the World once more; that got it.
pnfsManager was trying to start on the admin node, and simply stopping it
there and restarting it has solved some of my problems. I've got
globus-url-copy to work, but not srmcp or dccp. I get these
errors:
Fri Jun 01 18:01:14 BST 2007: starting SRMGetClient
Fri Jun 01 18:01:14 BST 2007: In SRMClient ExpectedName: host
Fri Jun 01 18:01:14 BST 2007: SRMClient(https,srm/managerv1,true)
SRMClientV1 : user credentials are:
/C=UK/O=eScience/OU=Lancaster/L=Physics/CN=matthew doidge
SRMClientV1 : SRMClientV1 calling org.globus.axis.util.Util.registerTransport()
SRMClientV1 : connecting to srm at
httpg://fal-pygrid-20.lancs.ac.uk:8443/srm/managerv1
Fri Jun 01 18:01:15 BST 2007: connected to server, obtaining proxy
Fri Jun 01 18:01:15 BST 2007: got proxy of type class
org.dcache.srm.client.SRMClientV1
SRMClientV1 : get:
surls[0]="srm://fal-pygrid-20.lancs.ac.uk:8443/pnfs/lancs.ac.uk/data/dteam/pooltest/fal23_test"
SRMClientV1 : get: protocols[0]="http"
SRMClientV1 : get: protocols[1]="dcap"
SRMClientV1 : get: protocols[2]="gsiftp"
copy_jobs is empty
SRMClientV1 : java.net.ConnectException: Connection refused
SRMClientV1 : get : try # 0 failed with error
SRMClientV1 : java.net.ConnectException: Connection refused
copy_jobs is empty
stopping copier
srm copy of at least one file failed or not completed
It's never just fixed is it! :-D
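A "Connection refused" like the one above generally means nothing accepted the TCP connection on the host and port the client tried, so a first check is whether anything is listening there. Below is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command, and assuming the URL carries an explicit port. The helper names surl_host_port and port_open are made up for illustration and are not part of dCache or the SRM client.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of dCache or srmcp.

# Print "host port" for a URL of the form scheme://host:port/path.
# Assumes the port is given explicitly, as in the srm:// URL above.
surl_host_port() {
    local rest="${1#*://}"        # drop the scheme
    local hostport="${rest%%/*}"  # keep everything before the first slash
    printf '%s %s\n' "${hostport%%:*}" "${hostport##*:}"
}

# Return 0 if a TCP connection to host ($1) and port ($2) succeeds within
# 5 seconds, non-zero otherwise (refused, unreachable, or timed out).
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example use (not run automatically):
#   read -r host port < <(surl_host_port "srm://fal-pygrid-20.lancs.ac.uk:8443/pnfs/lancs.ac.uk/data/dteam/pooltest/fal23_test")
#   port_open "$host" "$port" && echo "listening" || echo "refused or unreachable"
```

A refused connection here would point at the door not running or not bound to that port, rather than at the client or the credentials.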
cheers,
Matt
On 01/06/07, Greig Alan Cowan wrote:
> Have you removed pnfs from the admin node? Presumably the startup script
> is still in /etc/init.d .
>
> Greig
>
> Matt Doidge wrote:
> > Well this guy shouldn't be on the admin node:
> > [root@fal-pygrid-20 root]# ps aux|grep -i pnfs
> > root 16046 0.0 0.0 4408 1308 pts/1 S 16:53 0:00 /bin/sh /opt/d-cache/jobs/pnfs start
> > root 14455 0.0 0.0 3692 676 pts/1 S 17:38 0:00 grep -i pnfs
> >
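The second line of that ps output is only the grep command matching itself, so the first line is the one that matters. Below is a minimal sketch of a filter that avoids the self-match, assuming a POSIX shell and grep. The function name pnfs_procs is made up for illustration.

```shell
# Hypothetical helper: read `ps aux`-style lines on stdin and print those that
# mention pnfs, case-insensitively. The bracket pattern [p]nfs still matches
# the text "pnfs", but the grep process's own command line contains the
# literal "[p]nfs", which the pattern does not match, so grep drops out of
# its own results.
pnfs_procs() {
    grep -i '[p]nfs'
}

# Example use:
#   ps aux | pnfs_procs
```

An empty result from this on the admin node would mean no pnfs process is running there.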
> > The pnfsDomain logs on the admin node contain complaints about not
> > being able to find a mount point (as pnfs itself isn't running); the
> > logs on the pnfs node contain complaints seemingly about the location
> > manager:
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : runIO : java.io.EOFException
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : java.io.EOFException
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : java.io.EOFException
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2498)
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1273)
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at dmg.cells.network.LocationMgrTunnel.runIo(LocationMgrTunnel.java:283)
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at dmg.cells.network.LocationMgrTunnel.connectionThread(LocationMgrTunnel.java:202)
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at dmg.cells.network.LocationMgrTunnel.run(LocationMgrTunnel.java:347)
> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.lang.Thread.run(Thread.java:595)
> >
> > Is there any way of stopping this from starting when I run dcache-core
> > start? pnfsManager is set to no in the admin node's node_config.
> >
> > cheers,
> > Matt
> >
> > On 01/06/07, Greig Alan Cowan wrote:
> >> Matt,
> >>
> >> What does the PnfsDomain.log file say?
> >>
> >> Can you do a
> >>
> >> ps aux|grep -i pnfs
> >>
> >> on the admin node to make sure that no pnfs processes are running?
> >>
> >> Cheers,
> >> Greig
> >>
> >> Matt Doidge wrote:
> >> > Heya,
> >> >
> >> > On the Pnfs Node:
> >> > serviceLocatorHost=fal-pygrid-20.lancs.ac.uk
> >> > serviceLocatorPort=11111
> >> >
> >> > On the Admin node:
> >> > serviceLocatorHost=fal-pygrid-20.lancs.ac.uk
> >> > serviceLocatorPort=11111
> >> >
> >> > so the same for both of them. It's also the same on all my other nodes.
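Comparing these values by eye gets tedious across many nodes. Below is a minimal sketch for pulling the two settings out of a setup file so they can be diffed between hosts, assuming a POSIX shell and plain key=value lines. The function name locator_settings and the file names in the usage comment are hypothetical and are not taken from the configs discussed here.

```shell
# Hypothetical helper: print the active serviceLocatorHost and
# serviceLocatorPort lines from the file given as $1. Commented-out lines are
# skipped, whitespace is stripped, and the result is sorted so that output
# from two nodes can be compared directly.
locator_settings() {
    grep -E '^[[:space:]]*serviceLocator(Host|Port)[[:space:]]*=' "$1" \
        | sed 's/[[:space:]]//g' \
        | sort
}

# Example use, with hypothetical local copies of each node's setup file:
#   locator_settings admin-node.conf > admin.txt
#   locator_settings pnfs-node.conf  > pnfs.txt
#   diff admin.txt pnfs.txt && echo "locator settings match"
```

Any difference in the diff output, or a host value of localhost, would be worth a closer look.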
> >> >
> >> > cheers,
> >> > Matt
> >> >
> >> > On 01/06/07, Greig Alan Cowan wrote:
> >> >> Matt,
> >> >>
> >> >> What are the entries:
> >> >>
> >> >> serviceLocatorHost
> >> >> serviceLocatorPort
> >> >>
> >> >> set to on the admin and pnfs nodes? The host should be the admin node
> >> >> hostname.
> >> >>
> >> >> Greig
> >> >>
> >> >> Matt Doidge wrote:
> >> >> > I uncommented that line, reran the install scripts and restarted
> >> >> > stuff, but still no joy. Checking the pnfsDomain logs on the Pnfs node,
> >> >> > I see a lot of complaints that look like it can't find the location
> >> >> > manager.
> >> >> >
> >> >> > I've increased some log verbosity to help find clues, and am looking
> >> >> > for references to "localhost" in my dcache configs on the Pnfs node.
> >> >> >
> >> >> > cheers,
> >> >> > Matt
> >> >> >
> >> >> > On 01/06/07, Greig Alan Cowan wrote:
> >> >> >> Matt,
> >> >> >>
> >> >> >> Why is this line commented out in the pnfs_node_config ?
> >> >> >>
> >> >> >> #ADMIN_NODE=fal-pygrid-20.lancs.ac.uk
> >> >> >>
> >> >> >> Greig
> >> >> >>
> >> >> >> Matt Doidge wrote:
> >> >> >> > Here are the node_configs for both of the nodes,
> >> >> >> >
> >> >> >> > cheers,
> >> >> >> > Matt
> >> >> >> >
> >> >> >>
> >> >>
> >>
>