You should change the file, umount any pnfs mounts that are still hanging
about, and then restart pnfs.
What do the location manager and pnfsDomain log files say? Are they
still complaining?
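
In case it helps, here is a rough sketch of the first two steps as shell
functions. The function names are made up for illustration, and the export
entries are simply the two lines from the existing file that I suggested
keeping (without /pnfsdoors). Treat it as a starting point under those
assumptions, not a tested procedure for your site.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of pnfs or dCache.

# write_exports FILE
# Recreate a pnfs exports file by echoing the two entries kept from the
# thread into FILE (the /pnfsdoors entry is deliberately left out).
write_exports() {
    {
        echo "/pnfs /0/root/fs/usr/data 30 nooptions"
        echo "/fs /0/root/fs 0 nooptions"
    } > "$1"
}

# list_pnfs_mounts
# Read `mount` output on stdin and print the mount point of every NFS
# mount that lives under /pnfs, one per line.
list_pnfs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "nfs" && $3 ~ /^\/pnfs/ { print $3 }'
}

# Example use, as root (assumes the exports path shown in the thread):
#   on the pnfs node:   write_exports /pnfs/fs/admin/etc/exports/194.80.35.12
#   on the admin node:  mount | list_pnfs_mounts | while read -r mp; do umount "$mp"; done
```

After that, restart pnfs on the pnfs node with whatever startup script your
installation uses, and check that only one pnfs mount comes back on the admin
node.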
Matt Doidge wrote:
> Sorry, I'm being unclear, I meant stopping the pnfsManager starting on
> both nodes when I did a dcache-core start, the cause of my original
> troubles. Pnfs (as in the service) is only running on the new pnfs
> node. Sorry for the confusion.
>
> I'll try recreating that file for the admin node and hopefully having
> it work somehow. Would it be advisable to restart stuff once it's
> done?
>
> cheers,
> Matt
>
> On 01/06/07, Greig Alan Cowan wrote:
>> Matt, you need pnfs to start on the pnfs node, not on the admin node
>> (you said "stop pnfs starting on both nodes").
>>
>> /pnfs /0/root/fs/usr/data 30 nooptions
>> /fs /0/root/fs 0 nooptions
>> /pnfsdoors /0/root/fs/usr/ 30 nooptions
>>
>> You shouldn't need the last line here. You should recreate the file by
>> echoing strings into it.
>>
>>
>> Matt Doidge wrote:
>> > After taking a tip from Owan and setting the NODE_TYPE to custom for
>> > my admin node as well as the pnfs node I've managed to stop pnfs
>> > starting on both nodes, and got it to start with just one mount:
>> > fal-pygrid-31.lancs.ac.uk:/pnfs on /pnfs/lancs.ac.uk type nfs
>> > (rw,intr,hard,addr=194.80.35.29)
>> >
>> > here's the output of the contents of the export file for our admin
>> > node from the pnfsnode:
>> > [root@fal-pygrid-31 ~]# cat /pnfs/fs/admin/etc/exports/194.80.35.12
>> > /pnfs /0/root/fs/usr/data 30 nooptions
>> > /fs /0/root/fs 0 nooptions
>> > /pnfsdoors /0/root/fs/usr/ 30 nooptions
>> >
>> > and finally the output of the ls on the admin node;
>> > [root@fal-pygrid-20 root]# ls -l /pnfs
>> > total 5
>> > drwxr-xr-x 2 root root 4096 Jan 4 13:14 fs
>> > lrwxrwxrwx 1 root root 8 Jan 4 13:14 ftpBase -> /pnfs/fs
>> > drwxr-xr-x 1 root root 512 Mar 13 14:16 lancs.ac.uk
>> >
>> >
>> > cheers,
>> > Matt
>> >
>> > On 01/06/07, Greig Alan Cowan wrote:
>> >> OK, there should only be one. What does df report?
>> >>
>> >> can you send the output of this:
>> >>
>> >> cat /pnfs/fs/admin/etc/exports/<IP of admin node>
>> >>
>> >> The first column of this file should be /pnfs or /pnfsdoors. Whichever
>> >> it is, you should have it in the nfs exports line.
>> >>
>> >> Also, send the output of
>> >>
>> >> ls -l /pnfs
>> >>
>> >> on the admin node.
>> >>
>> >> Greig
>> >>
>> >> Matt Doidge wrote:
>> >> > The mounts on the admin node are looking like a mess, and there are 3
>> >> > mounts from the pnfs node on our admin node:
>> >> >
>> >> > fal-pygrid-31.lancs.ac.uk:/pnfs on /pnfs/lancs.ac.uk type nfs
>> >> > (rw,intr,hard,addr=194.80.35.29)
>> >> > fal-pygrid-31.lancs.ac.uk:/pnfs on /pnfs/fs type nfs
>> >> > (rw,intr,hard,addr=194.80.35.29)
>> >> > fal-pygrid-31.lancs.ac.uk:/pnfsdoors on /pnfs/lancs.ac.uk type nfs
>> >> > (rw,intr,hard,addr=194.80.35.29)
>> >> >
>> >> > Are any of them right? I don't like the looks of the mount on
>> >> > /pnfs/lancs.ac.uk.
>> >> >
>> >> > cheers,
>> >> > Matt
>> >> >
>> >> > On 01/06/07, Greig Alan Cowan wrote:
>> >> >> pnfs is definitely mounted on the SRM node?
>> >> >>
>> >> >> Did you happen to close off a firewall while you were reconfiguring
>> >> >> things?
>> >> >>
>> >> >>
>> >> >> Matt Doidge wrote:
>> >> >> > Greig wins my vote for King of the World once more, that got it.
>> >> >> > pnfsManager was trying to start on the admin node; simply stopping it
>> >> >> > there and restarting it has solved some of my problems. I've got
>> >> >> > globus-url-copy's to work, but not srmcp's or dccp's. I get the
>> >> >> > errors:
>> >> >> >
>> >> >> >
>> >> >> > Fri Jun 01 18:01:14 BST 2007: starting SRMGetClient
>> >> >> > Fri Jun 01 18:01:14 BST 2007: In SRMClient ExpectedName: host
>> >> >> > Fri Jun 01 18:01:14 BST 2007: SRMClient(https,srm/managerv1,true)
>> >> >> > SRMClientV1 : user credentials are:
>> >> >> > /C=UK/O=eScience/OU=Lancaster/L=Physics/CN=matthew doidge
>> >> >> > SRMClientV1 : SRMClientV1 calling
>> >> >> > org.globus.axis.util.Util.registerTransport()
>> >> >> > SRMClientV1 : connecting to srm at
>> >> >> > httpg://fal-pygrid-20.lancs.ac.uk:8443/srm/managerv1
>> >> >> > Fri Jun 01 18:01:15 BST 2007: connected to server, obtaining proxy
>> >> >> > Fri Jun 01 18:01:15 BST 2007: got proxy of type class
>> >> >> > org.dcache.srm.client.SRMClientV1
>> >> >> > SRMClientV1 : get: surls[0]="srm://fal-pygrid-20.lancs.ac.uk:8443/pnfs/lancs.ac.uk/data/dteam/pooltest/fal23_test"
>> >> >> > SRMClientV1 : get: protocols[0]="http"
>> >> >> > SRMClientV1 : get: protocols[1]="dcap"
>> >> >> > SRMClientV1 : get: protocols[2]="gsiftp"
>> >> >> > copy_jobs is empty
>> >> >> > SRMClientV1 : java.net.ConnectException: Connection refused
>> >> >> > SRMClientV1 : get : try # 0 failed with error
>> >> >> > SRMClientV1 : java.net.ConnectException: Connection refused
>> >> >> > copy_jobs is empty
>> >> >> > stopping copier
>> >> >> > srm copy of at least one file failed or not completed
>> >> >> >
>> >> >> > It's never just fixed is it! :-D
>> >> >> >
>> >> >> > cheers,
>> >> >> > Matt
>> >> >> >
>> >> >> > On 01/06/07, Greig Alan Cowan wrote:
>> >> >> >> Have you removed pnfs from the admin node? Presumably the startup
>> >> >> >> script is still in /etc/init.d .
>> >> >> >>
>> >> >> >> Greig
>> >> >> >>
>> >> >> >> Matt Doidge wrote:
>> >> >> >> > Well this guy shouldn't be on the admin node:
>> >> >> >> > [root@fal-pygrid-20 root]# ps aux|grep -i pnfs
>> >> >> >> > root 16046 0.0 0.0 4408 1308 pts/1 S 16:53 0:00 /bin/sh /opt/d-cache/jobs/pnfs start
>> >> >> >> > root 14455 0.0 0.0 3692 676 pts/1 S 17:38 0:00 grep -i pnfs
>> >> >> >> >
>> >> >> >> > The pnfsDomain logs on the admin node contain complaints about not
>> >> >> >> > being able to find a mount point (as pnfs itself isn't running), the
>> >> >> >> > logs on the pnfs node contain complaints seemingly about the location
>> >> >> >> > manager:
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : runIO : java.io.EOFException
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : java.io.EOFException
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : java.io.EOFException
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2498)
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1273)
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at dmg.cells.network.LocationMgrTunnel.runIo(LocationMgrTunnel.java:283)
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at dmg.cells.network.LocationMgrTunnel.connectionThread(LocationMgrTunnel.java:202)
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at dmg.cells.network.LocationMgrTunnel.run(LocationMgrTunnel.java:347)
>> >> >> >> > 06/01 16:53:15 Cell(c-100@pnfsDomain) : at java.lang.Thread.run(Thread.java:595)
>> >> >> >> >
>> >> >> >> > Any way of stopping this starting when I hit dcache-core start? The
>> >> >> >> > pnfsManager is switched to no for the admin node node_config?
>> >> >> >> >
>> >> >> >> > cheers,
>> >> >> >> > Matt
>> >> >> >> >
>> >> >> >> > On 01/06/07, Greig Alan Cowan wrote:
>> >> >> >> >> Matt,
>> >> >> >> >>
>> >> >> >> >> What does the PnfsDomain.log file say?
>> >> >> >> >>
>> >> >> >> >> Can you do a
>> >> >> >> >>
>> >> >> >> >> ps aux|grep -i pnfs
>> >> >> >> >>
>> >> >> >> >> on the admin node to make sure that no pnfs processes are running.
>> >> >> >> >>
>> >> >> >> >> Cheers,
>> >> >> >> >> Greig
>> >> >> >> >>
>> >> >> >> >> Matt Doidge wrote:
>> >> >> >> >> > Heya,
>> >> >> >> >> >
>> >> >> >> >> > On the Pnfs Node:
>> >> >> >> >> > serviceLocatorHost=fal-pygrid-20.lancs.ac.uk
>> >> >> >> >> > serviceLocatorPort=11111
>> >> >> >> >> >
>> >> >> >> >> > On the Admin node:
>> >> >> >> >> > serviceLocatorHost=fal-pygrid-20.lancs.ac.uk
>> >> >> >> >> > serviceLocatorPort=11111
>> >> >> >> >> >
>> >> >> >> >> > so the same for both of them. It's also the same on all my other
>> >> >> >> >> > nodes.
>> >> >> >> >> >
>> >> >> >> >> > cheers,
>> >> >> >> >> > Matt
>> >> >> >> >> >
>> >> >> >> >> > On 01/06/07, Greig Alan Cowan wrote:
>> >> >> >> >> >> Matt,
>> >> >> >> >> >>
>> >> >> >> >> >> What are the entries:
>> >> >> >> >> >>
>> >> >> >> >> >> serviceLocatorHost
>> >> >> >> >> >> serviceLocatorPort
>> >> >> >> >> >>
>> >> >> >> >> >> set to on the admin and pnfs nodes? The host should be the admin
>> >> >> >> >> >> node hostname.
>> >> >> >> >> >>
>> >> >> >> >> >> Greig
>> >> >> >> >> >>
>> >> >> >> >> >> Matt Doidge wrote:
>> >> >> >> >> >> > I uncommented that line, reran the install scripts and restarted
>> >> >> >> >> >> > stuff, but still no joy. Checking the pnfsDomain logs on the Pnfs
>> >> >> >> >> >> > node I see a lot of complaints that look like it can't find the
>> >> >> >> >> >> > location manager.
>> >> >> >> >> >> >
>> >> >> >> >> >> > I've increased some log verbosity to help find clues, and am
>> >> >> >> >> >> > looking for references to "localhost" in my dcache configs on
>> >> >> >> >> >> > the Pnfs node.
>> >> >> >> >> >> >
>> >> >> >> >> >> > cheers,
>> >> >> >> >> >> > Matt
>> >> >> >> >> >> >
>> >> >> >> >> >> > On 01/06/07, Greig Alan Cowan wrote:
>> >> >> >> >> >> >> Matt,
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Why is this line commented out in the pnfs_node_config ?
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> #ADMIN_NODE=fal-pygrid-20.lancs.ac.uk
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Greig
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Matt Doidge wrote:
>> >> >> >> >> >> >> > Here are the node_configs for both of the nodes,
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > cheers,
>> >> >> >> >> >> >> > Matt