Have you made sure that PNFS is up and running on the other node before
starting the main head node?
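
A quick way to check that, just as a sketch (<pnfs-node> below is a
placeholder for the remote PNFS host; the standard /fs NFS export of PNFS
is assumed):

# on the remote PNFS node: are the pnfs and postgres processes up,
# and is the /fs export visible?
ps ax | egrep 'pnfsd|dbserver|postmaster'
showmount -e localhost

# on the head node: can the export be reached and mounted by hand,
# before the dcache-core start-up script tries to do it?
rpcinfo -p <pnfs-node> | grep -Ei 'mountd|nfs'
mkdir -p /pnfs/fs
mount -o intr,hard,rw <pnfs-node>:/fs /pnfs/fs
ls /pnfs/fs/usr
umount /pnfs/fs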
-----Original Message-----
From: Sergey [mailto:[log in to unmask]]
Sent: 12 June 2008 15:04
To: Gerd Behrmann
Cc: [log in to unmask]; [log in to unmask];
[log in to unmask]; Synge, Owen
Subject: Re: dCache upgrade with remote pnfs installation
OK, thanks
We have sent a formal request to [log in to unmask] anyway, to raise the
problem.
Hope Owen will pick up all the information from this thread.
Sergey
2008/6/12 Gerd Behrmann <[log in to unmask]>:
> The pnfsDomain was started because NODE_TYPE was not correctly set. I
> have no idea why the PNFS is still mounted. I have put Owen on cc,
> since he maintains the install script.
>
> Cheers,
>
> /gerd
>
> Alessandra Forti wrote:
>>
>> Hi Gerd,
>>
>> maybe I'm being naive, but if pnfs and pnfsDomain have been moved to
>> another machine and the flags are correctly set to 'no' in node_config,
>> dCache shouldn't try to mount the file system nor start the pnfsDomain.
>> There aren't even gridftp doors on that node, which are apparently what
>> requires it.
>>
>> thanks for the suggestion though.
>>
>> cheers
>> alessandra
>>
>> Gerd Behrmann wrote:
>>>
>>> Have you tried cleaning out the /pnfs mount points on the head node?
>>> I.e. while PNFS is *not* mounted on the head node, remove the directory
>>> entries under /pnfs. Then rerun the installation.
>>>
>>> I am no expert in the different PNFS mount points, but it seems the
>>> install script does not like whatever was left over from before you
>>> moved PNFS to another machine.
>>>
>>> Just be careful not to accidentally delete something actually stored
>>> in PNFS :-)
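>>>
>>> Roughly something like this (a sketch only, using the path names from
>>> your install.sh output; double-check with 'mount' that nothing is
>>> NFS-mounted under /pnfs before removing anything):
>>>
>>> mount | grep /pnfs        # must show nothing mounted under /pnfs
>>> rm -f /pnfs/ftpBase /pnfs/tier2.hep.manchester.ac.uk    # leftover symlinks
>>> rmdir /pnfs/fs            # leftover mount point, if present and empty
>>> /opt/d-cache/install/install.sh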
>>>
>>> Cheers,
>>>
>>> /gerd
>>>
>>> Sergey wrote:
>>>>
>>>> Hi Gerd,
>>>>
>>>> I have changed the NODE_TYPE to custom.
>>>> The installation script still doesn't like it, though:
>>>>
>>>> [root@dcache01 ~]# /opt/d-cache/install/install.sh
>>>> INFO:Skipping ssh key generation
>>>>
>>>> Checking MasterSetup ./config/dCacheSetup O.k.
>>>>
>>>> Sanning dCache batch files
>>>>
>>>> Processing adminDoor
>>>> Processing chimera
>>>> Processing dCache
>>>> Processing dir
>>>> Processing door
>>>> Processing gPlazma
>>>> Processing gridftpdoor
>>>> Processing gsidcapdoor
>>>> Processing httpd
>>>> Processing infoProvider
>>>> Processing lm
>>>> Processing pnfs
>>>> Processing pool
>>>> Processing replica
>>>> Processing srm
>>>> Processing statistics
>>>> Processing utility
>>>> Processing xrootdDoor
>>>>
>>>>
>>>> Checking Users database .... Ok
>>>> Checking Security .... Ok
>>>> Checking JVM ........ Ok
>>>> Checking Cells ...... Ok
>>>> dCacheVersion ....... Version production-1-8-0-15p5
>>>>
>>>> INFO:Will be mounted to dcache01.tier2.hep.manchester.ac.uk:/fs by dcache-core start-up script.
>>>> INFO:Creating link /pnfs/tier2.hep.manchester.ac.uk --> /pnfs/fs/usr/
>>>> MADE THE SYMBOLIC LINK
>>>> INFO:Link /pnfs/tier2.hep.manchester.ac.uk --> /pnfs/fs/usr already there.
>>>> INFO:[INFO] Creating link /pnfs/ftpBase --> /pnfs/fs which is used by the GridFTP door.
>>>> INFO:PNFS is not running. It is needed to prepare dCache. ...
>>>> ERROR:Not allowed to start it. Set PNFS_START in etc/node_config to 'yes' or start by hand. Exiting.
>>>>
>>>> Sergey
>>>>
>>>> 2008/6/12 Gerd Behrmann <[log in to unmask]>:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Notice that you should set NODE_TYPE to custom - otherwise most of
>>>>> the flags in node_config are not respected.
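>>>>>
>>>>> In your case that would mean something like the following in
>>>>> etc/node_config on the head node (only the relevant lines; the rest
>>>>> stays as you have it):
>>>>>
>>>>> NODE_TYPE=custom
>>>>> PNFS_START=no
>>>>> pnfsManager=no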
>>>>>
>>>>> Cheers,
>>>>>
>>>>> /gerd
>>>>>
>>>>> Sergey wrote:
>>>>>>
>>>>>> Hi
>>>>>>
>>>>>> We have to add to our previous email (below) that the pnfsDomain is
>>>>>> always started on the head node after /opt/d-cache/bin/dcache
>>>>>> start/restart, even though the node is configured in node_config to
>>>>>> use a remote pnfs and not to start pnfs on this one. Here is our
>>>>>> node_config on the head node:
>>>>>>
>>>>>> # $Id: node_config.template,v 1.6 2007-06-19 10:04:10 tigran Exp $
>>>>>> #
>>>>>> NODE_TYPE=admin #admin, pool, door or custom
>>>>>> DCACHE_HOME=/opt/d-cache
>>>>>> POOL_PATH=/opt/d-cache/etc
>>>>>> NUMBER_OF_MOVERS=100
>>>>>>
>>>>>> #
>>>>>> # which namespace in installed?
>>>>>> #
>>>>>> # Possible values is chimera or pnfs
>>>>>> # if nothing is defined or none of above, then pnfs is used
>>>>>> # NAMESPACE=''
>>>>>> PNFS_ROOT=/pnfs
>>>>>> PNFS_INSTALL_DIR=/opt/pnfs
>>>>>> #PNFS_START=yes
>>>>>> PNFS_START=no
>>>>>> PNFS_OVERWRITE=no
>>>>>>
>>>>>> # SERVER_ID=domain.name # defaults to `hostname -d`
>>>>>> SERVER_ID=tier2.hep.manchester.ac.uk
>>>>>> #ADMIN_NODE=myAdminNode # only needed for GridFTP door which is not on the admin node
>>>>>> ADMIN_NODE=dcache01.tier2.hep.manchester.ac.uk
>>>>>> NAMESPACE_NODE=dcache01.tier2.hep.manchester.ac.uk
>>>>>>
>>>>>> # ---- Services to be started on this node
>>>>>> # The following services are only started on this node
>>>>>> # if the corresponding parameter is set to 'yes'.
>>>>>> # Exeption: The PnfsManager is started on the admin node
>>>>>> # if the parameter is not specified.
>>>>>> #
>>>>>> GSIDCAP=no
>>>>>> DCAP=no
>>>>>> GRIDFTP=no
>>>>>> SRM=yes
>>>>>> XROOTD=no
>>>>>>
>>>>>>
>>>>>> #
>>>>>> # Following variables is for admin node only
>>>>>> #
>>>>>>
>>>>>> # ---- Start the Replica Manager on this node.
>>>>>> # The variable 'replicaManager' in config/dCacheSetup has to be set
>>>>>> # to 'yes' on every node of the dCache instance, if the replica manager
>>>>>> # is started with the following variable
>>>>>> # Make sure that there is only one replica manager running in a dCache
>>>>>> # instance.
>>>>>> #
>>>>>> #replicaManager=no # default: no
>>>>>> replicaManager=yes
>>>>>>
>>>>>>
>>>>>> # ---- Start the info provider on this node.
>>>>>> # With this variable, it is possible to install the info provider
>>>>>> # on a separate node and not on the admin node.
>>>>>> #
>>>>>> #infoProvider=yes # default: 'yes' on 'admin' node otherwise 'no'
>>>>>>
>>>>>>
>>>>>> # ---- Start the statistics module
>>>>>> #
>>>>>> # Make sure that statisticsLocation variable in dCacheSetup file points to
>>>>>> # an existing directory.
>>>>>> #
>>>>>> #statistics=no # default: 'no'
>>>>>>
>>>>>>
>>>>>>
>>>>>> ################################################################################
>>>>>> #
>>>>>> #    DO NOT MODIFY THIS PART UNLESS YOU KNOW WHAT YOU ARE DOING
>>>>>> #
>>>>>> #    USED ONLY IF NODE_TYPE=custom
>>>>>> #
>>>>>> ################################################################################
>>>>>>
>>>>>> #
>>>>>> # default components of a admin node
>>>>>> #
>>>>>>
>>>>>> #
>>>>>> # Location manager. Single instace per dCache installation
>>>>>> # Required.
>>>>>> #
>>>>>> lmDomain=yes
>>>>>>
>>>>>> #
>>>>>> # httpd service. Single instace per dCache installation
>>>>>> # optional, recomented
>>>>>> #
>>>>>> httpDomain=yes
>>>>>>
>>>>>> #
>>>>>> # pnfs manager. Single instace per dCache installation
>>>>>> # Required.
>>>>>> #
>>>>>> #pnfsManager=yes
>>>>>> pnfsManager=no
>>>>>>
>>>>>> #
>>>>>> # PoolManager manager (AKA dCacheDomain). Single instace per dCache installation
>>>>>> # Required.
>>>>>> poolManager=yes
>>>>>>
>>>>>> #
>>>>>> # admin door. Single instace per dCache installation
>>>>>> # optional, recomented
>>>>>> #
>>>>>> adminDoor=yes
>>>>>>
>>>>>> #
>>>>>> # utilities ( pinManager and Co.). Single instace per dCache installation
>>>>>> # Required.
>>>>>> #
>>>>>> utilityDomain=yes
>>>>>>
>>>>>> #
>>>>>> # directory lookup service. Single instace per dCache installation
>>>>>> # required if at least one dcapDoor is running
>>>>>> #
>>>>>> dirDomain=yes
>>>>>>
>>>>>> # gPlazma authentification serive. Single instace per dCache installation
>>>>>> #
>>>>>> #
>>>>>> #gPlazmaService=no # default: 'no'
>>>>>> gPlazmaService=yes
>>>>>>
>>>>>>
>>>>>> Sergey
>>>>>>
>>>>>> 2008/6/11 Sergey <[log in to unmask]>:
>>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> We have a split dCache head node setup where pnfs+PostgreSQL is
>>>>>>> installed on a remote node.
>>>>>>> On the head node we have
>>>>>>> PNFS_START=no
>>>>>>> and
>>>>>>> pnfsManager=no
>>>>>>>
>>>>>>> After the server rpm upgrade from 1.8.0-12p6 to 1.8.0-15p5 we had to
>>>>>>> run the /opt/d-cache/install/install.sh script to complete the
>>>>>>> upgrade. It fails with the following error:
>>>>>>>
>>>>>>> [root@dcache01 etc]# /opt/d-cache/install/install.sh
>>>>>>> INFO:Skipping ssh key generation
>>>>>>>
>>>>>>> Checking MasterSetup ./config/dCacheSetup O.k.
>>>>>>>
>>>>>>> Sanning dCache batch files
>>>>>>>
>>>>>>> Processing adminDoor
>>>>>>> Processing chimera
>>>>>>> Processing dCache
>>>>>>> Processing dir
>>>>>>> Processing door
>>>>>>> Processing gPlazma
>>>>>>> Processing gridftpdoor
>>>>>>> Processing gsidcapdoor
>>>>>>> Processing httpd
>>>>>>> Processing infoProvider
>>>>>>> Processing lm
>>>>>>> Processing pnfs
>>>>>>> Processing pool
>>>>>>> Processing replica
>>>>>>> Processing srm
>>>>>>> Processing statistics
>>>>>>> Processing utility
>>>>>>> Processing xrootdDoor
>>>>>>>
>>>>>>>
>>>>>>> Checking Users database .... Ok
>>>>>>> Checking Security .... Ok
>>>>>>> Checking JVM ........ Ok
>>>>>>> Checking Cells ...... Ok
>>>>>>> dCacheVersion ....... Version production-1-8-0-15p5
>>>>>>>
>>>>>>> INFO:Will be mounted to dcache01.tier2.hep.manchester.ac.uk:/fs by dcache-core start-up script.
>>>>>>> ERROR:The file/directory /pnfs/tier2.hep.manchester.ac.uk is in the way. Please move it out
>>>>>>> ERROR:of the way and call me again. Exiting.
>>>>>>>
>>>>>>> As a result, the upgrade is not complete and:
>>>>>>>
>>>>>>> 1. we still observe the old version on the Admin web page:
>>>>>>> SRM-dcache01 srm-dcache01Domain 0 3 80 msec 06/11 14:51:18
>>>>>>> production-1-8-0-12p6(1.151)
>>>>>>>
>>>>>>> 2. the SRM protocol does not work properly: we can srmcp TO the
>>>>>>> dCache, delete with srmrm, and srmcp FROM it, but "srmcp TO"
>>>>>>> hangs on the client side at the very end of the operation and then
>>>>>>> ends with a timeout error:
>>>>>>>
>>>>>>> SRMClientV2 : srmPrepareToPut, contacting service httpg://dcache01.tier2.hep.manchester.ac.uk:8443/srm/managerv2
>>>>>>> Wed Jun 11 16:48:22 BST 2008: srm returned requestToken = -2147098616
>>>>>>> Wed Jun 11 16:48:22 BST 2008: sleeping 1 seconds ...
>>>>>>> copy_jobs is not empty
>>>>>>> Wed Jun 11 16:48:23 BST 2008: no more pending transfers, breaking the loop
>>>>>>> copying CopyJob, source = file:////bin/bash destination = gsiftp://bohr3931.tier2.hep.manchester.ac.uk:2811//pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
>>>>>>> GridftpClient: memory buffer size is set to 131072
>>>>>>> GridftpClient: connecting to bohr3931.tier2.hep.manchester.ac.uk on port 2811
>>>>>>> GridftpClient: gridFTPClient tcp buffer size is set to 0
>>>>>>> GridftpClient: gridFTPWrite started, source file is java.io.RandomAccessFile@2f0d54 destination path is /pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
>>>>>>> GridftpClient: gridFTPWrite started, destination path is /pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
>>>>>>> GridftpClient: set local data channel authentication mode to None
>>>>>>> GridftpClient: parallelism: 10
>>>>>>> GridftpClient: adler32 for file java.io.RandomAccessFile@2f0d54 is 0b6f56d6
>>>>>>> GridftpClient: waiting for completion of transfer
>>>>>>> GridftpClient: starting a transfer to /pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
>>>>>>> GridftpClient: DiskDataSink.close() called
>>>>>>> GridftpClient: gridFTPWrite() wrote 616248bytes
>>>>>>> GridftpClient: closing client : org.dcache.srm.util.GridftpClient$FnalGridFTPClient@15dd910
>>>>>>> GridftpClient: closed client
>>>>>>> execution of CopyJob, source = file:////bin/bash destination = gsiftp://bohr3931.tier2.hep.manchester.ac.uk:2811//pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1 completed
>>>>>>> SRMClientV2 : put: try # 0 failed with error
>>>>>>> SRMClientV2 : ; nested exception is:
>>>>>>> java.net.SocketTimeoutException: Read timed out
>>>>>>> SRMClientV2 : put: try again
>>>>>>> SRMClientV2 : sleeping for 10000 milliseconds before retrying
>>>>>>> SRMClientV2 : put: try # 1 failed with error
>>>>>>> SRMClientV2 : ; nested exception is:
>>>>>>> ....
>>>>>>>
>>>>>>> However, the file appears in place and can be copied back or
>>>>>>> deleted with srmrm.
>>>>>>> Also, the srmls command does not work at all (it just hangs).
>>>>>>>
>>>>>>> Does anybody have a suggestion on how to complete the upgrade correctly?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Sergey
>>>>>>>
>>>>>>> Manchester Tier2
>>>>>>>
>>>>>
>>>
>>>
>>
>
>