Hi
We have to add to our previous email (below) that pnfsDomain always
starts on the head node after /opt/d-cache/bin/dcache start/restart,
even though the node is configured in node_config to use a remote
PNFS and not to start PNFS locally. Here is our node_config on the
head node:
# $Id: node_config.template,v 1.6 2007-06-19 10:04:10 tigran Exp $
#
NODE_TYPE=admin #admin, pool, door or custom
DCACHE_HOME=/opt/d-cache
POOL_PATH=/opt/d-cache/etc
NUMBER_OF_MOVERS=100
#
# Which namespace is installed?
#
# Possible values are chimera or pnfs.
# If nothing is defined, or none of the above, then pnfs is used.
#
NAMESPACE=''
PNFS_ROOT=/pnfs
PNFS_INSTALL_DIR=/opt/pnfs
#PNFS_START=yes
PNFS_START=no
PNFS_OVERWRITE=no
# SERVER_ID=domain.name # defaults to `hostname -d`
SERVER_ID=tier2.hep.manchester.ac.uk
#ADMIN_NODE=myAdminNode # only needed for a GridFTP door which is not on the admin node
ADMIN_NODE=dcache01.tier2.hep.manchester.ac.uk
NAMESPACE_NODE=dcache01.tier2.hep.manchester.ac.uk
# ---- Services to be started on this node
# The following services are only started on this node
# if the corresponding parameter is set to 'yes'.
# Exception: The PnfsManager is started on the admin node
# if the parameter is not specified.
#
GSIDCAP=no
DCAP=no
GRIDFTP=no
SRM=yes
XROOTD=no
#
# The following variables are for the admin node only
#
# ---- Start the Replica Manager on this node.
# The variable 'replicaManager' in config/dCacheSetup has to be set
# to 'yes' on every node of the dCache instance, if the replica manager
# is started with the following variable
# Make sure that there is only one replica manager running in a dCache
# instance.
#
#replicaManager=no # default: no
replicaManager=yes
# ---- Start the info provider on this node.
# With this variable, it is possible to install the info provider
# on a separate node and not on the admin node.
#
#infoProvider=yes # default: 'yes' on 'admin' node otherwise 'no'
# ---- Start the statistics module
#
# Make sure that statisticsLocation variable in dCacheSetup file points to
# an existing directory.
#
#statistics=no # default: 'no'
################################################################################
# #
# #
# DO NOT MODIFY THIS PART UNLESS YOU KNOW WHAT YOU ARE DOING #
# #
# USED ONLY IF NODE_TYPE=custom #
# #
################################################################################
#
# default components of an admin node
#
#
# Location manager. Single instance per dCache installation
# Required.
#
lmDomain=yes
#
# httpd service. Single instance per dCache installation
# optional, recommended
#
httpDomain=yes
#
# pnfs manager. Single instance per dCache installation
# Required.
#
#pnfsManager=yes
pnfsManager=no
#
# PoolManager (a.k.a. dCacheDomain). Single instance per dCache installation
# Required.
poolManager=yes
#
# admin door. Single instance per dCache installation
# optional, recommended
#
adminDoor=yes
#
# utilities (pinManager and Co.). Single instance per dCache installation
# Required.
#
utilityDomain=yes
#
# directory lookup service. Single instance per dCache installation
# required if at least one dcapDoor is running
#
dirDomain=yes
# gPlazma authentication service. Single instance per dCache installation
#
#
#gPlazmaService=no # default: 'no'
gPlazmaService=yes
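For completeness, here is the quick sanity check we use to see which service flags the file actually enables. This is only a sketch run against a trimmed sample; on the node itself one would point the grep at /opt/d-cache/etc/node_config instead:

```shell
# Sketch: list the service flags set to "yes" in node_config.
# A trimmed sample is used here; on the head node the real file is
# /opt/d-cache/etc/node_config.
cat > /tmp/node_config.sample <<'EOF'
PNFS_START=no
replicaManager=yes
pnfsManager=no
lmDomain=yes
EOF

# Print every line that ends in "=yes", i.e. every enabled service flag.
grep -E '=yes$' /tmp/node_config.sample
```

As the sample shows, pnfsManager=no is set here exactly as on our head node, so pnfsDomain should not be among the enabled services.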
Sergey
2008/6/11 Sergey <[log in to unmask]>:
> Hi
>
> We have a split dCache head node, where PNFS + PostgreSQL are
> installed on a remote node.
> On the head node we have
> PNFS_START=no
> and
> pnfsManager=no
>
> After upgrading the server RPM from 1.8.0-12p6 to 1.8.0-15p5, we had to run
> the /opt/d-cache/install/install.sh script to complete the upgrade. It fails
> with the following error:
>
> [root@dcache01 etc]# /opt/d-cache/install/install.sh
> INFO:Skipping ssh key generation
>
> Checking MasterSetup ./config/dCacheSetup O.k.
>
> Scanning dCache batch files
>
> Processing adminDoor
> Processing chimera
> Processing dCache
> Processing dir
> Processing door
> Processing gPlazma
> Processing gridftpdoor
> Processing gsidcapdoor
> Processing httpd
> Processing infoProvider
> Processing lm
> Processing pnfs
> Processing pool
> Processing replica
> Processing srm
> Processing statistics
> Processing utility
> Processing xrootdDoor
>
>
> Checking Users database .... Ok
> Checking Security .... Ok
> Checking JVM ........ Ok
> Checking Cells ...... Ok
> dCacheVersion ....... Version production-1-8-0-15p5
>
> INFO:Will be mounted to dcache01.tier2.hep.manchester.ac.uk:/fs by
> dcache-core start-up script.
> ERROR:The file/directory /pnfs/tier2.hep.manchester.ac.uk is in the way.
> Please move it out
> ERROR:of the way and call me again. Exiting.
>
> As a result the upgrade is not completed and
>
> 1. we still observe the old version on the Admin web page:
> SRM-dcache01 srm-dcache01Domain 0 3 80 msec 06/11 14:51:18
> production-1-8-0-12p6(1.151)
>
> 2. the SRM protocol does not work properly: we can srmcp TO the
> dCache, delete with srmrm, and srmcp FROM it, but "srmcp TO" hangs
> on the client side at the very end of the operation and then fails
> with a timeout error:
>
> SRMClientV2 : srmPrepareToPut, contacting service
> httpg://dcache01.tier2.hep.manchester.ac.uk:8443/srm/managerv2
> Wed Jun 11 16:48:22 BST 2008: srm returned requestToken = -2147098616
> Wed Jun 11 16:48:22 BST 2008: sleeping 1 seconds ...
> copy_jobs is not empty
> Wed Jun 11 16:48:23 BST 2008: no more pending transfers, breaking the loop
> copying CopyJob, source = file:////bin/bash destination =
> gsiftp://bohr3931.tier2.hep.manchester.ac.uk:2811//pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
> GridftpClient: memory buffer size is set to 131072
> GridftpClient: connecting to bohr3931.tier2.hep.manchester.ac.uk on port
> 2811
> GridftpClient: gridFTPClient tcp buffer size is set to 0
> GridftpClient: gridFTPWrite started, source file is
> java.io.RandomAccessFile@2f0d54 destination path is
> /pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
> GridftpClient: gridFTPWrite started, destination path is
> /pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
> GridftpClient: set local data channel authentication mode to None
> GridftpClient: parallelism: 10
> GridftpClient: adler32 for file java.io.RandomAccessFile@2f0d54 is 0b6f56d6
> GridftpClient: waiting for completion of transfer
> GridftpClient: starting a transfer to
> /pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
> GridftpClient: DiskDataSink.close() called
> GridftpClient: gridFTPWrite() wrote 616248bytes
> GridftpClient: closing client :
> org.dcache.srm.util.GridftpClient$FnalGridFTPClient@15dd910
> GridftpClient: closed client
> execution of CopyJob, source = file:////bin/bash destination =
> gsiftp://bohr3931.tier2.hep.manchester.ac.uk:2811//pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
> completed
> SRMClientV2 : put: try # 0 failed with error
> SRMClientV2 : ; nested exception is:
> java.net.SocketTimeoutException: Read timed out
> SRMClientV2 : put: try again
> SRMClientV2 : sleeping for 10000 milliseconds before retrying
> SRMClientV2 : put: try # 1 failed with error
> SRMClientV2 : ; nested exception is:
> ....
>
> The file nevertheless appears in place and can be copied back or deleted
> with srmrm. Also, the srmls command does not work at all (it just hangs).
>
> Does anybody have a suggestion how to complete the upgrade correctly?
>
> Thanks
>
> Sergey
>
> Manchester Tier2
>
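
P.S. Regarding the install.sh error quoted above ("The file/directory
/pnfs/tier2.hep.manchester.ac.uk is in the way"), the message itself asks
for the path to be moved out of the way before re-running the script. An
untested sketch of that workaround, simulated under /tmp here (on the real
node the path would be /pnfs/tier2.hep.manchester.ac.uk and the script
/opt/d-cache/install/install.sh):

```shell
# Untested sketch: move the conflicting directory aside, then re-run
# install.sh. Simulated under /tmp so nothing on the node is touched;
# the real path would be /pnfs/tier2.hep.manchester.ac.uk.
mkdir -p /tmp/pnfs-demo/tier2.hep.manchester.ac.uk

# Rename the directory out of the way, keeping it as a backup.
mv /tmp/pnfs-demo/tier2.hep.manchester.ac.uk \
   /tmp/pnfs-demo/tier2.hep.manchester.ac.uk.pre-upgrade

# On the node, install.sh would be re-run at this point.
ls /tmp/pnfs-demo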