Hi

We have a split dCache head node setup, where pnfs and PostgreSQL are
installed on a remote node.
On the head node we have
PNFS_START=no
and
pnfsManager=no

After upgrading the server RPM from 1.8.0-12p6 to 1.8.0-15p5 we had to run
the /opt/d-cache/install/install.sh script to complete the upgrade. It fails
with the following error:

[root@dcache01 etc]# /opt/d-cache/install/install.sh
INFO:Skipping ssh key generation

Checking MasterSetup ./config/dCacheSetup O.k.

Sanning dCache batch files

Processing adminDoor
Processing chimera
Processing dCache
Processing dir
Processing door
Processing gPlazma
Processing gridftpdoor
Processing gsidcapdoor
Processing httpd
Processing infoProvider
Processing lm
Processing pnfs
Processing pool
Processing replica
Processing srm
Processing statistics
Processing utility
Processing xrootdDoor


Checking Users database .... Ok
Checking Security .... Ok
Checking JVM ........ Ok
Checking Cells ...... Ok
dCacheVersion ....... Version production-1-8-0-15p5

INFO:Will be mounted to dcache01.tier2.hep.manchester.ac.uk:/fs by
dcache-core start-up script.
ERROR:The file/directory /pnfs/tier2.hep.manchester.ac.uk is in the way.
Please move it out
ERROR:of the way and call me again. Exiting.
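
In case it helps with diagnosis, here is a minimal sketch of how the path
named in the error could be inspected before moving anything. The helper
name describe_path is purely illustrative (it is not part of dCache), and
the mount check assumes a Linux host where /proc/mounts is available:

```shell
#!/bin/sh
# Hypothetical helper (not part of dCache): report what currently occupies a path.
describe_path() {
    p="$1"
    # Test for a symlink first, because -d and -e follow symlinks.
    if [ -L "$p" ]; then
        echo "symlink"
    elif [ -d "$p" ]; then
        echo "directory"
    elif [ -e "$p" ]; then
        echo "file"
    else
        echo "missing"
    fi
}

# Path taken from the install.sh error message above.
target=/pnfs/tier2.hep.manchester.ac.uk

echo "$target is: $(describe_path "$target")"

# Show any mounts under /pnfs (assumes Linux, where /proc/mounts exists).
if [ -r /proc/mounts ]; then
    grep ' /pnfs' /proc/mounts || echo "nothing mounted under /pnfs"
fi
```

This only reads state and changes nothing; whether the entry is a stale
symlink, a plain directory or a live mount would presumably determine how
it can safely be moved out of the way.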

As a result, the upgrade did not complete, and:

1. we still observe the old version on the Admin web page:
SRM-dcache01 srm-dcache01Domain 0 3 80 msec 06/11 14:51:18
production-1-8-0-12p6(1.151)

2. the SRM protocol does not work properly: we can srmcp TO the
dCache, delete with srmrm and srmcp FROM it, but the "srmcp TO" hangs
on the client side at the very end of the operation and then fails
with a timeout error:

SRMClientV2 : srmPrepareToPut, contacting service
httpg://dcache01.tier2.hep.manchester.ac.uk:8443/srm/managerv2
Wed Jun 11 16:48:22 BST 2008: srm returned requestToken = -2147098616
Wed Jun 11 16:48:22 BST 2008: sleeping 1 seconds ...
copy_jobs is not empty
Wed Jun 11 16:48:23 BST 2008: no more pending transfers, breaking the loop
copying CopyJob, source = file:////bin/bash destination =
gsiftp://bohr3931.tier2.hep.manchester.ac.uk:2811//pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
GridftpClient: memory buffer size is set to 131072
GridftpClient: connecting to bohr3931.tier2.hep.manchester.ac.uk on port
2811
GridftpClient: gridFTPClient tcp buffer size is set to 0
GridftpClient: gridFTPWrite started, source file is
java.io.RandomAccessFile@2f0d54 destination path is
/pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
GridftpClient: gridFTPWrite started, destination path is
/pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
GridftpClient: set local data channel authentication mode to None
GridftpClient: parallelism: 10
GridftpClient: adler32 for file java.io.RandomAccessFile@2f0d54 is 0b6f56d6
GridftpClient: waiting for completion of transfer
GridftpClient: starting a transfer to
/pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
GridftpClient: DiskDataSink.close() called
GridftpClient: gridFTPWrite() wrote 616248bytes
GridftpClient: closing client :
org.dcache.srm.util.GridftpClient$FnalGridFTPClient@15dd910
GridftpClient: closed client
execution of CopyJob, source = file:////bin/bash destination =
gsiftp://bohr3931.tier2.hep.manchester.ac.uk:2811//pnfs/tier2.hep.manchester.ac.uk/data/dteam/basht1
completed
SRMClientV2 : put: try # 0 failed with error
SRMClientV2 : ; nested exception is:
java.net.SocketTimeoutException: Read timed out
SRMClientV2 : put: try again
SRMClientV2 : sleeping for 10000 milliseconds before retrying
SRMClientV2 : put: try # 1 failed with error
SRMClientV2 : ; nested exception is:
....

Nevertheless, the file appears in place and can be copied back or deleted
with srmrm. The srmls command also does not work at all (it just hangs).

Does anybody have a suggestion on how to complete the upgrade correctly?

Thanks

Sergey

 Manchester Tier2