Hi All,
I'm getting a lot of random SFT failures against my dCache: the write
of the file to the dCache appears successful, but when the SFT then
tries to read the file back it gets:
+ lcg-cp -v --vo dteam lfn:sft-lcg-rm-cr-heplnx48.pp.rl.ac.uk.0605220722 file:///scratch/WMS_heplnx48_018249_https_3a_2f_2fgdrb02.cern.ch_3a9000_2fLxXmsliu9ehFjCWOYEcxQg/sft-lcg-rm-cp.txt
the server sent an error response: 553 553 Permission denied, reason: CacheException(rc=666;msg=can't get pnfsId (not a pnfsfile))
lcg_cp: Permission denied
Using grid catalog type: lfc
Using grid catalog : prod-lfc-shared-central.cern.ch
It appears that the write was indeed successful because the same SFT can
later replicate it to CERN:
Replicate the file from the default SE to castorgrid.cern.ch
+ lcg-rep -v --vo dteam -d castorgrid.cern.ch lfn:sft-lcg-rm-cr-heplnx48.pp.rl.ac.uk.0605220722
0 bytes 0.00 KB/sec avg 0.00 KB/sec inst
0 bytes 0.00 KB/sec avg 0.00 KB/sec inst
0 bytes 0.00 KB/sec avg 0.00 KB/sec inst
Using grid catalog type: lfc
Using grid catalog : prod-lfc-shared-central.cern.ch
Source URL: lfn:/grid/dteam/SFT/sft-lcg-rm-cr-heplnx48.pp.rl.ac.uk.0605220722
File size: 233
VO name: dteam
Destination specified: castorgrid.cern.ch
Source URL for copy: gsiftp://heplnx204.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dteam/generated/2006-05-22/file330985b9-5368-4e67-82ec-5ee6f6fd4fa8
Destination URL for copy: gsiftp://castorgrid.cern.ch/castor/cern.ch/grid/dteam/generated/2006-05-22/file8c15f735-de68-4949-aba5-33c9098462ff
# streams: 1
# set timeout to 0
Transfer took 2020 ms
Destination URL registered in LRC: sfn://castorgrid.cern.ch/castor/cern.ch/grid/dteam/generated/2006-05-22/file8c15f735-de68-4949-aba5-33c9098462ff
+ result=0
+ set +x
List replicas to check if replication was really successful
+ lcg-lr --vo dteam lfn:sft-lcg-rm-cr-heplnx48.pp.rl.ac.uk.0605220722
sfn://castorgrid.cern.ch/castor/cern.ch/grid/dteam/generated/2006-05-22/file8c15f735-de68-4949-aba5-33c9098462ff
srm://heplnx204.pp.rl.ac.uk/pnfs/pp.rl.ac.uk/data/dteam/generated/2006-05-22/file330985b9-5368-4e67-82ec-5ee6f6fd4fa8
+ set +x
I was always getting a few of these, but since I added extra VOs a week
ago I now seem to be failing between 30 and 50% of the SFT runs on this
error alone.
I haven't managed to reproduce the error by copying files in and out
multiple times, and the SFT deletes the file afterwards, so I cannot
check the status of a file that the error occurred with.
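
A write/read-back loop that skips the delete whenever the read fails would leave the affected file on the SE for inspection. The following is only a minimal sketch under stated assumptions: the VO and SE names come from the traces above, the LFN naming and scratch paths are hypothetical, and the lcg-cr / lcg-del options are assumed to match the installed lcg_util version.

```shell
#!/bin/sh
# Sketch: write a small file, read it back, and keep it on the SE if the
# read-back fails. VO and SE are taken from the traces above; the LFN
# prefix "repro-test" and the scratch file names are hypothetical.
VO=${VO:-dteam}
SE=${SE:-heplnx204.pp.rl.ac.uk}
LCG_CR=${LCG_CR:-lcg-cr}     # copy-and-register
LCG_CP=${LCG_CP:-lcg-cp}     # read back via the catalogue
LCG_DEL=${LCG_DEL:-lcg-del}  # clean up after a successful run

# repro_loop COUNT
# Returns 0 only if every write and read-back succeeded.
repro_loop() {
    count=$1
    failures=0
    i=1
    while [ "$i" -le "$count" ]; do
        lfn="lfn:repro-test-$$-$i"
        src="${TMPDIR:-/tmp}/repro-src-$$-$i"
        dst="${TMPDIR:-/tmp}/repro-dst-$$-$i"
        date > "$src"

        if ! "$LCG_CR" -v --vo "$VO" -d "$SE" -l "$lfn" "file://$src" >/dev/null 2>&1; then
            echo "run $i: write failed"
            failures=$((failures + 1))
        elif ! "$LCG_CP" -v --vo "$VO" "$lfn" "file://$dst" >/dev/null 2>&1; then
            # Deliberately no delete here, so the replica can be examined.
            echo "run $i: read-back failed, kept $lfn"
            failures=$((failures + 1))
        else
            "$LCG_DEL" -a --vo "$VO" "$lfn" >/dev/null 2>&1
        fi

        rm -f "$src" "$dst"
        i=$((i + 1))
    done
    echo "$failures of $count runs failed"
    [ "$failures" -eq 0 ]
}
```

Calling `repro_loop 50` after sourcing the file would run fifty cycles and print a summary line. Any LFN reported as kept can then be looked up with lcg-lr and checked in pnfs.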
Googling for the error suggests that it's not uncommon, but I don't
see any indication of a cause or solution. There doesn't seem to be
anything in the logs.
Anyone know what I can do about this (other than install DPM)?
Thanks,
Chris.
Examples taken from:
https://lcg-sft.cern.ch/sft/info/heplnx201.pp.rl.ac.uk/sft_2006-05-22_07.10.05.html#sft-lcg-rm_2006-05-22_07:22:49