Thanks Grieg.
I don't understand the errors but:
lcg-cr --vo dteam -v -d heplnx204.pp.rl.ac.uk
file:/home/hephp/brew/9936870.moose.01.root
works whereas:
lcg-cr --vo dteam -v -d heplnx204.pp.rl.ac.uk
file://home/hephp/brew/9936870.moose.01.root
doesn't
OK, the file:// url isn't valid (no less so than file:////) but what the
hell is it breaking that causes the "File Exists" error message from
dCache?
Now to try to put my system back together.
Thanks again,
Chris "Got a headache from banging my head against the wall" Brew
> -----Original Message-----
> From: Greig A Cowan [mailto:[log in to unmask]]
> Sent: 24 October 2006 14:46
> To: [log in to unmask]; [log in to unmask]
> Subject: RE: lcg-cr fails with "File Exists"
>
>
> Chris,
>
> I just ran this command and it worked:
>
> $ lcg-cr -v --vo dteam file:/tmp/greig1 -d
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/g
> reig_test_dir/test-10-24
> Using grid catalog type: lfc
> Using grid catalog : prod-lfc-shared-central.cern.ch Source
> URL: file:/tmp/greig1 File size: 17371 VO name: dteam
> Destination specified:
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/g
> reig_test_dir/test-10-24
> Destination URL for copy:
> gsiftp://heplnx204.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dte
> am/greig_test_dir/test-10-24
> # streams: 1
> # set timeout to 0 seconds
> Alias registered in Catalog:
> lfn:/grid/dteam/generated/2006-10-24/file-16b145d4-b9ec-4170-b
> 4b2-94af6433e9c7
> 17371 bytes 17.39 KB/sec avg 17.39 KB/sec inst
> Transfer took 2050 ms
> Destination URL registered in Catalog:
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/g
> reig_test_dir/test-10-24
> guid:0a575034-5bb7-423c-891b-8a9a101659eb
>
> and I could copy it back:
>
> $ lcg-cp -v --vo dteam
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/g
> reig_test_dir/test-10-24
> file:/tmp/greig2
> Using grid catalog type: lfc
> Using grid catalog : prod-lfc-shared-central.cern.ch Source URL:
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/g
> reig_test_dir/test-10-24
> File size: 17371
> VO name: dteam
> Source URL for copy:
> gsiftp://heplnx204.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dte
> am/greig_test_dir/test-10-24
> Destination URL: file:/tmp/greig2
> # streams: 1
> # set timeout to 0 (seconds)
> 0 bytes 0.00 KB/sec avg 0.00 KB/sec inst
> Transfer took 2030 ms
>
>
> Greig
>
>
> On Tue, 24 Oct 2006, Brew, CAJ (Chris) wrote:
>
> > Hi,
> >
> > Thanks (I think). The relavant bit of the logs for an
> lcg-cr appears
> > to
> > be:
> >
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : REPLY(reset=true GReplyType=enc): <211-OK EOF PARALLEL
> SIZE SBUF
> > ERET ESTO
> > 211 End>
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : Tue Oct 24 14:19:55 BST 2006: FTP CMD: <enc ... >
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : Secure command: <TYPE I>
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : Tue Oct 24 14:19:55 BST 2006: FTP CMD: <TYPE I>
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : REPLY(reset=true GReplyType=enc): <200 Type set to I>
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : Tue Oct 24 14:19:55 BST 2006: FTP CMD: <enc ... >
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : Secure command: <MKD
> > //pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24>
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : Tue Oct 24 14:19:55 BST 2006: FTP CMD: <MKD
> > //pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24>
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : absolute path is
> "/pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24"
> > root is /
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : PermissionHandler:
> >
> canWrite(37401,24311,/pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24
> > )
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : PermissionHandler: fileCanWrite() file meta =
> >
> [drwxr-xr-x;37401;24311][c=10.23-11:58:24;m=10.24-01:38:44;a=10.24-01:
> > 38
> > :44]
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : PermissionHandler: fileCanWrite() file write allowed : true
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : PnfsHandler : CacheException (2) : Pnfs error : failed :
> File exists
> > 10/24 14:19:55
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : REPLY(reset=true GReplyType=enc): <553 Permission denied, reason:
> > CacheException(rc=2;msg=Pnfs error : failed : File exists)>
> > 10/24 14:19:57
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : AbstractFtpDoorV1::run(): _pnfsEntryIncomplete=false, _path=null
> > 10/24 14:19:57
> > Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
> > : EOS encountered
> >
> > Looking at the log for a successful srmcp that appears to
> use "MODE E"
> > not "TYPE I" and from then the logging looks completely different.
> > IIRC "MODE E" is some sort of append mode so maybe that's
> why srmcp works.
> > The real problem is why pnfs thinks the file exists even when it
> > doesn't.
> >
> > The problem appears to be the same for all VOs and target
> directories
> > so it's unlikely to be "phantom" entries in pnfs.
> >
> > I guess I'll be looking in the pnfsDomain logs now.
> >
> > Thanks,
> > Chris.
> >
> > > -----Original Message-----
> > > From: Greig A Cowan [mailto:[log in to unmask]]
> > > Sent: 24 October 2006 14:16
> > > To: [log in to unmask]; [log in to unmask]
> > > Subject: RE: lcg-cr fails with "File Exists"
> > >
> > > Chris,
> > >
> > > You can increase the level of logging in the gridftp door logs by
> > > increasing the number in the first line of the gridftpdoor.batch
> > > file
> > >
> > > set printout default 2
> > >
> > > going to 3 or 4 should give you plenty of extra information.
> > >
> > > Greig
> > >
> > > On Tue, 24 Oct 2006, Brew, CAJ (Chris) wrote:
> > >
> > > > Hi Jon,
> > > >
> > > > I don't think so. The stuff from the kpwd file is the "/ /
> > > /" at the
> > > > end of the
> > > >
> > > > login <username> ...
> > > >
> > > > lines right?
> > > >
> > > > They're all set to the same and I haven't changed anything in
> > > > srm.batch
> > > > - though I cannot see anything that looks like a user root
> > > in that file.
> > > >
> > > > The only log entry I can find for the initial error is int the
> > > > gridftpdoor log and only consists of:
> > > >
> > > > 10/24 13:25:26
> > > > Cell(GFTP-heplnx204-Unknown-115@gridftp-heplnx204Domain)
> > > > : PnfsHandler : CacheException (2) : Pnfs error : failed :
> > > File exists
> > > >
> > > > which isn't very informative and of course lcg-cr
> doesn't have a
> > > > proper --debug option so it's very hard to tell exactly
> > > what it's doing.
> > > >
> > > > The other very limited errors all seem to relate to the failed
> > > > srm-advisory delete. Can I turn the logging up any higher?
> > > >
> > > > Thanks,
> > > > Chris.
> > > >
> > > > > -----Original Message-----
> > > > > From: bakken [mailto:[log in to unmask]]
> > > > > Sent: 24 October 2006 12:25
> > > > > To: [log in to unmask]
> > > > > Subject: Re: lcg-cr fails with "File Exists"
> > > > >
> > > > > My only guess is that the extra file components gridftp and
> > > > > srmcp can add to the filepath. For example, the dcache.kpwd
> > > > > file has a user root that is added, and srmcp has a different
> > > > > one in the srm batch file. Many people run with these
> both set
> > > > > to /, so
> > > they become
> > > > > degenerate. The gridftp component changes depending on
> > > the user, so
> > > > > your DN can make a difference.
> > > > >
> > > > > Complete filenames should be listed in the log, but it
> > > may be hard
> > > > > to find.
> > > > >
> > > > > Any chance that's it?
> > > > >
> > > > > <R4551|0n 7h3r3'5 n0 p|4c3 |1k3 127.0.0.1
> > > > > Jon A. Bakken [log in to unmask] (630) 840-4790
> > > > >
> > > > > On Tue, 24 Oct 2006, Brew, CAJ (Chris) wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I've got a problem with dcache that I've not seen before:
> > > > > >
> > > > > > lcg-cr fails with "File Exists" even though it doesn't.
> > > > > >
> > > > > > heplnx101 - ~ $ lcg-cr --vo babar -v -d
> > > > > >
> > > > >
> > >
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/babar/brew/te
> > > > > st
> > > > > > -2
> > > > > > 00610241145 file://home/hephp/brew/9936870.moose.01.root
> > > > > > Using grid catalog type: lfc
> > > > > > Using grid catalog : lfcserver.cnaf.infn.it Source URL:
> > > > > > file://home/hephp/brew/9936870.moose.01.root
> > > > > > File size: 13033805
> > > > > > VO name: babar
> > > > > > Destination specified:
> > > > > >
> > > > >
> > >
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/babar/brew/te
> > > > > st
> > > > > > -2
> > > > > > 00610241145
> > > > > > Destination URL for copy:
> > > > > >
> > > > >
> > >
> gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/babar/bre
> > > > > w/
> > > > > > te
> > > > > > st-200610241145
> > > > > > # streams: 1
> > > > > > # set timeout to 0 seconds
> > > > > > Alias registered in Catalog:
> > > > > >
> > > > >
> > >
> lfn:/grid/babar/generated/2006-10-24/file-981c5a78-d7fd-4481-88d4-c8
> > > > > fa
> > > > > > 6a
> > > > > > a4998f
> > > > > > the server sent an error response: 553 553 Permission
> > > > > denied, reason:
> > > > > > CacheException(rc=2;msg=Pnfs error : failed : File exists)
> > > > > >
> > > > > > Copy Failed: Unregistering alias from catalog.
> > > > > > advisoryDelete(User [name=babar001, uid=37201, gid=24267,
> > > > > > root=/],pnfs/pp.rl.ac.uk/data/babar/brew/test-200610241145)
> > > > > Error file
> > > > > > does not exist, cannot delete
> > > > > >
> > > > > > I can however srmcp, or globus-url-copy the a file to
> > > the same path:
> > > > > >
> > > > > > heplnx101 - ~ $ srmcp -debug
> > > > > > file:////home/hephp/brew/9936870.moose.01.root
> > > > > >
> > > > >
> > >
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/brew/te
> > > > > st
> > > > > > -2
> > > > > > 00610241145
> > > > > > Storage Resource Manager (SRM) CP Client version 1.21
> > > > > > Copyright
> > > > > > (c)
> > > > > > 2002-2006 Fermi National Accelerator Laboratory
> > > > > >
> > > > > > SRM Configuration:
> > > > > > debug=true
> > > > > > gsissl=true
> > > > > > help=false
> > > > > > pushmode=false
> > > > > > userproxy=true
> > > > > > buffer_size=131072
> > > > > > tcp_buffer_size=0
> > > > > > streams_num=10
> > > > > > config_file=/home/hephp/brew/.srmconfig/config.xml
> > > > > > glue_mapfile=/opt/d-cache/srm/conf/SRMServerV1.map
> > > > > > webservice_path=srm/managerv1.wsdl
> > > > > > webservice_protocol=https
> > > > > > gsiftpclinet=globus-url-copy
> > > > > > protocols_list=http,gsiftp
> > > > > > save_config_file=null
> > > > > > srmcphome=/opt/d-cache/srm
> > > > > > urlcopy=/opt/d-cache/srm/sbin/url-copy.sh
> > > > > > x509_user_cert=/home/hephp/brew/.globus/usercert.pem
> > > > > > x509_user_key=/home/hephp/brew/.globus/userkey.pem
> > > > > > x509_user_proxy=/tmp/x509up_u24431
> > > > > >
> > > > > x509_user_trusted_certificates=/etc/grid-security/certificates
> > > > > > globus_tcp_port_range=null
> > > > > > gss_expected_name=null
> > > > > > storagetype=permanent
> > > > > > retry_num=20
> > > > > > retry_timeout=10000
> > > > > > wsdl_url=null
> > > > > > use_urlcopy_script=false
> > > > > > connect_to_wsdl=false
> > > > > > delegate=true
> > > > > > full_delegation=true
> > > > > > server_mode=passive
> > > > > > srm_protocol_version=1
> > > > > > request_lifetime=86400
> > > > > >
> from[0]=file:////home/hephp/brew/9936870.moose.01.root
> > > > > >
> > > > > >
> > > > >
> > >
> to=srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/brew
> > > > > /t
> > > > > > es
> > > > > > t-200610241145
> > > > > >
> > > > > > Tue Oct 24 12:01:21 BST 2006: starting SRMPutClient
> Tue Oct 24
> > > > > > 12:01:21 BST 2006: In SRMClient ExpectedName: host Tue Oct
> > > > > 24 12:01:21
> > > > > > BST 2006: SRMClient(https,srm/managerv1,true)
> > > > > > SRMClientV1 : user credentials are:
> > > > > > /C=UK/O=eScience/OU=CLRC/L=RAL/CN=chris dteam brew
> > > > > > SRMClientV1 : SRMClientV1 calling
> > > > > > org.globus.axis.util.Util.registerTransport()
> > > > > > SRMClientV1 : connecting to srm at
> > > > > > httpg://heplnx204.pp.rl.ac.uk:8443/srm/managerv1
> > > > > > Tue Oct 24 12:01:22 BST 2006: connected to server,
> > > > > obtaining proxy Tue
> > > > > > Oct 24 12:01:22 BST 2006: got proxy of type class
> > > > > > org.dcache.srm.client.SRMClientV1 Tue Oct 24 12:01:22 BST
> > > > > > 2006: source file#0 :
> > > > > > /home/hephp/brew/9936870.moose.01.root
> > > > > > SRMClientV1 : put,
> > > > > >
> > > > >
> > >
> sources[0]="srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/d
> > > > > te
> > > > > > am
> > > > > > /brew/test-200610241145"
> > > > > > SRMClientV1 : put,
> > > > > >
> > > > >
> > >
> dests[0]="srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dte
> > > > > am
> > > > > > /b
> > > > > > rew/test-200610241145"
> > > > > > SRMClientV1 : put, protocols[0]="http"
> > > > > > SRMClientV1 : put, protocols[1]="dcap"
> > > > > > SRMClientV1 : put, protocols[2]="gsiftp"
> > > > > > SRMClientV1 : put, contacting service
> > > > > > httpg://heplnx204.pp.rl.ac.uk:8443/srm/managerv1
> > > > > > copy_jobs is empty
> > > > > > Tue Oct 24 12:01:26 BST 2006: srm returned requestId =
> > > > > > -2146176534 Tue Oct 24 12:01:26 BST 2006: sleeping
> 1 seconds ...
> > > > > > Tue Oct 24 12:01:28 BST 2006: FileRequestStatus with
> > > > > >
> > > > >
> > >
> SURL=srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/br
> > > > > ew
> > > > > > /t
> > > > > > est-200610241145 is Ready
> > > > > > Tue Oct 24 12:01:28 BST 2006: received
> > > > > >
> > > > >
> > >
> TURL=gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dtea
> > > > > m/
> > > > > > br
> > > > > > ew/test-200610241145
> > > > > > copy_jobs is not empty
> > > > > > copying CopyJob, source =
> > > > > > file:////home/hephp/brew/9936870.moose.01.root
> > > > > > destination =
> > > > > >
> > > > >
> > >
> gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dteam/bre
> > > > > w/
> > > > > > te
> > > > > > st-200610241145
> > > > > > GridftpClient: memory buffer size is set to 131072
> > > > > > GridftpClient: connecting to heplnx165.pp.rl.ac.uk on port
> > > > > > 2811
> > > > > > GridftpClient: gridFTPClient tcp buffer size is set to 0
> > > > > > GridftpClient: gridFTPWrite started, source file is
> > > > > > java.io.RandomAccessFile@12aea3e destination path is
> > > > > > /pnfs/pp.rl.ac.uk/data/dteam/brew/test-200610241145
> > > > > > GridftpClient: gridFTPWrite started, destination path is
> > > > > > /pnfs/pp.rl.ac.uk/data/dteam/brew/test-200610241145
> > > > > > GridftpClient: set local data channel authentication
> > > mode to None
> > > > > > GridftpClient: parallelism: 10
> > > > > > GridftpClient: adler 32 for file
> > > java.io.RandomAccessFile@12aea3e
> > > > > > is
> > > > > > 2932796037
> > > > > > GridftpClient: waiting for completion of transfer
> > > > > > GridftpClient: gridFtpWrite: starting the transfer
> in emode to
> > > > > > /pnfs/pp.rl.ac.uk/data/dteam/brew/test-200610241145
> > > > > > GridftpClient: DiskDataSink.close() called
> > > > > > GridftpClient: gridFTPWrite() wrote 13033805bytes
> > > > > > GridftpClient: closing client :
> > > > > > org.dcache.srm.util.GridftpClient$FnalGridFTPClient@141fab6
> > > > > > GridftpClient: closed client
> > > > > > execution of CopyJob, source =
> > > > > > file:////home/hephp/brew/9936870.moose.01.root destination =
> > > > > >
> > > > >
> > >
> gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dteam/bre
> > > > > w/
> > > > > > te
> > > > > > st-200610241145 completed
> > > > > > setting file request -2146176533 status to Done
> > > copy_jobs is empty
> > > > > > stopping copier
> > > > > >
> > > > > > I can't find much in the logs and have restarted
> > > everything I can
> > > > > > think of without fixing the problem.
> > > > > >
> > > > > > This is with 1.6.6
> > > > > >
> > > > > > Anyone got any ideas?
> > > > > >
> > > > > > Thanks,
> > > > > > Chris.
> > > > > >
> > > > >
> > > >
> > >
> > > --
> > > ==============================================================
> > > ==========
> > > Dr Greig A Cowan
> > > http://www.ph.ed.ac.uk/~gcowan1
> > > School of Physics, University of Edinburgh, James Clerk Maxwell
> > > Building
> > >
> > > TIER-2 STORAGE SUPPORT PAGES:
> > > http://wiki.gridpp.ac.uk/wiki/Grid_Storage
> > > ==============================================================
> > > ==========
> > >
> >
>
> --
> ==============================================================
> ==========
> Dr Greig A Cowan
> http://www.ph.ed.ac.uk/~gcowan1
> School of Physics, University of Edinburgh, James Clerk
> Maxwell Building
>
> TIER-2 STORAGE SUPPORT PAGES:
> http://wiki.gridpp.ac.uk/wiki/Grid_Storage
> ==============================================================
> ==========
>
|