Hi,
Thanks (I think). The relavant bit of the logs for an lcg-cr appears to
be:
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: REPLY(reset=true GReplyType=enc): <211-OK
EOF
PARALLEL
SIZE
SBUF
ERET
ESTO
211 End>
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: Tue Oct 24 14:19:55 BST 2006: FTP CMD: <enc ... >
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: Secure command: <TYPE I>
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: Tue Oct 24 14:19:55 BST 2006: FTP CMD: <TYPE I>
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: REPLY(reset=true GReplyType=enc): <200 Type set to I>
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: Tue Oct 24 14:19:55 BST 2006: FTP CMD: <enc ... >
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: Secure command: <MKD
//pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24>
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: Tue Oct 24 14:19:55 BST 2006: FTP CMD: <MKD
//pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24>
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: absolute path is "/pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24"
root is /
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: PermissionHandler:
canWrite(37401,24311,/pnfs/pp.rl.ac.uk/data/dteam/generated/2006-10-24)
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: PermissionHandler: fileCanWrite() file meta =
[drwxr-xr-x;37401;24311][c=10.23-11:58:24;m=10.24-01:38:44;a=10.24-01:38
:44]
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: PermissionHandler: fileCanWrite() file write allowed : true
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: PnfsHandler : CacheException (2) : Pnfs error : failed : File exists
10/24 14:19:55 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: REPLY(reset=true GReplyType=enc): <553 Permission denied, reason:
CacheException(rc=2;msg=Pnfs error : failed : File exists)>
10/24 14:19:57 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: AbstractFtpDoorV1::run(): _pnfsEntryIncomplete=false, _path=null
10/24 14:19:57 Cell(GFTP-heplnx204-Unknown-125@gridftp-heplnx204Domain)
: EOS encountered
Looking at the log for a successful srmcp that appears to use "MODE E"
not "TYPE I" and from then the logging looks completely different. IIRC
"MODE E" is some sort of append mode so maybe that's why srmcp works.
The real problem is why pnfs thinks the file exists even when it
doesn't.
The problem appears to be the same for all VOs and target directories so
it's unlikely to be "phantom" entries in pnfs.
I guess I'll be looking in the pnfsDomain logs now.
Thanks,
Chris.
> -----Original Message-----
> From: Greig A Cowan [mailto:[log in to unmask]]
> Sent: 24 October 2006 14:16
> To: [log in to unmask]; [log in to unmask]
> Subject: RE: lcg-cr fails with "File Exists"
>
> Chris,
>
> You can increase the level of logging in the gridftp door
> logs by increasing the number in the first line of the
> gridftpdoor.batch file
>
> set printout default 2
>
> going to 3 or 4 should give you plenty of extra information.
>
> Greig
>
> On Tue, 24 Oct 2006, Brew, CAJ (Chris) wrote:
>
> > Hi Jon,
> >
> > I don't think so. The stuff from the kpwd file is the "/ /
> /" at the
> > end of the
> >
> > login <username> ...
> >
> > lines right?
> >
> > They're all set to the same and I haven't changed anything in
> > srm.batch
> > - though I cannot see anything that looks like a user root
> in that file.
> >
> > The only log entry I can find for the initial error is int the
> > gridftpdoor log and only consists of:
> >
> > 10/24 13:25:26
> > Cell(GFTP-heplnx204-Unknown-115@gridftp-heplnx204Domain)
> > : PnfsHandler : CacheException (2) : Pnfs error : failed :
> File exists
> >
> > which isn't very informative and of course lcg-cr doesn't have a
> > proper --debug option so it's very hard to tell exactly
> what it's doing.
> >
> > The other very limited errors all seem to relate to the failed
> > srm-advisory delete. Can I turn the logging up any higher?
> >
> > Thanks,
> > Chris.
> >
> > > -----Original Message-----
> > > From: bakken [mailto:[log in to unmask]]
> > > Sent: 24 October 2006 12:25
> > > To: [log in to unmask]
> > > Subject: Re: lcg-cr fails with "File Exists"
> > >
> > > My only guess is that the extra file components gridftp and srmcp
> > > can add to the filepath. For example, the dcache.kpwd file has a
> > > user root that is added, and srmcp has a different one in the srm
> > > batch file. Many people run with these both set to /, so
> they become
> > > degenerate. The gridftp component changes depending on
> the user, so
> > > your DN can make a difference.
> > >
> > > Complete filenames should be listed in the log, but it
> may be hard
> > > to find.
> > >
> > > Any chance that's it?
> > >
> > > <R4551|0n 7h3r3'5 n0 p|4c3 |1k3 127.0.0.1
> > > Jon A. Bakken [log in to unmask] (630) 840-4790
> > >
> > > On Tue, 24 Oct 2006, Brew, CAJ (Chris) wrote:
> > >
> > > > Hi All,
> > > >
> > > > I've got a problem with dcache that I've not seen before:
> > > >
> > > > lcg-cr fails with "File Exists" even though it doesn't.
> > > >
> > > > heplnx101 - ~ $ lcg-cr --vo babar -v -d
> > > >
> > >
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/babar/brew/te
> > > st
> > > > -2
> > > > 00610241145 file://home/hephp/brew/9936870.moose.01.root
> > > > Using grid catalog type: lfc
> > > > Using grid catalog : lfcserver.cnaf.infn.it Source URL:
> > > > file://home/hephp/brew/9936870.moose.01.root
> > > > File size: 13033805
> > > > VO name: babar
> > > > Destination specified:
> > > >
> > >
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/babar/brew/te
> > > st
> > > > -2
> > > > 00610241145
> > > > Destination URL for copy:
> > > >
> > >
> gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/babar/bre
> > > w/
> > > > te
> > > > st-200610241145
> > > > # streams: 1
> > > > # set timeout to 0 seconds
> > > > Alias registered in Catalog:
> > > >
> > >
> lfn:/grid/babar/generated/2006-10-24/file-981c5a78-d7fd-4481-88d4-c8
> > > fa
> > > > 6a
> > > > a4998f
> > > > the server sent an error response: 553 553 Permission
> > > denied, reason:
> > > > CacheException(rc=2;msg=Pnfs error : failed : File exists)
> > > >
> > > > Copy Failed: Unregistering alias from catalog.
> > > > advisoryDelete(User [name=babar001, uid=37201, gid=24267,
> > > > root=/],pnfs/pp.rl.ac.uk/data/babar/brew/test-200610241145)
> > > Error file
> > > > does not exist, cannot delete
> > > >
> > > > I can however srmcp, or globus-url-copy the a file to
> the same path:
> > > >
> > > > heplnx101 - ~ $ srmcp -debug
> > > > file:////home/hephp/brew/9936870.moose.01.root
> > > >
> > >
> srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/brew/te
> > > st
> > > > -2
> > > > 00610241145
> > > > Storage Resource Manager (SRM) CP Client version 1.21 Copyright
> > > > (c)
> > > > 2002-2006 Fermi National Accelerator Laboratory
> > > >
> > > > SRM Configuration:
> > > > debug=true
> > > > gsissl=true
> > > > help=false
> > > > pushmode=false
> > > > userproxy=true
> > > > buffer_size=131072
> > > > tcp_buffer_size=0
> > > > streams_num=10
> > > > config_file=/home/hephp/brew/.srmconfig/config.xml
> > > > glue_mapfile=/opt/d-cache/srm/conf/SRMServerV1.map
> > > > webservice_path=srm/managerv1.wsdl
> > > > webservice_protocol=https
> > > > gsiftpclinet=globus-url-copy
> > > > protocols_list=http,gsiftp
> > > > save_config_file=null
> > > > srmcphome=/opt/d-cache/srm
> > > > urlcopy=/opt/d-cache/srm/sbin/url-copy.sh
> > > > x509_user_cert=/home/hephp/brew/.globus/usercert.pem
> > > > x509_user_key=/home/hephp/brew/.globus/userkey.pem
> > > > x509_user_proxy=/tmp/x509up_u24431
> > > >
> > > x509_user_trusted_certificates=/etc/grid-security/certificates
> > > > globus_tcp_port_range=null
> > > > gss_expected_name=null
> > > > storagetype=permanent
> > > > retry_num=20
> > > > retry_timeout=10000
> > > > wsdl_url=null
> > > > use_urlcopy_script=false
> > > > connect_to_wsdl=false
> > > > delegate=true
> > > > full_delegation=true
> > > > server_mode=passive
> > > > srm_protocol_version=1
> > > > request_lifetime=86400
> > > > from[0]=file:////home/hephp/brew/9936870.moose.01.root
> > > >
> > > >
> > >
> to=srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/brew
> > > /t
> > > > es
> > > > t-200610241145
> > > >
> > > > Tue Oct 24 12:01:21 BST 2006: starting SRMPutClient Tue Oct 24
> > > > 12:01:21 BST 2006: In SRMClient ExpectedName: host Tue Oct
> > > 24 12:01:21
> > > > BST 2006: SRMClient(https,srm/managerv1,true)
> > > > SRMClientV1 : user credentials are:
> > > > /C=UK/O=eScience/OU=CLRC/L=RAL/CN=chris dteam brew
> > > > SRMClientV1 : SRMClientV1 calling
> > > > org.globus.axis.util.Util.registerTransport()
> > > > SRMClientV1 : connecting to srm at
> > > > httpg://heplnx204.pp.rl.ac.uk:8443/srm/managerv1
> > > > Tue Oct 24 12:01:22 BST 2006: connected to server,
> > > obtaining proxy Tue
> > > > Oct 24 12:01:22 BST 2006: got proxy of type class
> > > > org.dcache.srm.client.SRMClientV1
> > > > Tue Oct 24 12:01:22 BST 2006: source file#0 :
> > > > /home/hephp/brew/9936870.moose.01.root
> > > > SRMClientV1 : put,
> > > >
> > >
> sources[0]="srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/d
> > > te
> > > > am
> > > > /brew/test-200610241145"
> > > > SRMClientV1 : put,
> > > >
> > >
> dests[0]="srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dte
> > > am
> > > > /b
> > > > rew/test-200610241145"
> > > > SRMClientV1 : put, protocols[0]="http"
> > > > SRMClientV1 : put, protocols[1]="dcap"
> > > > SRMClientV1 : put, protocols[2]="gsiftp"
> > > > SRMClientV1 : put, contacting service
> > > > httpg://heplnx204.pp.rl.ac.uk:8443/srm/managerv1
> > > > copy_jobs is empty
> > > > Tue Oct 24 12:01:26 BST 2006: srm returned requestId =
> > > > -2146176534 Tue Oct 24 12:01:26 BST 2006: sleeping 1 seconds ...
> > > > Tue Oct 24 12:01:28 BST 2006: FileRequestStatus with
> > > >
> > >
> SURL=srm://heplnx204.pp.rl.ac.uk:8443/pnfs/pp.rl.ac.uk/data/dteam/br
> > > ew
> > > > /t
> > > > est-200610241145 is Ready
> > > > Tue Oct 24 12:01:28 BST 2006: received
> > > >
> > >
> TURL=gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dtea
> > > m/
> > > > br
> > > > ew/test-200610241145
> > > > copy_jobs is not empty
> > > > copying CopyJob, source =
> > > > file:////home/hephp/brew/9936870.moose.01.root
> > > > destination =
> > > >
> > >
> gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dteam/bre
> > > w/
> > > > te
> > > > st-200610241145
> > > > GridftpClient: memory buffer size is set to 131072
> > > > GridftpClient: connecting to heplnx165.pp.rl.ac.uk on port 2811
> > > > GridftpClient: gridFTPClient tcp buffer size is set to 0
> > > > GridftpClient: gridFTPWrite started, source file is
> > > > java.io.RandomAccessFile@12aea3e destination path is
> > > > /pnfs/pp.rl.ac.uk/data/dteam/brew/test-200610241145
> > > > GridftpClient: gridFTPWrite started, destination path is
> > > > /pnfs/pp.rl.ac.uk/data/dteam/brew/test-200610241145
> > > > GridftpClient: set local data channel authentication
> mode to None
> > > > GridftpClient: parallelism: 10
> > > > GridftpClient: adler 32 for file
> java.io.RandomAccessFile@12aea3e
> > > > is
> > > > 2932796037
> > > > GridftpClient: waiting for completion of transfer
> > > > GridftpClient: gridFtpWrite: starting the transfer in emode to
> > > > /pnfs/pp.rl.ac.uk/data/dteam/brew/test-200610241145
> > > > GridftpClient: DiskDataSink.close() called
> > > > GridftpClient: gridFTPWrite() wrote 13033805bytes
> > > > GridftpClient: closing client :
> > > > org.dcache.srm.util.GridftpClient$FnalGridFTPClient@141fab6
> > > > GridftpClient: closed client
> > > > execution of CopyJob, source =
> > > > file:////home/hephp/brew/9936870.moose.01.root destination =
> > > >
> > >
> gsiftp://heplnx165.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/dteam/bre
> > > w/
> > > > te
> > > > st-200610241145 completed
> > > > setting file request -2146176533 status to Done
> copy_jobs is empty
> > > > stopping copier
> > > >
> > > > I can't find much in the logs and have restarted
> everything I can
> > > > think of without fixing the problem.
> > > >
> > > > This is with 1.6.6
> > > >
> > > > Anyone got any ideas?
> > > >
> > > > Thanks,
> > > > Chris.
> > > >
> > >
> >
>
> --
> ==============================================================
> ==========
> Dr Greig A Cowan
> http://www.ph.ed.ac.uk/~gcowan1
> School of Physics, University of Edinburgh, James Clerk
> Maxwell Building
>
> TIER-2 STORAGE SUPPORT PAGES:
> http://wiki.gridpp.ac.uk/wiki/Grid_Storage
> ==============================================================
> ==========
>
|