Hi all,
We are running LCG-2.6.0 (we upgraded from LCG-2.4.0 on the 5th of
September) and we encountered a problem with our RB:
A user, mapped to the pool account biomed007, submitted a job through our
RB on 01 Sep, and the job completed successfully on 02 Sep. When he
tried to retrieve the job output today, he got a '550 550 not a plain
file.' error message:
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://rb.isabella.grnet.gr:9000/N4aoruFXKRJ7PjAsBtPO_g
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: fal-pygrid-18.lancs.ac.uk:2119/jobmanager-lcgpbs-biomed
reached on: Fri Sep 2 02:58:08 2005
*************************************************************
edg-job-get-output --dir autodock_1lee_h3_lga_200001-499918-chembridge https://rb.isabella.grnet.gr:9000/N4aoruFXKRJ7PjAsBtPO_g
Retrieving files from host: rb.isabella.grnet.gr ( for https://rb.isabella.grnet.gr:9000/N4aoruFXKRJ7PjAsBtPO_g )
error: the server sent an error response: 550 550 /var/edgwl/SandboxDir/N4/https_3a_2f_2frb.isabella.grnet.gr_3a9000_2fN4aoruFXKRJ7PjAsBtPO_5fg/output/autodock307.out: not a plain file.
Looking at the sandbox dir of his job, we discovered that the 'input'
and 'output' directories, and their contents, were owned not by
'biomed007' but by 'edguser' instead.
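For anyone who wants to repeat this ownership check over a whole sandbox tree rather than by eye, here is a minimal Python sketch. It is not part of the LCG tooling; the function name `entries_not_owned_by` is hypothetical, and it assumes the script runs on the RB itself with read access to the sandbox.

```python
import os


def entries_not_owned_by(sandbox_dir, expected_uid):
    """Return the paths in sandbox_dir (itself included) not owned by expected_uid."""
    candidates = [sandbox_dir]
    # os.walk does not descend into symlinked directories by default
    for dirpath, dirnames, filenames in os.walk(sandbox_dir):
        candidates.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
    # lstat reports a symlink itself rather than its target
    return sorted(p for p in candidates if os.lstat(p).st_uid != expected_uid)
```

It could be called, for example, as `entries_not_owned_by(job_dir, pwd.getpwnam("biomed007").pw_uid)`, where `job_dir` is the job's directory under /var/edgwl/SandboxDir; an empty list would mean every entry still belongs to the pool account.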
According to /var/log/lcg-expiregridmapdir.log, this pool
account hadn't expired recently (and in any case, we have an expiration
time so long that this shouldn't be the problem).
Using 'stat' to check the modification and change times of the directories, we see the
following:
[root@rb https_3a_2f_2frb.isabella.grnet.gr_3a9000_2fNaq45o_5fJdEaWokaMoz9oEQ]# ls -l
total 20
-rw-rw---- 1 edguser edguser 20 Jul 9 23:21 Maradona.output
drwxrwx--- 2 edguser edguser 4096 Jul 9 22:00 input
drwxrwx--- 2 edguser edguser 4096 Jul 9 23:21 output
-rw------- 1 edguser edguser 4120 Jul 9 22:00 user.proxy
[root@rb https_3a_2f_2frb.isabella.grnet.gr_3a9000_2fNaq45o_5fJdEaWokaMoz9oEQ]# stat input output
File: `input'
Size: 4096 Blocks: 8 IO Block: 4096 Directory
Device: 803h/2051d Inode: 3686472 Links: 2
Access: (0770/drwxrwx---) Uid: ( 995/ edguser) Gid: ( 995/ edguser)
Access: 2005-09-08 04:02:16.000000000 +0300
Modify: 2005-07-09 22:00:57.000000000 +0300
Change: 2005-09-05 15:59:23.000000000 +0300
File: `output'
Size: 4096 Blocks: 8 IO Block: 4096 Directory
Device: 803h/2051d Inode: 3686473 Links: 2
Access: (0770/drwxrwx---) Uid: ( 995/ edguser) Gid: ( 995/ edguser)
Access: 2005-09-08 04:02:16.000000000 +0300
Modify: 2005-07-09 23:21:49.000000000 +0300
Change: 2005-09-05 15:59:23.000000000 +0300
It might be a coincidence, but the ctimes of the 'input' and 'output'
directories correspond roughly to the time we upgraded our RB to
LCG-2.6.0, using YAIM.
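A ctime that is newer than the mtime is what one would expect from a metadata-only change such as chown or chmod, since those update the inode change time without touching the modification time. To see how many sandbox entries were changed around a given moment, a rough Python sketch follows. The function name `changed_within` and the one-hour default window are assumptions made for illustration, not anything taken from LCG or YAIM.

```python
import os
from datetime import datetime, timedelta


def changed_within(paths, around, window=timedelta(hours=1)):
    """Return (path, ctime) pairs whose inode change time is within `window` of `around`."""
    hits = []
    for path in paths:
        # st_ctime is the inode change time; it is converted to local time here
        ctime = datetime.fromtimestamp(os.lstat(path).st_ctime)
        if abs(ctime - around) <= window:
            hits.append((path, ctime))
    return hits
```

For instance, `changed_within(["input", "output"], datetime(2005, 9, 5, 15, 59, 23))`, run from inside a job's sandbox directory with the timestamp given in the RB's local time, would list the directories whose metadata changed within an hour of that moment.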
Our question is how and why this happened. Any thoughts?
Thanks...
--
Kyriakos Ginis, PhD Candidate
Software Engineering Laboratory
National Technical University of Athens