Hello all, I hope I caught some of you before you headed off for the
holidays!
Lancaster has been trying to get T2K working on our clusters, and on our
occasionally quirky shared cluster T2K jobs submitted via the WMS are
consistently failing (technically the submission works, but the jobs
then get aborted) with an incredibly verbose error message (replicated
below; you can also see it in the ticket
https://ggus.eu/ws/ticket_info.php?ticket=88628).
The error message looks like an authentication, permissions or
missing-destination problem, but I've checked our CE and everything
seems okay. As a test I asked Jon to uberftp into our CE, and he did so
without problem as an sgmt2k user.
I'm a little stuck, and would appreciate it if someone who speaks
glite-wms-job-status error message could take a look and maybe pinpoint
where in the chain things are breaking. I've learnt the hard way that
the CREAM/WMS interaction is quite complex, and I'm wondering if this is
one of those cases where it has screwed up (the two have become "out of
sync" somehow).
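In case it helps with reading the output below: in the first copy of the
gsiftp error the "500-" reply lines are run together. Here is a rough
Python sketch (not part of any gLite tooling; the function name
split_ftp_reply is made up for illustration) that puts them back on
separate lines. It assumes every continuation line starts with "500-"
and that the reply ends with "500 End".

```python
import re


def split_ftp_reply(text):
    """Split a run-together FTP 500 reply before each '500-' / '500 End' marker."""
    # Splitting on a zero-width lookahead needs Python 3.7 or later.
    parts = re.split(r"(?=500-|500 End)", text)
    # Drop empty pieces and surrounding whitespace.
    return [p.strip() for p in parts if p.strip()]


# First copy of the transfer error from the job status output,
# with the mail line wrapping joined back up.
raw = (
    "error: globus_ftp_client: the server responded with an error500 "
    "500-Command failed. : globus_l_gfs_file_open failed.500-globus_xio: "
    "Unable to open file "
    "/var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py500-globus_xio: "
    "System error in open: No such file or directory500-globus_xio: A system "
    "call failed: No such file or directory500 End.; reason=1"
)

for line in split_ftp_reply(raw):
    print(line)
```

That prints one reply line per row, which is a little easier on the eyes
than the original.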
Thanks in advance, and Merry Christmas!
Matt
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job :
https://lcglb04.gridpp.rl.ac.uk:9000/31MHBsdtFOj7AFEY7rs-lg
Current Status: Aborted
Logged Reason(s):
- Cannot move ISB (retry_copy ${globus_transfer_cmd}
gsiftp://lcgwms02.gridpp.rl.ac.uk:2811/var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py
file:///home/grid/sgmt2k005/home_cream_174147165/CREAM174147165/pexpect.py):
error: globus_ftp_client: the server responded with an error500
500-Command failed. : globus_l_gfs_file_open failed.500-globus_xio:
Unable to open file
/var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py500-globus_xio:
System error in open: No such file or directory500-globus_xio: A system
call failed: No such file or directory500 End.; reason=1; open
/home/grid/sgmt2k005/home_cream_174147165/.ssh/id_rsa failed: No such
file or directory. /usr/shared_apps/admin/etc/profile.d/keygen2: line
17: /home/grid/sgmt2k005/home_cream_174147165/.ssh/authorized_keys: No
such file or directory chmod: cannot access
`/home/grid/sgmt2k005/home_cream_174147165/.ssh/authorized_keys': No
such file or directory /opt/glite/glite/bin/glite-lb-logevent:
edg_wll_LogEvent*(): LB server (bkserver,lbproxy) store protocol error
(edg_wll_LogEvent(): LB server (bkserver,lbproxy) store protocol error;;
Logging library ERROR: LB server (bkserver,lbproxy) store protocol
error;; edg_wll_DoLogEvent(): edg_wll_log_connect error DNS resolver
error;; edg_wll_gss_connect();; GSS Error: Unknown host)
/opt/glite/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB server
(bkserver,lbproxy) store protocol error (edg_wll_LogEvent(): LB server
(bkserver,lbproxy) store protocol error;; Logging library ERROR: LB
server (bkserver,lbproxy) store protocol error;; edg_wll_DoLogEvent():
edg_wll_log_connect error DNS resolver error;; edg_wll_gss_connect();;
GSS Error: Unknown host) Cannot move ISB (retry_copy
${globus_transfer_cmd}
gsiftp://lcgwms02.gridpp.rl.ac.uk:2811/var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py
file:///home/grid/sgmt2k005/home_cream_174147165/CREAM174147165/pexpect.py):
error: globus_ftp_client: the server responded with an error 500
500-Command failed. : globus_l_gfs_file_open failed. 500-globus_xio:
Unable to open file
/var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py
500-globus_xio: System error in open: No such file or directory
500-globus_xio: A system call failed: No such file or directory 500 End.
- Transfer to CREAM failed due to exception: CREAM Register raised
std::exception
N5glite2ce16cream_client_api16cream_exceptions30JobSubmissionDisabledExceptionE
Status Reason: hit job shallow retry count (10)
Destination: abaddon.hec.lancs.ac.uk:8443/cream-lsf-hex
Submitted: Wed Dec 5 15:36:25 2012 GMT
=========================================================================