Hmmmm... I just tried and for every single WMS (Imperial, RAL, Spain) I get:


Logged Reason(s):
    - Transfer to CREAM failed due to exception: Failed to create a
delegation id for job
https://wmslb02.grid.hep.ph.ic.ac.uk:9000/W5nVc7c1_9FkkXQHRpqV6g: reason is
Received NULL fault; the error is due to another cause:
FaultString=[storeLimitedDelegationProxy error
[id='13560033922E387736wms022Egrid2Ehep2Eph2Eic2Eac2Euk'; rfc=false;
dn='CN_daniela_bauer_L_Physics_OU_Imperial_O_eScience_C_UK';
localUser='t2k004'; vo='t2k.org'; startTime='12/20/12 11:31 AM (GMT)';
expirationTime='12/21/12 11:32 AM (GMT)'];: sudo: sorry, you must have a
tty to run sudo


t2k004 is my t2k ID at Lancaster.

Could you have words with your sudoers file?
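
For reference, sudo prints that message when the requiretty option is set in sudoers. A minimal sketch of the usual workaround, assuming CREAM calls sudo as the tomcat user (that account name is my assumption, so check which user actually runs sudo on your CE):

```
# /etc/sudoers (edit with visudo)
# This setting makes sudo refuse to run without a terminal:
Defaults    requiretty
# Exempt only the service account (assumed here to be "tomcat"):
Defaults:tomcat    !requiretty
```

Running `visudo -c` afterwards checks the file for syntax errors.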

Cheers,




On 20 December 2012 11:32, Matt Doidge wrote:

> Hello again,
>
>
>> I'll get back to Jon P to see if he sees any changes, along with Ewan's
>> suggestion of trying direct job submission.
>>
>
> Jon did a lot of investigating last night into the t2k problems at
> Lancaster, and found that direct job submission to our CE worked, as well
> as WMS submission from some WMSes. Others failed with the same failure mode
> (https://ggus.eu/ws/ticket_info.php?ticket=88628).
>
> The successful WMSi were:
> lcgwms02 & lcgwms03.gridpp.ac.uk at RAL
>
> The unsuccessful WMS jobs went through:
> wms01.grid.hep.ph.ic.ac.uk & wms02.grid.hep.ph.ic.ac.uk (and lcgwms04
> at RAL).
>
> The failed jobs all seemed to abort with the same long, authenticationy
> looking error message detailed previously: "error: globus_ftp_client: the
> server responded with an error500 500 .... Unable to open file ...Cannot
> move ISB".
>
> I don't know if there's anything significant that these WMS have in
> common? It could be that the gubbins that control the interaction between
> abaddon.hec.lancs.ac.uk and these WMS has got into a bad state.
>
> Any ideas appreciated!
> Thanks,
> Matt
>
>
>
>> Cheers,
>> Matt
>> P.S. Chris will be pleased that I had a bash at fixing ngs.ac.uk access
>> on our CE :-)
>>
>> On 12/19/2012 11:07 AM, Christopher J. Walker wrote:
>>
>>> On 19/12/12 10:56, Daniela Bauer wrote:
>>>
>>>> Hi Matt,
>>>>
>>>> check your CE:
>>>>
>>>> lx05:~ :~] voms-proxy-init -valid 24:00 --voms t2k.org
>>>> Enter GRID pass phrase:
>>>> Your identity: /C=UK/O=eScience/OU=Imperial/L=Physics/CN=daniela bauer
>>>> Creating temporary proxy
>>>> ..................................................................
>>>> Done
>>>> Contacting voms.gridpp.ac.uk:15003
>>>> [/C=UK/O=eScience/OU=Manchester/L=HEP/CN=voms.gridpp.ac.uk] "t2k.org" Done
>>>> Creating proxy
>>>> ............................................................
>>>> ............................................................
>>>> Done
>>>> Your proxy is valid until Thu Dec 20 10:51:18 2012
>>>>
>>>> lx05:~ :~] uberftp abaddon.hec.lancs.ac.uk
>>>> 220 abaddon.hec.lancs.ac.uk GridFTP Server 6.10 (gcc64, 1334324800-83)
>>>> [Globus Toolkit 5.2.0] ready.
>>>> 530-Login incorrect. : globus_gss_assist: Error invoking callout
>>>> 530-globus_callout_module: The callout returned an error
>>>> 530-an unknown error occurred
>>>> 530 End.
>>>>
>>>> But:
>>>> lx05:~ :~] voms-proxy-init -valid 24:00 --voms dteam
>>>> Enter GRID pass phrase:
>>>> Your identity: /C=UK/O=eScience/OU=Imperial/L=Physics/CN=daniela bauer
>>>> Creating temporary proxy ....................................... Done
>>>> Contacting voms2.hellasgrid.gr:15004
>>>> [/C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms2.hellasgrid.gr] "dteam" Failed
>>>>
>>>> Error: Error during SSL handshake:
>>>>
>>>> Trying next server for dteam.
>>>> Creating temporary proxy ....................... Done
>>>> Contacting voms.hellasgrid.gr:15004
>>>> [/C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms.hellasgrid.gr] "dteam" Done
>>>> Creating proxy .......................... Done
>>>> Your proxy is valid until Thu Dec 20 10:52:02 2012
>>>>
>>>>
>>>> lx05:~ :~] uberftp abaddon.hec.lancs.ac.uk
>>>> 220 abaddon.hec.lancs.ac.uk GridFTP Server 6.10 (gcc64, 1334324800-83)
>>>> [Globus Toolkit 5.2.0] ready.
>>>> 230 User dteam167 logged in.
>>>>
>>>> The content of vomsdir looks fine to me, but obviously I can't see any
>>>> of your more subtle configuration issues (are you using Argus?)
>>>>
>>>>
>>> Trying as ngs.ac.uk, that doesn't work either (and you might as well fix
>>> it while you are fiddling)
>>>
>>> walker@heppc300:~/grid/ngs$ uberftp abaddon.hec.lancs.ac.uk
>>> 220 abaddon.hec.lancs.ac.uk GridFTP Server 6.10 (gcc64, 1334324800-83)
>>> [Globus Toolkit 5.2.0] ready.
>>> 530-Login incorrect. : globus_gss_assist: Error invoking callout
>>> 530-globus_callout_module: The callout returned an error
>>> 530-an unknown error occurred
>>> 530 End.
>>> walker@heppc300:~/grid/ngs$ voms-proxy-info --all
>>> subject : /C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=christopher walker/CN=proxy
>>> issuer : /C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=christopher walker
>>> identity : /C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=christopher walker
>>> type : proxy
>>> strength : 1024 bits
>>> path : /tmp/x509up_u32184
>>> timeleft : 11:31:39
>>> === VO ngs.ac.uk extension information ===
>>> VO : ngs.ac.uk
>>> subject : /C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=christopher walker
>>> issuer : /C=UK/O=eScience/OU=Manchester/L=HEP/CN=voms.gridpp.ac.uk
>>> attribute : /ngs.ac.uk/Role=NULL/Capability=NULL
>>> timeleft : 11:31:39
>>> uri : voms.gridpp.ac.uk:15010
>>>
>>>
>>> Looking at that machine of yours:
>>>
>>> walker@heppc300:~/grid/ngs$ uberftp abaddon.hec.lancs.ac.uk
>>> 220 abaddon.hec.lancs.ac.uk GridFTP Server 6.10 (gcc64, 1334324800-83)
>>> [Globus Toolkit 5.2.0] ready.
>>> 230 User dteam116 logged in.
>>> uberftp> cd /etc/grid-security/vomsdir/ngs.ac.uk
>>> uberftp> ls
>>> -rw-r--r-- 1 root root 64 Sep 13 13:00 15010.lsc
>>> -rw-r--r-- 1 root root 148 Sep 13 13:25 voms.ngs.ac.uk.lsc
>>> uberftp> cat 15010.lsc
>>> ngs.ac.uk
>>> /C=UK/O=eScienceCA/OU=Authority/CN=UK e-Science CA 2B
>>>
>>>
>>> I suspect you are missing a field in your site-info.def (or have an
>>> extra one) for the ngs VO.
>>>
>>> No, this doesn't help with t2k.org I'm afraid.
>>>
>>> Chris
>>>
>>>> Cheers,
>>>> Daniela
>>>>
>>>>
>>>> On 19 December 2012 10:47, Matt Doidge wrote:
>>>>
>>>> Hello all, I hope I caught some of you before you headed off for the
>>>> holidays!
>>>>
>>>> Lancaster has been trying to get T2K working on our clusters, and on
>>>> our occasionally quirky shared cluster T2K are consistently failing
>>>> to successfully submit jobs via the WMS (well technically submission
>>>> works, the jobs get aborted), with an incredibly verbose error
>>>> message (replicated below, you can also see it in the ticket
>>>> https://ggus.eu/ws/ticket_info.php?ticket=88628).
>>>>
>>>> The error message looks like either an authentication, permissions
>>>> or missing destination problem - but I've checked our CE and
>>>> everything seems okay. As a test I asked Jon to uberftp into our CE,
>>>> and he did so without problem as an sgmt2k user.
>>>>
>>>> I'm a little stuck, and would appreciate someone who speaks
>>>> glite-wms-job-status error message to take a look and maybe pinpoint
>>>> where in the chain things are breaking. I've learnt the hard way
>>>> that the CREAM/WMS interaction is quite complex, and I'm wondering
>>>> if this is one of the cases where this has screwed up (the two have
>>>> become "out of sync" somehow).
>>>>
>>>> Thanks in advance, and Merry Christmas!
>>>> Matt
>>>>
>>>> ======================= glite-wms-job-status Success =====================
>>>> BOOKKEEPING INFORMATION:
>>>>
>>>> Status info for the Job :
>>>> https://lcglb04.gridpp.rl.ac.uk:9000/31MHBsdtFOj7AFEY7rs-lg
>>>> Current Status: Aborted
>>>> Logged Reason(s):
>>>> - Cannot move ISB (retry_copy ${globus_transfer_cmd}
>>>> gsiftp://lcgwms02.gridpp.rl.ac.uk:2811/var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py
>>>> file:///home/grid/sgmt2k005/home_cream_174147165/CREAM174147165/pexpect.py):
>>>> error: globus_ftp_client: the server responded with an error500
>>>> 500-Command failed. : globus_l_gfs_file_open failed.500-globus_xio:
>>>> Unable to open file
>>>> /var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py500-globus_xio:
>>>> System error in open: No such file or directory500-globus_xio: A
>>>> system call failed: No such file or directory500 End.; reason=1;
>>>> open /home/grid/sgmt2k005/home_cream_174147165/.ssh/id_rsa failed:
>>>> No such file or directory.
>>>> /usr/shared_apps/admin/etc/profile.d/keygen2: line 17:
>>>> /home/grid/sgmt2k005/home_cream_174147165/.ssh/authorized_keys: No
>>>> such file or directory
>>>> chmod: cannot access
>>>> `/home/grid/sgmt2k005/home_cream_174147165/.ssh/authorized_keys': No
>>>> such file or directory
>>>> /opt/glite/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB
>>>> server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent():
>>>> LB server (bkserver,lbproxy) store protocol error;; Logging library
>>>> ERROR: LB server (bkserver,lbproxy) store protocol error;;
>>>> edg_wll_DoLogEvent(): edg_wll_log_connect error DNS resolver error;;
>>>> edg_wll_gss_connect();; GSS Error: Unknown host)
>>>> /opt/glite/glite/bin/glite-lb-logevent: edg_wll_LogEvent*(): LB
>>>> server (bkserver,lbproxy) store protocol error (edg_wll_LogEvent():
>>>> LB server (bkserver,lbproxy) store protocol error;; Logging library
>>>> ERROR: LB server (bkserver,lbproxy) store protocol error;;
>>>> edg_wll_DoLogEvent(): edg_wll_log_connect error DNS resolver error;;
>>>> edg_wll_gss_connect();; GSS Error: Unknown host) Cannot move ISB
>>>> (retry_copy ${globus_transfer_cmd}
>>>> gsiftp://lcgwms02.gridpp.rl.ac.uk:2811/var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py
>>>> file:///home/grid/sgmt2k005/home_cream_174147165/CREAM174147165/pexpect.py):
>>>> error: globus_ftp_client: the server responded with an error 500
>>>> 500-Command failed. : globus_l_gfs_file_open failed. 500-globus_xio:
>>>> Unable to open file
>>>> /var/SandboxDir/31/https_3a_2f_2flcglb04.gridpp.rl.ac.uk_3a9000_2f31MHBsdtFOj7AFEY7rs-lg/input/pexpect.py
>>>> 500-globus_xio: System error in open: No such file or directory
>>>> 500-globus_xio: A system call failed: No such file or directory
>>>> 500 End.
>>>> - Transfer to CREAM failed due to exception: CREAM Register raised
>>>> std::exception
>>>> N5glite2ce16cream_client_api16cream_exceptions30JobSubmissionDisabledExceptionE
>>>> Status Reason: hit job shallow retry count (10)
>>>> Destination: abaddon.hec.lancs.ac.uk:8443/cream-lsf-hex
>>>> Submitted: Wed Dec 5 15:36:25 2012 GMT
>>>> ==========================================================================
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from the pit of despair
>>>>
>>>> -----------------------------------------------------------
>>>> [log in to unmask]
>>>> HEP Group/Physics Dep
>>>> Imperial College
>>>> Tel: +44-(0)20-75947810
>>>> http://www.hep.ph.ic.ac.uk/~dbauer/
>>>>
>>>


-- 
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/