I have been experimenting with the advice in https://wiki.egi.eu/wiki/Tools/Manuals/SiteProblemsFollowUp, item TS81 (Workload management), at:
https://wiki.egi.eu/wiki/Tools/Manuals/TS81 (Error = Cannot take token)
Following TS81 (which I do not fully understand, and for which I would appreciate an explanation),
I added the following statement to my simple JDL file: ShallowRetryCount = -1
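For reference, a minimal sketch of what such a JDL file could look like, written from the shell. Only the file name ex.jdl, the /bin/hostname executable and ShallowRetryCount = -1 come from this message; the remaining attributes (StdOutput, StdError, OutputSandbox) and the helper name write_jdl are assumptions for illustration, and the real ex.jdl may differ.

```shell
#!/bin/sh
# Sketch: write a minimal JDL file that sets ShallowRetryCount = -1.
# Every attribute except Executable and ShallowRetryCount is an
# assumption about what the JDL might contain, not the author's file.
write_jdl() {
    cat > "$1" <<'EOF'
Executable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out", "std.err"};
ShallowRetryCount = -1;
EOF
}

# Usage: write_jdl ex.jdl
```

glite-wms-job-output retrieves only the files listed in OutputSandbox, so that attribute is worth checking in the real JDL.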
With this statement added, I ran the submission again:
glite-wms-job-submit -e https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server -r eladby-temp.haifa.il.ibm.com:8443/cream-pbs-kzvo -a ex.jdl
The command reported a successful submission.
(All following commands are run from the UI)
Then 'glite-wms-job-status' reports:
-----------------------------
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Current Status: Running
Status Reason: unavailable
Destination: eladby-temp.haifa.il.ibm.com:8443/cream-pbs-kzvo
Submitted: Thu Jun 20 20:26:20 2013 IDT
==========================================================================
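When repeating this check from a script, the status line can be pulled out of the command output. This is a minimal sketch that assumes the output layout shown above; the function name current_status and the JOBID placeholder are hypothetical.

```shell
#!/bin/sh
# Print the value of the first "Current Status:" line read from stdin,
# with surrounding whitespace stripped.
current_status() {
    sed -n -e 's/[[:space:]]*$//' \
           -e 's/^[[:space:]]*Current Status:[[:space:]]*//p' | head -n 1
}

# Usage (JOBID is a placeholder for the WMS job URL):
#   glite-wms-job-status "$JOBID" | current_status
```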
Then:
[dubi@ui ~]$ glite-wms-job-logging-info -v 2 https://wms-ce.haifa.il.ibm.com:9000/m6beChpPi9-8-n0UrKzDgA
===================== glite-wms-job-logging-info Success =====================
LOGGING INFORMATION:
Printing info for the Job : https://wms-ce.haifa.il.ibm.com:9000/m6beChpPi9-8-n0UrKzDgA
---
Event: RegJob
- Arrived = Thu Jun 20 20:26:20 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Jobtype = SIMPLE
- Ns = https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
- Nsubjobs = 0
- Source = NetworkServer
- Src instance = https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
- Timestamp = Thu Jun 20 20:26:20 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: Accepted
- Arrived = Thu Jun 20 20:26:20 2013 IDT
- From = UserInterface
- From host = NetworkServer
- From instance = ui.haifa.il.ibm.com
- Host = wms-ce.haifa.il.ibm.com
- Source = NetworkServer
- Src instance = https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
- Timestamp = Thu Jun 20 20:26:20 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: EnQueued
- Arrived = Thu Jun 20 20:26:20 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Queue = /var/workload_manager/jobdir
- Result = START
- Source = NetworkServer
- Src instance = https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
- Timestamp = Thu Jun 20 20:26:20 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: EnQueued
- Arrived = Thu Jun 20 20:26:20 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Queue = /var/workload_manager/jobdir
- Result = OK
- Source = NetworkServer
- Src instance = https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
- Timestamp = Thu Jun 20 20:26:20 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: DeQueued
- Arrived = Thu Jun 20 20:26:21 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Queue = /var/workload_manager/jobdir
- Source = WorkloadManager
- Src instance = 8300
- Timestamp = Thu Jun 20 20:26:21 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: Match
- Arrived = Thu Jun 20 20:26:21 2013 IDT
- Dest id = eladby-temp.haifa.il.ibm.com:8443/cream-pbs-kzvo
- Host = wms-ce.haifa.il.ibm.com
- Source = WorkloadManager
- Src instance = 8300
- Timestamp = Thu Jun 20 20:26:21 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: UserTag
- Arrived = Thu Jun 20 20:26:21 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Name = CEInfoHostName
- Source = WorkloadManager
- Src instance = 8300
- Timestamp = Thu Jun 20 20:26:21 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
- Value = eladby-temp.haifa.il.ibm.com
---
Event: EnQueued
- Arrived = Thu Jun 20 20:26:21 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Queue = /var/ice/jobdir
- Result = START
- Source = WorkloadManager
- Src instance = 8300
- Timestamp = Thu Jun 20 20:26:21 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: EnQueued
- Arrived = Thu Jun 20 20:26:21 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Queue = /var/ice/jobdir
- Result = OK
- Source = WorkloadManager
- Src instance = 8300
- Timestamp = Thu Jun 20 20:26:21 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: DeQueued
- Arrived = Thu Jun 20 20:26:21 2013 IDT
- Host = wms-ce.haifa.il.ibm.com
- Local jobid = https://wms-ce.haifa.il.ibm.com:9000/m6beChpPi9-8-n0UrKzDgA
- Queue = /var/ice/jobdir
- Source = JobController
- Timestamp = Thu Jun 20 20:26:21 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: Transfer
- Arrived = Thu Jun 20 20:26:21 2013 IDT
- Dest host = https://eladby-temp.haifa.il.ibm.com:8443/ce-cream/services/CREAM2
- Dest instance = unavailable
- Dest jobid = unavailable
- Destination = LRMS
- Host = wms-ce.haifa.il.ibm.com
- Reason = unavailable
- Result = START
- Source = LogMonitor
- Timestamp = Thu Jun 20 20:26:21 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: Running
- Arrived = Thu Jun 20 20:26:30 2013 IDT
- Host = matlab.haifa.il.ibm.com
- Node = matlab.haifa.il.ibm.com
- Source = LRMS
- Timestamp = Thu Jun 20 20:26:30 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: ReallyRunning
- Arrived = Thu Jun 20 20:26:30 2013 IDT
- Host = matlab.haifa.il.ibm.com
- Source = LRMS
- Timestamp = Thu Jun 20 20:26:30 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: Done
- Arrived = Thu Jun 20 20:26:33 2013 IDT
- Exit code = 499467184
- Host = matlab.haifa.il.ibm.com
- Reason = job completed
- Source = LRMS
- Status code = OK
- Timestamp = Thu Jun 20 20:26:32 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
---
Event: Transfer
- Arrived = Thu Jun 20 20:26:26 2013 IDT
- Dest host = https://eladby-temp.haifa.il.ibm.com:8443/ce-cream/services/CREAM2
- Dest instance = unavailable
- Dest jobid = https://eladby-temp.haifa.il.ibm.com:8443/CREAM267019922
- Destination = LRMS
- Host = wms-ce.haifa.il.ibm.com
- Reason = unavailable
- Result = OK
- Source = LogMonitor
- Timestamp = Thu Jun 20 20:26:26 2013 IDT
- User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
===================================
This seems OK.
Then I also checked the job status on the CE, using the Dest jobid above, and got back:
-----------------------------------------------------------------------------------------------
[dubi@ui ~]$ glite-ce-job-status https://eladby-temp.haifa.il.ibm.com:8443/CREAM267019922
****** JobID=[https://eladby-temp.haifa.il.ibm.com:8443/CREAM267019922]
Status = [DONE-OK]
ExitCode = [0]
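The CREAM job ID used in the check above can also be extracted from the logging-info output instead of being copied by hand. This is a sketch that assumes the "- Dest jobid = ..." line format shown in the log; the function name dest_jobids is hypothetical.

```shell
#!/bin/sh
# Print the distinct CREAM job IDs found on "Dest jobid" lines of stdin,
# skipping entries whose value is "unavailable".
dest_jobids() {
    awk -F' = ' '/Dest jobid/ {
        id = $2
        sub(/[ \t\r]+$/, "", id)        # drop trailing whitespace
        if (id != "" && id != "unavailable") print id
    }' | sort -u
}

# Usage (JOBID is a placeholder for the WMS job URL):
#   glite-wms-job-logging-info -v 2 "$JOBID" | dest_jobids
```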
Then I tried to retrieve the output through the WMS:
-------------------------------
[dubi@ui ~]$ glite-wms-job-output --dir /home/dubi/result https://wms-ce.haifa.il.ibm.com:9000/m6beChpPi9-8-n0UrKzDgA
Connecting to the service https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
================================================================================
JOB GET OUTPUT OUTCOME
No output files to be retrieved for the job:
https://wms-ce.haifa.il.ibm.com:9000/m6beChpPi9-8-n0UrKzDgA
================================================================================
So there is no output (yet), although only a trivial /bin/hostname command was submitted.
I then tried to retrieve the output directly from the CE and got:
---------------------------------------------
[dubi@ui ~]$ glite-ce-job-output https://eladby-temp.haifa.il.ibm.com:8443/CREAM267019922
2013-06-20 20:28:50,407 INFO - For JobID [https://eladby-temp.haifa.il.ibm.com:8443/CREAM267019922] output will be stored in the dir ./eladby-temp.haifa.il.ibm.com_8443_CREAM267019922
No match for *
2013-06-20 20:28:50,770 ERROR - UBERFTP ERROR OUTPUT: 220 eladby-temp.haifa.il.ibm.com GridFTP Server 6.19 (gcc64, 1359994843-83) [Globus Toolkit 5.2.3] ready.
230 User kzvo001 logged in.
Using 1 parallel data chanels for extended block transfers
Any idea why there is no output for the simple /bin/hostname?
Note that when glite-wms-job-submit is run as originally, without the 'ShallowRetryCount = -1'
statement (i.e. with the default retry count of 10), the WMS job status returns, after Running for a while:
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms-ce.haifa.il.ibm.com:9000/Lg8vRE9FoXiF0SrOAIB-BA
Current Status: Aborted
Logged Reason(s):
- Cannot take token
- Cannot take token
- Cannot take token
- Cannot take token; reason=1; Failed to init security context GSS Major Status: Authentication Failed GSS Minor Status Error Chain: globus_gsi_gssapi: SSLv3 handshake problems OpenSSL Error: s3_clnt.c:915: in library: SSL routines, function SSL3_GET_SERVER_CERTIFICATE: certificate verify failed globus_gsi_callback_module: Could not verify credential globus_gsi_callback_module: Can't get the local trusted CA certificate: Untrusted self-signed certificate in chain with hash ce630362 Failed to init security context GSS Major Status: Authentication Failed GSS Minor Status Error Chain: globus_gsi_gssapi: SSLv3 handshake problems OpenSSL Error: s3_clnt.c:915: in library: SSL routines, function SSL3_GET_SERVER_CERTIFICATE: certificate verify failed globus_gsi_callback_module: Could not verify credential globus_gsi_callback_module: Can't get the local trusted CA certificate: Untrusted self-signed certificate in chain with hash ce630362 Failed to init security context GSS Major Status: Authentication Failed GSS Minor Status Error Chain: globus_gsi_gssapi: SSLv3 handshake problems OpenSSL Error: s3_clnt.c:915: in library: SSL routines, function SSL3_GET_SERVER_CERTIFICATE: certificate verify failed globus_gsi_callback_module: Could not verify credential globus_gsi_callback_module: Can't get the local trusted CA certificate: Untrusted self-signed certificate in chain with hash ce630362 Cannot take token
This is what led me to look at TS81.
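The aborted job's log complains about an untrusted certificate "with hash ce630362". Globus looks up trusted CAs as files named <hash>.0 in the trusted certificates directory, so one possible check is whether that file exists on the host doing the verification. This is a sketch only: the directory /etc/grid-security/certificates is the conventional default and is assumed here rather than confirmed for this site, and the function name ca_present is hypothetical.

```shell
#!/bin/sh
# Report whether a CA file named <hash>.0 exists in the trusted CA directory.
# The directory is the second argument, else $X509_CERT_DIR, else the
# conventional /etc/grid-security/certificates (an assumption for this site).
ca_present() {
    hash=$1
    dir=${2:-${X509_CERT_DIR:-/etc/grid-security/certificates}}
    if [ -f "$dir/$hash.0" ]; then
        echo "found: $dir/$hash.0"
    else
        echo "missing: $dir/$hash.0"
        return 1
    fi
}

# Usage: ca_present ce630362
```

Which host to run this on is not clear from the log alone; the hosts involved in the "Cannot take token" exchange would be the natural candidates.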