Hi Zvi,
> 1. My simple program has OutputSandbox . Here it is
>
> [
> Executable = "/bin/hostname";
> StdOutput = "myjob.out";
> StdError = "myjob.err";
> OutputSandbox = {"myjob.out", "myjob.err"};
> OutputSandboxBaseDestURI = "gsiftp://localhost";
> ]
That is a JDL file for CREAM; you cannot simply reuse it for a WMS job!
The JDL syntaxes for CREAM and for the WMS are deliberately very similar,
but they are not fully interchangeable. The problem is the last attribute,
OutputSandboxBaseDestURI: _CREAM_ recognizes the value "gsiftp://localhost"
as meaning that the output sandbox is to be stored on the CREAM CE, whereas
the _WMS_ takes it literally, i.e. the job tries to store the output files
on localhost (the worker node). That fails because there is no GridFTP
server there, and it is not what you want anyway.
This is a valid WMS JDL example:
----------------------------------------------------------------------
Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hello.out";
StdError = "hello.err";
InputSandbox = {"/etc/group"};
OutputSandbox = {"hello.out","hello.err"};
RetryCount = 0;
ShallowRetryCount = 0;
----------------------------------------------------------------------
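
To catch this mix-up before submitting, a small check along these lines could
flag the CREAM-style value in a JDL file meant for the WMS. This is only a
sketch: the function name is hypothetical (it is not part of any gLite tool),
and it assumes the attribute is written on one line in the form shown in your
JDL.

```python
import re

# The value that CREAM treats specially but the WMS takes literally.
CREAM_STYLE_URI = "gsiftp://localhost"


def find_cream_style_sandbox_uri(jdl_text):
    """Hypothetical helper: return the OutputSandboxBaseDestURI value if it
    is the CREAM-style "gsiftp://localhost", otherwise None."""
    match = re.search(
        r'^\s*OutputSandboxBaseDestURI\s*=\s*"([^"]*)"\s*;',
        jdl_text,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    if match and match.group(1).rstrip("/") == CREAM_STYLE_URI:
        return match.group(1)
    return None


cream_jdl = '''[
Executable = "/bin/hostname";
StdOutput = "myjob.out";
StdError = "myjob.err";
OutputSandbox = {"myjob.out", "myjob.err"};
OutputSandboxBaseDestURI = "gsiftp://localhost";
]'''

wms_jdl = '''Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hello.out";
StdError = "hello.err";
InputSandbox = {"/etc/group"};
OutputSandbox = {"hello.out","hello.err"};
RetryCount = 0;
ShallowRetryCount = 0;'''

for name, text in (("cream_jdl", cream_jdl), ("wms_jdl", wms_jdl)):
    value = find_cream_style_sandbox_uri(text)
    if value is None:
        print(name + ": no CREAM-style OutputSandboxBaseDestURI found")
    else:
        print(name + ": found " + value + ", remove it for a WMS job")
```

The check is deliberately narrow: it only looks for the one attribute value
discussed above and says nothing about the rest of the JDL.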
> 2. If I import the WMS CA issuer certificates to the Worker nodes the situation is much the same - cannot get a result
That you do not get any output is explained by the JDL.
If the necessary CA files are present on the worker node,
you should at least no longer get the "Cannot take token" error.
> ( The CE issuer certificates are already in the worker nodes /etc/grid-security/certificates).
OK.
> BTW: I have SE defined in the CE (it is on the nfs ). WMS sees the same SE
What do you mean by that? An SE is a separate grid service that should have
no connection to a WMS. As far as the CE is concerned, the info system
(BDII) should publish that the CE and the SE are close to each other (at the
same site), but they should not share anything. The WMS could make use of an
NFS server, e.g. for its sandbox area, but that is a different story.
> [...]
>
> looking at WMS server : /var/log/ams/ice.log shows:
> ------------------------------------------------------------------
>
> 2013-06-21 23:51:28,802 ERROR - IceLBContext::setLoggingJob - Unable to set logging job to jobid=[https://wms-ce.haifa.il.ibm.com:9000/v2oUoNRUPM13Wn01Yofklw]. Proxy file [] does not exist. Trying to use the host proxy cert, and hoping for the best...
Let's see if that also happens for jobs with more reasonable JDL files.
> At /var/log/wms/httpd-wmproxy-errors.log:
> ------------------------------------------------------
> [Sat Jun 22 00:00:51 2013] [error] Certificate Verification: Error (24): invalid CA certificate
> [Sat Jun 22 00:00:51 2013] [error] Certificate Verification: Error (26): unsupported certificate purpose
Do not worry about those errors in _that_ logfile: we also get them...
> at this stage :
> ------------------
>
>
> [dubi@ui ~]$ glite-wms-job-logging-info -v 2 https://wms-ce.haifa.il.ibm.com:9000/WrjftBvwBu0XIZqebhnxng
>
> [...]
> ---
> Event: Running
> - Arrived = Fri Jun 21 23:55:33 2013 IDT
> - Host = wms-ce.haifa.il.ibm.com
> - Node = matlab
> - Source = LogMonitor
> - Timestamp = Fri Jun 21 23:55:33 2013 IDT
> - User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
> ---
> Event: ReallyRunning
> - Arrived = Fri Jun 21 23:55:33 2013 IDT
> - Host = wms-ce.haifa.il.ibm.com
> - Source = LogMonitor
> - Timestamp = Fri Jun 21 23:55:33 2013 IDT
> - User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
> - Wn seq = UI=000000:NS=0000000004:WM=000005:BH=0000000000:JSS=000002:LM=000002:LRMS=000000:APP=000000:LBS=000000
> ---
> Event: Done
> - Arrived = Fri Jun 21 23:55:33 2013 IDT
> - Exit code = 0
> - Host = wms-ce.haifa.il.ibm.com
> - Reason = Job Terminated Successfully
> - Source = LogMonitor
> - Status code = OK
> - Timestamp = Fri Jun 21 23:55:33 2013 IDT
> - User = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
> ---
Good, the LogMonitor got the whole picture!
The ReallyRunning state is logged after successful download of the
input sandbox (if any) and (unless disabled) the removal of the token.
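
If you want to check the event sequence for several jobs without reading the
full listing each time, something like the following could pull out just the
event names. It is a rough sketch that assumes only the "Event: <name>" line
format visible in your output; the function name is made up for illustration.

```python
def event_sequence(logging_info_output):
    """Hypothetical helper: return the event names, in order of appearance,
    from text in the format printed by glite-wms-job-logging-info."""
    events = []
    for line in logging_info_output.splitlines():
        # Tolerate mail-quoted lines ("> ...") as well as raw output.
        line = line.lstrip("> ").strip()
        if line.startswith("Event:"):
            events.append(line.split(":", 1)[1].strip())
    return events


sample = """---
Event: Running
- Source = LogMonitor
---
Event: ReallyRunning
- Source = LogMonitor
---
Event: Done
- Exit code = 0
- Reason = Job Terminated Successfully
---"""

print(event_sequence(sample))
```

For a job like yours, seeing ReallyRunning between Running and Done is the
sign that the job wrapper got past the input sandbox and token steps.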