I will check whether cron is involved in the file permission changes at the CE.
Now another issue, related to submitting a job through the WMS:
---------------------------------------------------------------------------
glite-wms-job-submit fails on match making
Detailed description:
Our WMS is configured, and we can now submit jobs from the UI when we specify the CE with the -r option. The job runs fine and returns its result.
But when we let the WMS choose the CE (we have only one CE at our site) by omitting the -r option, e.g.:
glite-wms-job-submit -a <jdl file>
the job is submitted successfully, but its Current Status (when queried with glite-wms-job-status) is always Waiting, e.g.:
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms-ce.haifa.il.ibm.com:9000/zPgxvyWDC9TZ6E9Nekw8bw
Current Status: Waiting
Status Reason: BrokerHelper: no compatible resources
Submitted: Mon Jun 24 17:23:21 2013 IDT
==========================================================================
Also, when we run glite-wms-job-list-match we get:
[dubi@ui ~]$ glite-wms-job-list-match -a <jdl file>
Connecting to the service https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
==================== glite-wms-job-list-match failure ====================
No Computing Element matching your job requirements has been found!
==========================================================
We consulted https://wiki.egi.eu/wiki/Tools/Manuals/TS53 and think that none of the Diagnosis items applies to us (or we do not understand them well).
The CE name was propagated to the top BDII (ngi-il-bdii1.isragrid.org.il).
We verified this with: ldapsearch -LLL -x -h ngi-il-bdii1.isragrid.org.il -p 2170 -b o=grid | grep eladby | grep CE
which shows lines like the following (eladby-temp.haifa.il.ibm.com is the hostname of the CE):
dn: GlueCEUniqueID=eladby-temp.haifa.il.ibm.com:8443/cream-pbs-my.test-queue,M
GlueCEInfoHostName: eladby-temp.haifa.il.ibm.com
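The grep above only confirms that the CE name is published; it does not show the other attributes the WMS looks at. A rough Python sketch of a fuller check follows. It is our own helper, not part of any gLite tool: it parses `ldapsearch -LLL` output (LDIF) into dictionaries so that attributes such as GlueCEStateStatus and GlueCEAccessControlBaseRule can be inspected per entry. The function name `parse_ldif` and the sample entry (host ce.example.org, VO myvo.example.org) are made up for illustration and are not taken from our BDII.

```python
import base64


def parse_ldif(text):
    """Parse `ldapsearch -LLL` output into a list of {attribute: [values]} dicts."""
    # LDIF wraps long lines; every continuation line starts with a single space.
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    entries, current = [], {}
    for line in lines:
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # comment line
        key, _, value = line.partition(":")
        if value.startswith(":"):
            # "attr:: ..." marks a base64-encoded value.
            value = base64.b64decode(value[1:].strip()).decode("utf-8", "replace")
        else:
            value = value.strip()
        current.setdefault(key, []).append(value)
    if current:
        entries.append(current)
    return entries


# Hypothetical sample entry, NOT real output from our site.
SAMPLE = """\
dn: GlueCEUniqueID=ce.example.org:8443/cream-pbs-queue,Mds-Vo-name=EXAMPLE,Mds
 -Vo-name=local,o=grid
GlueCEInfoHostName: ce.example.org
GlueCEStateStatus: Production
GlueCEAccessControlBaseRule: VO:myvo.example.org
GlueCEAccessControlBaseRule: VO:dteam

"""

if __name__ == "__main__":
    for entry in parse_ldif(SAMPLE):
        print(entry["dn"][0])
        print("  status:", entry.get("GlueCEStateStatus"))
        print("  ACBR:  ", entry.get("GlueCEAccessControlBaseRule"))
```

The same function could be fed the full output of the ldapsearch command above (without the grep filters), so that wrapped dn lines are rejoined instead of being cut off.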
Any advice?
In fact, the following status is reported for the submitted job when we ask for a higher verbosity level (the WMS hostname is wms-ce.haifa.il.ibm.com):
[dubi@ui ~]$ glite-wms-job-status --verbosity 3 https://wms-ce.haifa.il.ibm.com:9000/zPgxvyWDC9TZ6E9Nekw8bw
======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms-ce.haifa.il.ibm.com:9000/zPgxvyWDC9TZ6E9Nekw8bw
Current Status: Waiting
Status Reason: BrokerHelper: no compatible resources
Submitted: Mon Jun 24 17:23:21 2013 IDT
---
- Cancelling = 0
- Children num = 0
- Condor job exit status = 0
- Condor job pid = 0
- Condor shadow exit status = 0
- Condor shadow pid = 0
- Condor starter exit status = 0
- Condor starter pid = 0
- Cputime = -1
- Done code = -1
- Expectupdate = 0
- Jobtype = 0
- Lastupdatetime = Mon Jun 24 19:23:22 2013 IDT
- Location = WorkloadManager/wms-ce.haifa.il.ibm.com/1681
- Network server = https://wms-ce.haifa.il.ibm.com:7443/glite_wms_wmproxy_server
- Owner = /DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]
- Payload running = 0
- Pbs exit status = 0
- Pbs pid = 0
- Resubmitted = 0
- Stateentertime = Mon Jun 24 17:23:21 2013 IDT
- Subjob failed = 0
- Suspended = 0
- Remove from proxy = 0
- Ui host = NetworkServer
- Sandbox retrieved = 0
- Jw status = 0
- Cream cancelling = 0
- Cream cpu time = 0
- Cream done code = -1
- Cream exit code = -1
- Cream jw status = -1
- Cream state = -1
- Ft sandbox type = -1
---
- Children hist = 0
Undefined=0
Submitted=0
Waiting=0
Ready=0
Scheduled=0
Running=0
Done=0
Cleared=0
Aborted=0
Cancelled=0
Unknown=0
Purged=0
- Jdl =
[
OutputSandboxPath = "/var/SandboxDir/zP/https_3a_2f_2fwms-ce.haifa.il.ibm.com_3a9000_2fzPgxvyWDC9TZ6E9Nekw8bw/output";
StdOutput = "hello.out";
ShallowRetryCount = 0;
SignificantAttributes = { "Requirements","Rank","FuzzyRank" };
RetryCount = 0;
Executable = "/bin/hostname";
Type = "job";
LB_sequence_code = "UI=000000:NS=0000000004:WM=000000:BH=0000000000:JSS=000000:LM=000000:LRMS=000000:APP=000000:LBS=000000";
AllowZippedISB = true;
VirtualOrganisation = "kzvo.isragrid.org.il";
JobType = "normal";
DefaultRank = -other.GlueCEStateEstimatedResponseTime;
OutputSandboxDestURI = { "gsiftp://wms-ce.haifa.il.ibm.com:2811/var/SandboxDir/zP/https_3a_2f_2fwms-ce.haifa.il.ibm.com_3a9000_2fzPgxvyWDC9TZ6E9Nekw8bw/output/hello.out","gsiftp://wms-ce.haifa.il.ibm.com:2811/var/SandboxDir/zP/https_3a_2f_2fwms-ce.haifa.il.ibm.com_3a9000_2fzPgxvyWDC9TZ6E9Nekw8bw/output/hello.err" };
OutputSandbox = { "hello.out","hello.err" };
edg_jobid = "https://wms-ce.haifa.il.ibm.com:9000/zPgxvyWDC9TZ6E9Nekw8bw";
VOMS_FQAN = "/kzvo.isragrid.org.il/Role=NULL/Capability=NULL";
CertificateSubject = "/DC=org/DC=terena/DC=tcs/C=IL/O=IUCC/CN=Zvi Dubitzki [log in to unmask]";
StdError = "hello.err";
InputSandboxPath = "/var/SandboxDir/zP/https_3a_2f_2fwms-ce.haifa.il.ibm.com_3a9000_2fzPgxvyWDC9TZ6E9Nekw8bw/input";
rank = -other.GlueCEStateEstimatedResponseTime;
X509UserProxy = "/var/SandboxDir/zP/https_3a_2f_2fwms-ce.haifa.il.ibm.com_3a9000_2fzPgxvyWDC9TZ6E9Nekw8bw/user.proxy";
requirements = ( other.GlueCEStateStatus == "Production" ) && ( ( ( ShortDeadlineJob is true ? RegExp(".*sdj$",other.GlueCEUniqueID) : !RegExp(".*sdj$",other.GlueCEUniqueID) ) && ( other.GlueCEPolicyMaxTotalJobs == 0 || other.GlueCEStateTotalJobs < other.GlueCEPolicyMaxTotalJobs ) && ( EnableWmsFeedback is true ? RegExp("cream",other.GlueCEImplementationName,"i") : true ) && ( member(CertificateSubject,other.GlueCEAccessControlBaseRule) || member(strcat("VO:",VirtualOrganisation),other.GlueCEAccessControlBaseRule) || FQANmember(strcat("VOMS:",VOMS_FQAN),other.GlueCEAccessControlBaseRule) ) is true && FQANmember(strcat("DENY:",VOMS_FQAN),other.GlueCEAccessControlBaseRule) isnt true && ( IsUndefined(other.OutputSE) || member(other.OutputSE,other.GlueCESEBindGroupSEUniqueID) ) ) );
WMPInputSandboxBaseURI = "gsiftp://wms-ce.haifa.il.ibm.com:2811/var/SandboxDir/zP/https_3a_2f_2fwms-ce.haifa.il.ibm.com_3a9000_2fzPgxvyWDC9TZ6E9Nekw8bw"
]
- Stateentertimes =
Submitted : Mon Jun 24 17:23:21 2013 IDT
Waiting : Mon Jun 24 17:23:21 2013 IDT
Ready : ---
Scheduled : ---
Running : ---
Done : ---
Cleared : ---
Aborted : ---
Cancelled : ---
Unknown : ---
==========================================================================
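To narrow down which part of the requirements expression in the Jdl above rejects our CE, the main clauses can be checked one at a time against the attributes the CE publishes. The Python sketch below is only our approximation, not the WMS matchmaker: `check_requirements` is a name we made up, the FQAN checks use plain string equality (an assumption, since the real FQANmember function may match differently), the EnableWmsFeedback and OutputSE clauses are skipped because our JDL sets neither, and the CE attribute values in the example are invented.

```python
import re


def _first(ce, attr):
    """Return the first value of an attribute, or None if it is not published."""
    values = ce.get(attr, [])
    return values[0] if values else None


def _to_int(value):
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def check_requirements(ce, vo, subject=None, fqan=None, short_deadline=False):
    """Evaluate the main clauses of the default requirements for one CE.

    `ce` maps GLUE attribute names to lists of values.
    Returns a dict {clause_name: bool}; the CE can match only if all are True.
    """
    acbr = ce.get("GlueCEAccessControlBaseRule", [])
    ce_id = _first(ce, "GlueCEUniqueID") or ""
    results = {}

    # other.GlueCEStateStatus == "Production"
    results["status_is_production"] = _first(ce, "GlueCEStateStatus") == "Production"

    # Short-deadline jobs need a CE ID ending in "sdj"; other jobs must avoid it.
    ends_sdj = re.search(r"sdj$", ce_id) is not None
    results["sdj_suffix_ok"] = ends_sdj if short_deadline else not ends_sdj

    # GlueCEPolicyMaxTotalJobs == 0 || GlueCEStateTotalJobs < GlueCEPolicyMaxTotalJobs
    max_total = _to_int(_first(ce, "GlueCEPolicyMaxTotalJobs"))
    total = _to_int(_first(ce, "GlueCEStateTotalJobs"))
    results["job_slots_ok"] = max_total == 0 or (
        max_total is not None and total is not None and total < max_total
    )

    # Subject, VO or FQAN must appear in the access control base rules.
    # ASSUMPTION: exact string comparison stands in for FQANmember.
    results["acl_allows_user"] = (
        (subject is not None and subject in acbr)
        or f"VO:{vo}" in acbr
        or (fqan is not None and f"VOMS:{fqan}" in acbr)
    )
    results["acl_not_denied"] = not (fqan is not None and f"DENY:{fqan}" in acbr)
    return results


if __name__ == "__main__":
    # Invented attribute values for illustration only.
    ce = {
        "GlueCEUniqueID": ["ce.example.org:8443/cream-pbs-queue"],
        "GlueCEStateStatus": ["Production"],
        "GlueCEPolicyMaxTotalJobs": ["0"],
        "GlueCEStateTotalJobs": ["0"],
        "GlueCEAccessControlBaseRule": ["VO:dteam"],
    }
    for clause, ok in check_requirements(ce, "kzvo.isragrid.org.il").items():
        print(f"{clause}: {'ok' if ok else 'FAILS'}")
```

Any clause reported as failing for the real attributes of our CE would be a candidate cause of the "no compatible resources" status.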