Yo,
Dennis van Dok's results for torque / maui are here:
http://wiki.nikhef.nl/grid/Passing_job_requirements_through_the_WMS
AFAICT there is a degree of arbitrariness in which requirements are translated from the JDL to the batch system, and in how. My opinion is that some small team should sit down, decide what the desired behavior is, and then work on implementing this behavior on each of the batch systems. It would be really great if the same behavior were reproduced when passing requirements to an ARC CE or a GRAM 5 CE.
The coordination sounds like a job for EGI, however I think the input to this coordination has to come from a discussion between sites and the users; sites should drive the discussion, as the sites are the parties that understand the use cases from multiple user communities. Since the WLCG multicore TF is already working on related issues, we can probably find at least a subset of the right group of people there. Thomas is one of them!
JT
On Feb 26, 2014, at 19:18 , Thomas Hartmann <[log in to unmask]> wrote:
> Hi Goncalo and Maarten,
>
> after some more testing (and debugging of my script...) we were able to
> get CERequirements piped through to SGE, e.g.
>
> CERequirements = "other.GlueHostMainMemoryVirtualSize > 2000";
> or other.GlueCEPolicyMaxWallClockTime > 30
>
> So far, we have successfully tested passing the CPUTime, WallTime, Mem and
> VMem parameters from the JDL through to the finally submitted SGE wrapper/job.
>
> Do you know of further experience with SGE and other batch systems
> regarding how well the various parameters are implemented/supported?
>
> Cheers and many thanks for your help,
> Thomas
>
>
> On 21.02.2014 17:45, Gonçalo Borges wrote:
>> Hi Thomas
>>
>> Unfortunately, the sge_local_submit_attributes.sh script is still not
>> general enough, and I think the CERequirements functionality is not
>> available there.
>>
>> However, for your precise example, MaxObtainableWallClockTime will be
>> implemented
>> if you set it up as a Requirement and not as a CERequirement. From the
>> sge_local_submit_attributes.sh script:
>>
>> ---*---
>>
>> # We take the same approach as before; testing for an exact match as well
>> # as a lower bound.
>> maxwall=0
>> if [ -n "$GlueCEPolicyMaxWallClockTime_Min" ]; then
>>     maxwall="$GlueCEPolicyMaxWallClockTime_Min"
>> elif [ -n "$GlueCEPolicyMaxWallClockTime" ]; then
>>     maxwall="$GlueCEPolicyMaxWallClockTime"
>> fi
>>
>> maxobtain=0
>> if [ -n "$GlueCEPolicyMaxObtainableWallClockTime_Min" ]; then
>>     maxobtain="$GlueCEPolicyMaxObtainableWallClockTime_Min"
>> elif [ -n "$GlueCEPolicyMaxObtainableWallClockTime" ]; then
>>     maxobtain="$GlueCEPolicyMaxObtainableWallClockTime"
>> fi
>>
>> # Select the larger of the two values; the use of -gt uses
>> # integer parsing, so it effectively enforces input sanitation.
>> if [ "$maxobtain" -gt "$maxwall" ]; then
>>     walltime=$maxobtain
>> else
>>     walltime=$maxwall
>> fi
>>
>> if [ "$walltime" -gt 0 ]; then
>>     # The time unit, according to the Glue schema, is one minute,
>>     # but the SGE s_rt parameter is in seconds.
>>     wallsec=$((walltime * 60))
>>     echo "#$ -l s_rt=${wallsec}"
>> fi
>>
>> ---*---
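[Editorial aside: the larger-of-two selection in the snippet above can be exercised standalone. This is the same logic factored into a function purely for illustration; the function name is made up.]

```shell
# Standalone rehearsal of the wall-time selection above: take the two
# candidate limits (in minutes, per the Glue schema), keep the larger,
# and emit the SGE directive in seconds.
pick_walltime() {
    maxwall=${1:-0}
    maxobtain=${2:-0}
    if [ "$maxobtain" -gt "$maxwall" ]; then
        walltime=$maxobtain
    else
        walltime=$maxwall
    fi
    if [ "$walltime" -gt 0 ]; then
        echo "#$ -l s_rt=$((walltime * 60))"
    fi
}

pick_walltime 30 65   # the larger limit wins: emits #$ -l s_rt=3900
```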
>>
>> Can you confirm that?
>> Cheers
>> Goncalo
>>
>>
>> On 02/21/2014 03:17 PM, Thomas Hartmann wrote:
>>> Hi all,
>>>
>>> I am trying to understand why resource requirements are not properly
>>> forwarded from the jdl to our SGE.
>>>
>>> E.g., I submit my test job requesting 3 job slots and limiting the wall
>>> time with
>>> CpuNumber = 3;
>>> CERequirements = "other.GlueCEPolicyMaxWallClockTime<3900";
>>>
>>> the actual generated batch job submission script requests the number of
>>> cores, but the further requirements are missing; i.e., the submission
>>> parameters (requesting 3 job slots via the -pe parallel environment
>>> switch) are (*)
>>>
>>> I suppose the resource constraints would in principle be handled in SGE
>>> with the "-l" switch for complexes.
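[Editorial aside: a minimal sketch of the directives a complete translation would have to emit for this JDL. The queue and parallel-environment names are taken from the submission script quoted in (*); treating the JDL's 3900 as seconds and using the s_rt complex are assumptions.]

```shell
# Hypothetical sketch of the SGE directives for this JDL (3 slots,
# wall-time limit 3900, assumed to be seconds).  Queue/PE names follow
# the submission script quoted in (*); the s_rt complex is an assumption.
emit_sge_directives() {
    slots=$1      # number of job slots (CpuNumber)
    wallsec=$2    # wall-time limit in seconds
    echo "#$ -q sl6"
    echo "#$ -pe * $slots"
    echo "#$ -l s_rt=$wallsec"
}

emit_sge_directives 3 3900
```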
>>>
>>> However, I do not find any trace of the CERequirements values even in the
>>> job wrapper, while it does contain the job slot request "__nodes=3" (**)
>>>
>>> I guess the translation to SGE submission parameters would in principle
>>> be handled by /usr/libexec/sge_local_submit_attributes.sh
>>>
>>> Thus, I am looking for the point where the CERequirements get lost,
>>> i.e., why they do not end up in the job wrapper.
>>>
>>> CE version is
>>>> glite-ce-service-info --version
>>> CREAM User Interface version 1.2.0
>>>
>>> Cheers and thanks for ideas,
>>> Thomas
>>>
>>> (*)
>>> # SGE directives:
>>> #$ -S /bin/bash
>>> #$ -q sl6
>>> #$ -pe * 3
>>> #$ -v
>>> [log in to unmask]:/var/cream_sandbox/dteam/CN_Thomas_Hartmann_OU_KIT_O_GermanGrid_C_DE_dteam_Role_NULL_Capability_NULL_dteam024/82/CREAM820173573/CREAM820173573_jobWrapper.sh@@@[log in to unmask]:/var/cream_sandbox/dteam/CN_Thomas_Hartmann_OU_KIT_O_GermanGrid_C_DE_dteam_Role_NULL_Capability_NULL_dteam024/proxy/697bb7d9c3e3044803acfe673df1616283275a55_10899012909176
>>>
>>> #$ -v
>>> [log in to unmask]:/var/cream_sandbox/dteam/CN_Thomas_Hartmann_OU_KIT_O_GermanGrid_C_DE_dteam_Role_NULL_Capability_NULL_dteam024/82/CREAM820173573/StandardOutput@@@[log in to unmask]:/var/cream_sandbox/dteam/CN_Thomas_Hartmann_OU_KIT_O_GermanGrid_C_DE_dteam_Role_NULL_Capability_NULL_dteam024/82/CREAM820173573/StandardError
>>>
>>> #$ -m n
>>>
>>> (**)
>>> #!/bin/sh -l
>>> __create_subdir=1
>>> export CE_ID=cream-ge-1-kit.gridka.de:8443/cream-sge-sl6
>>> export
>>> __delegationProxyCertSandboxPath=gsiftp://cream-ge-1-kit.gridka.de/var/cream_sandbox/dteam/CN_Thomas_Hartmann_OU_KIT_O_GermanGrid_C_DE_dteam_Role_NULL_Capability_NULL_dteam024/proxy/697bb7d9c3e3044803acfe673df1616283275a55_10899012909176
>>>
>>> export
>>> __delegationProxyCertSandboxPathTmp=/tmp/697bb7d9c3e3044803acfe673df1616283275a55_10899012909176820173573
>>>
>>> export GRID_JOBID=N/A
>>> export CREAM_JOBID=https://cream-ge-1-kit.gridka.de:8443/CREAM820173573
>>> __brokerinfo=.BrokerInfo
>>> __vo=dteam
>>> __gridjobid=N/A
>>> __creamjobid=CREAM820173573
>>> __executable=/grid/fzk.de/home/thart/MCore/workload.sh
>>> __working_directory=CREAM820173573
>>> __wms_hostname=
>>> __ce_hostname=cream-ge-1-kit.gridka.de
>>> __stdout_file="myjob.out"
>>> __stderr_file="myjob.err"
>>> __cmd_line="\"/grid/fzk.de/home/thart/MCore/workload.sh\" \$* >
>>> \"myjob.out\" 2> \"myjob.err\""
>>> __logger_dest=10.97.104.101:49152
>>> __token_file=""
>>> __token_hostname=""
>>> __token_fullpath=""
>>> __nodes=3
>>> export __delegationTimeSlot=3600
>>> export __copy_proxy_min_retry_wait=60
>>> __copy_retry_count_isb=2
>>> __copy_retry_first_wait_isb=60
>>> __copy_retry_count_osb=6
>>> __copy_retry_first_wait_osb=300
>>> declare -a __environment
>>>
>>>
>>>
>>> __max_osb_size=-1
>>> declare -a __output_file
>>> declare -a __output_transfer_cmd
>>> declare -a __output_file_dest
>>>
>>> __gsiftp_dest_uri=gsiftp://cream-ge-1-kit.gridka.de/var/cream_sandbox/dteam/CN_Thomas_Hartmann_OU_KIT_O_GermanGrid_C_DE_dteam_Role_NULL_Capability_NULL_dteam024/82/CREAM820173573/OSB/
>>>
>>> __output_file[0]="\${workdir}/myjob.out"
>>> __output_transfer_cmd[0]="\${globus_transfer_cmd}"
>>> __output_file[1]="\${workdir}/myjob.err"
>>> __output_transfer_cmd[1]="\${globus_transfer_cmd}"
>>>
>>>
>>> __output_data=0
>>> ...
>>
>>