Yo

There are at least three problems trying to be solved here, which 
accounts for a lot of the confusion.  It would be very good to try to 
continue this discussion in a way that makes it clear which problem(s) 
a proposed solution or comment is addressing.

A. If the USER specifies some REQUIREMENTS at SUBMIT TIME, how can we 
have the GRID LAYER pass these down to the LRMS layer?

A concrete example: user specifies something like

      other.GlueCEPolicyMaxCPUTime > 30

in the JDL.  When the job lands on a site and is submitted to the LRMS 
by the grid layer, the LRMS would be told that the job requires no more 
than 30 minutes of CPU time, for example by doing

       qsub -l cput=30:00 job_script

or perhaps by doing a bare submit and then using 'qalter' to tell the 
LRMS about the CPU-time requirement.
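
For concreteness, a minimal sketch of that second variant, assuming a 
PBS/Torque-style LRMS where qsub prints the job identifier and qalter 
can attach resource limits to an already-queued job:

       # bare submit first, then attach the 30-minute CPU-time limit
       JOBID=$(qsub job_script)
       qalter -l cput=30:00 "$JOBID"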

B. If the SITE has a mix of WORKER NODES, how can we describe the 
different classes of WNs to the outside world, instead of advertising 
that all our machines have the characteristics of our 'worst' WN?

C. If we are able to deal with B, but the WNs of the different classes 
are behind the same gatekeeper, what do we do when an incoming job wants 
a high-class WN but all our free slots are on low-class WNs?

It seems to me that C is what David is trying to address.  I would 
prefer to solve the important problems first; C seems to be at least a 
second-order problem, if not third-order.

I interpreted the original call for comments to be about A above, not B 
or C.  The question for A was: what else can you think of, besides 
'cput', that you'd want passed from the grid layer into the LRMS layer?
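
Off the top of my head, and purely as an illustration (assuming the 
usual Glue attribute names and PBS-style resource limits, not an agreed 
mapping), wall-clock time and memory are obvious candidates alongside 
CPU time:

       # hypothetical mapping from JDL requirements to PBS resources:
       #   other.GlueCEPolicyMaxCPUTime        -> -l cput=...
       #   other.GlueCEPolicyMaxWallClockTime  -> -l walltime=...
       #   other.GlueHostMainMemoryRAMSize     -> -l mem=...
       qsub -l cput=30:00,walltime=1:00:00,mem=512mb job_script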

	J "but I could be wrong" T



David Rebatto wrote:
> Burke, S (Stephen) wrote:
> 
>> LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of 
>> Charles Loomis said:
>>
>>>> 1) Keep the current syntax, allow matching against multiple 
>>>> subclusters, and pass the subcluster name to the batch system.
>>>
>>> This is not a solution to the problem.
>>
>> It's a solution to part of the problem, i.e. that currently jobs may
>> avoid sites even if only 1 WN out of 500 doesn't match the requirement,
>> because you have to publish the most restrictive limit.
>>
>>> I would instead opt for a hybrid approach of 2) and 3).  Allow people 
>>> to define parameters like in 3) and have whatever processes the final 
>>> JDL combine those with any explicit requirements to arrive at the 
>>> full expression.  Only those limits given separately would be passed 
>>> to the local batch system.
>>
>> Yes, that doesn't sound too bad - but in itself it wouldn't solve the
>> above problem, so you might still want to think about doing subclusters
>> properly, or else changing the glue schema to go back to max/min values.
>>
>> Stephen
>>
> 
> Hi,
> my idea was more or less option 3) as proposed by Stephen. But, as he 
> said, this doesn't solve the under-usage problem created by having the 
> min values published in the GRIS. Anyway, if we go with a max/min 
> schema, or if we publish only max values, we have to face another 
> problem: how do we handle jobs dispatched to a CE when there are no free 
> nodes matching the requirements (e.g. because they are busy)?
> The CE could reject such a job and the WMS retry mechanism would 
> submit it somewhere else, but this sounds very inefficient.
> Another solution would be for the CE to keep the job queued until a 
> suitable node is free, but this would kill any hope of the WMS making 
> an intelligent decision, as this additional queueing time would not be 
> visible in the glue schema.
> A third option could be the implementation of a direct WMS <-> CE 
> negotiation before the actual job submission. This would require some 
> completely new code on both WMS and CE, and would be even heavier than 
> the simple check with the CE GRIS which LCG wanted removed from the 
> brokers...
> 
> Sorry if you have already discussed this problem; I've tried to read 
> all the messages in the thread but I could still have missed something...
> 
> Cheers,
> David
>