Hi,

MAXPROC indeed means no more than 160.  We tend to adjust these things 
depending on what's happening on the farm, like if we notice that things 
are essentially empty except for one group.  We wind up adjusting it 
roughly once every two weeks.  Not too bad.

				JT


Dan Schrager wrote:
> Does MAXPROC mean that you won't be able to run more than 160 atlas 
> jobs? Even if the rest of the farm is empty?
> And does it mean that if 230 simulation jobs came in at the same time 
> (no free slot left), the next SFT dteam job would run 48 hours later?
> 
> I would still keep one dteam reserved node.
> 
> If having (almost) all dteam jobs run on the same node is not good, then 
> add a minus sign after the reserved class name:
> 
> SRCFG[dteam]            PERIOD=INFINITY HOSTLIST=eio99.pp.weizmann.ac.il 
> CLASSLIST=dteam-
> 
> This way all nodes will be selected for dteam jobs and the reserved node 
> will be used only as a last resort -- but it will (almost) always be there.
> 
> 
> 
> 
> 
> Jeff Templon wrote:
> 
>> Yo
>>
>> Thinking about this, i tentatively conclude that it's a bad idea to 
>> dedicate a single worker node to dteam jobs.  The reason is that this 
>> WN may not be representative of your whole farm.  We've seen often 
>> enough in the past that some worker nodes are fine while others have 
>> problems.
>>
>> Go for fair shares.
>>
>> A problem worker node will eat jobs, thus there is a reasonable chance 
>> that if it is open to dteam, it will eat a dteam job too ... which is 
>> what you want to happen.  If you have a node dedicated to dteam jobs, 
>> its utilization will likely be lower than the rest of your farm, so 
>> things that die under stress will not die as quickly on this node ... 
>> you get the picture.
>>
>> Something else: smaller sites should be careful about making long 
>> queues.  In the best case, the number of jobs you should expect to be 
>> ending in any period t will only be
>>
>>     N * t / T
>>
>> where N is the number of jobs you have running, and T is how long 
>> these jobs run on average.  This assumes these N jobs have all started 
>> at random times during the last period T (not before, since they would 
>> have by definition already finished, and not after, since then they 
>> would not have started yet ;-)
>>
>> 10 CPUs, ten minutes waiting for a job to end, 24-hour jobs ... expect 
>> 0.07 jobs to end in this period ... in other words you should expect 
>> on average a slot to open up every two hours or so.  In reality it will 
>> be worse since jobs tend to come in batches.
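>>
>> (A quick check of that arithmetic -- just a sketch in Python, using the 
>> numbers from the example above; the variable names are invented, not 
>> part of any site configuration:)
>>
>>     # Expected number of job endings in a window of length t, given
>>     # N running jobs of average duration T, is N * t / T.
>>     n_jobs = 10          # N: jobs running, one per CPU (assumed)
>>     t_window = 10        # t: how long we wait, in minutes
>>     t_job = 24 * 60      # T: average job duration, in minutes
>>
>>     expected_endings = n_jobs * t_window / t_job
>>     print(round(expected_endings, 2))   # ~0.07 jobs per 10 minutes
>>
>>     # Equivalently, the mean gap between job endings is T / N.
>>     mean_gap_hours = (t_job / n_jobs) / 60
>>     print(round(mean_gap_hours, 1))     # ~2.4 hours between free slots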
>>
>>     J "Friday night Grid Philosophy" T
>>
>>
>> Jeff Templon wrote:
>>
>>> yo,
>>>
>>> we use process caps.  here is an abbreviated example:
>>>
>>> GROUPCFG[dteam]     FSTARGET=2   PRIORITY=5000   MAXPROC=32
>>> GROUPCFG[alice]     FSTARGET=15  PRIORITY=100    MAXPROC=100 ADEF=lhc
>>> GROUPCFG[atlas]     FSTARGET=50  PRIORITY=100    MAXPROC=160 ADEF=lhc
>>> GROUPCFG[atlsgm]    FSTARGET=50  PRIORITY=100    MAXPROC=160 ADEF=lhc
>>> GROUPCFG[lhcb]      FSTARGET=35  PRIORITY=100    MAXPROC=230 ADEF=lhc
>>> GROUPCFG[lhcbsgm]   FSTARGET=35  PRIORITY=100    MAXPROC=230 ADEF=lhc
>>> GROUPCFG[cms]       FSTARGET=1-  PRIORITY=1      MAXPROC=10  ADEF=lhc
>>>
>>> GROUPCFG[esr]       FSTARGET=5   PRIORITY=50     MAXPROC=32  ADEF=nlgrid
>>> GROUPCFG[ncf]       FSTARGET=40  PRIORITY=100    MAXPROC=120 ADEF=nlgrid
>>> GROUPCFG[asci]      FSTARGET=40  PRIORITY=100    MAXPROC=120 ADEF=nlgrid
>>> GROUPCFG[pvier]     FSTARGET=5   PRIORITY=100    MAXPROC=12  ADEF=nlgrid
>>>
>>>
>>> ACCOUNTCFG[lhc]         FSTARGET=50                         MAXPROC=230
>>> ACCOUNTCFG[nlgrid]      FSTARGET=50                         MAXPROC=110
>>>
>>> Note that we give dteam a very high priority but a very low fair 
>>> share and a rather severe process cap.  On the other hand, the LHC 
>>> groups all have a rather high fair share, and are limited to 230 
>>> processes in total.  Right now we have 246 CPUs in the farm, so it is 
>>> impossible for just LHC to take all our CPUs.  Sometimes they are all 
>>> full, but this is during times when we have e.g. 180 LHC jobs running, 
>>> 50 from biomed, and 16 from dzero.  But in most cases we are not full, 
>>> so dteam jobs
>>> run immediately.
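>>>
>>> (To make the cap arithmetic concrete -- a hypothetical Python sketch, 
>>> with the numbers taken from the config above and invented names:)
>>>
>>>     farm_cpus = 246   # CPUs in the farm right now
>>>     lhc_cap = 230     # ACCOUNTCFG[lhc] MAXPROC=230
>>>
>>>     # Even with every LHC group at its account cap, some slots
>>>     # remain that LHC can never occupy, so dteam, biomed, dzero
>>>     # etc. always have room to start.
>>>     print(farm_cpus - lhc_cap)   # 16 CPUs guaranteed non-LHC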
>>>
>>> Even when we are full it's not a problem.  For a big site, being full 
>>> isn't so bad because with lots of jobs, you have a relatively large 
>>> number of jobs ending during any given time period.
>>>
>>>                     JT
>>>
>>> Mario David wrote:
>>>
>>>> Hi Dan
>>>> how do you set a WN only to dteam with pbs/maui?
>>>>
>>>> we are having problems because all nodes are full of atlas and cms jobs
>>>> and the dteam SFT doesn't get in, despite the fair shares in maui.conf.
>>>> In the past I tried to set specific nodes to specific groups in 
>>>> qmgr but was not successful.
>>>>
>>>> cheers
>>>>
>>>> Mario
>>>>
>>>> Quoting Dan Schrager <[log in to unmask]>:
>>>>
>>>>
>>>>> Dear Christine,
>>>>>
>>>>> I have deleted your simulation(?) job, run as user dteam at my site, 
>>>>> because it was blocking the single WN reserved for short dteam (SFT 
>>>>> kind) jobs.
>>>>> In the future, please use an atlas certificate for such purposes.
>>>>>
>>>>> Regards,
>>>>> Dan
>>>>>
>>>>>
>