At QM we nominally have 4 production CREAM CEs, which should (or will) support all VOs.
We have about 5000 queued jobs (almost all ATLAS), of which 1000 are 8-core ATLAS jobs, and about 3000 running job slots.
At present 500 LHCb jobs are running with 33 pending, and 250 CMS jobs are running; the rest are ATLAS. We have a reasonable flow of LHCb and CMS jobs. They each get a fair share of about 10% of the cluster resources, so they get a constant throughput of jobs.
If ATLAS jobs dry up, both other VOs are seen to scale up quickly to fill the slack.
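
As a rough illustration of those numbers, here is a small sketch. It takes the "about 3000" slots and "about 10%" shares at face value; the variable names are made up for the example, and this is not how the batch scheduler itself computes fair share.

```python
# Back-of-envelope check of the figures quoted above (all approximate).
total_slots = 3000                        # "about 3000 running job slots"
share_percent = {"lhcb": 10, "cms": 10}   # "about 10% each"
running = {"lhcb": 500, "cms": 250}       # currently running jobs

# "The rest are ATLAS": whatever is left of the running slots.
running["atlas"] = total_slots - sum(running.values())

# Nominal fair-share target in slots for each of the smaller VOs.
target = {vo: total_slots * pct // 100 for vo, pct in share_percent.items()}

for vo in ("lhcb", "cms"):
    diff = running[vo] - target[vo]
    print(f"{vo}: running={running[vo]} target={target[vo]} diff={diff:+d}")
print(f"atlas: running={running['atlas']}")
```

On these round numbers each 10% share is about 300 slots, so the running counts above sit within a couple of hundred slots of their targets.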
dan
* Dr Daniel Traynor, Grid cluster system manager
* Tel +44(0)20 7882 6560, Particle Physics,QMUL
________________________________________
From: Testbed Support for GridPP member institutes <[log in to unmask]> on behalf of Marcus Ebert <[log in to unmask]>
Sent: 08 September 2016 11:47
To: [log in to unmask]
Subject: Re: reported pending jobs when running LHCb and ATLAS jobs on the same site
On Thu, 8 Sep 2016, Love, Peter wrote:
> This queue was configured in our test factories but no longer needed
> there. I've removed it so it should drop to 200 pending. I'd suggest
Thanks Peter!
> LHCb only query their jobs and not let atlas jobs affect their
> submission rate.
>
Well, as far as I understand, LHCb is not actively querying pending jobs
but uses what the ARC CE reports globally, which right now is the
number of pending Grid jobs in our case.
I guess if ATLAS has its own mechanisms to get the number of pending jobs and
doesn't care about what the ARC CE reports for pending jobs, then it could
be changed to report only LHCb jobs?
Or we could set up separate CEs for LHCb and ATLAS.
But before putting in any new setup or config change, it would be good to
get some information about how other sites have handled and solved this.
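
To make the difference concrete, here is a minimal sketch of the throttling as I understand it. This is not LHCb's or the ARC CE's actual code; the function name and the per-VO dictionary are invented for the illustration, and the zero pending LHCb pilots is an assumption. The 400 pending ATLAS jobs and the threshold of 30 are the numbers from this thread.

```python
def should_submit_pilots(pending_by_vo, vo, threshold, per_vo=False):
    """Return True if the pending count seen by `vo` is below its threshold.

    per_vo=False models a CE that reports one global pending number;
    per_vo=True models a CE that reports only that VO's pending jobs.
    """
    if per_vo:
        pending = pending_by_vo.get(vo, 0)
    else:
        pending = sum(pending_by_vo.values())
    return pending < threshold


# Assumed snapshot: 400 pending ATLAS jobs, no pending LHCb pilots.
pending = {"atlas": 400, "lhcb": 0}

global_view = should_submit_pilots(pending, "lhcb", threshold=30)
per_vo_view = should_submit_pilots(pending, "lhcb", threshold=30, per_vo=True)
print(global_view, per_vo_view)
```

With the global number the LHCb check fails because of the ATLAS backlog alone; with a per-VO number it passes.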
Cheers,
Marcus
> Cheers,
> Peter
>
>
>> On 8 Sep 2016, at 11:29, Marcus Ebert <[log in to unmask]> wrote:
>>
>> Hi Peter,
>>
>> It is the new SL7 queue.
>> http://apfmon.lancs.ac.uk/q/UKI-SCOTGRID-ECDF_SL7
>>
>> There are 400 jobs pending in the queue right now.
>>
>> Cheers,
>> Marcus
>>
>> On Thu, 8 Sep 2016, Love, Peter wrote:
>>
>>> Marcus, which queue are you referring to? http://apfmon.lancs.ac.uk/query/ecdf/
>>>
>>> We do in fact throttle based on pending jobs but I'll check the details.
>>>
>>> Cheers,
>>> Peter
>>>
>>>> On 8 Sep 2016, at 11:00, Marcus Ebert <[log in to unmask]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have a question for sites running ATLAS and LHCb jobs. What we see at ECDF right now is that no new LHCb jobs are submitted to our site because of the large number of pending ATLAS jobs. It seems LHCb only submits new pilots when the number of pending jobs is below a threshold (for ECDF it was 10 pending pilots, now it's 30). However, ATLAS submits jobs at such a rate that the number of pending jobs is in the hundreds. And since the total number of pending jobs is that large, no new LHCb jobs get submitted.
>>>> Right now, we have one ARC CE which submits jobs for both VOs.
>>>>
>>>> Could other sites please let me know how they support LHCb and ATLAS at the same time, and how they report the pending/running/total number of jobs through an ARC CE so that they still get LHCb jobs?
>>>>
>>>>
>>>> Cheers,
>>>> Marcus
>>>>
>>>> --
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>
>>>
>>
>
>