Hi,
On Thu, 16 Oct 2008, Kostas Georgiou wrote:
> On Thu, Oct 16, 2008 at 02:21:21PM +0100, Stephen Childs wrote:
>
>> Kostas Georgiou wrote:
>>
>>> It would definitely help once the CE starts passing down job
>>> requirements but does anyone expect the users to bother specifying
>>> the right amount? What will happen with the pilot jobs that obviously
>>> can not have the right requirements since you can't know at submission
>>> time what the real job will need?
>>
>> At least if we move to queues based on memory requirements (and set the
>> default to have a low limit) then over time users will realise they need
>> to _ask_ for the amount of memory they need.
>
> I really hope so. We have a fast queue with a 30min limit and we keep getting
> jobs there from users and they don't seem to notice or care about it. I
> am not even sure that the middleware gives them an error back that tells
> them why their jobs failed (does anyone know if this is the case?).
It doesn't. It's just another aborted job, and since the Grid has so
many of those I guess most users don't notice.
Here we've set up memory limit based queues but don't (yet) kill jobs
going over the limit (I do send nastygrams to users who submit 2GB
jobs to the 500MB queue though).
The initial idea is to give some clues to the batch system without
breaking things. The trouble is that ~no-one puts memory requirements
in their JDL files so the RB/WMS just sprays jobs into the 500, 1000
and 2000 queues.
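
To illustrate the matching problem, here is a minimal Python sketch of how a job could be steered to the smallest queue that fits when it does state a memory requirement. The function name, the MB units and the selection rule are assumptions made for illustration; this is not how the RB/WMS actually matches jobs.

```python
# Hypothetical sketch: pick the smallest memory-limited queue that fits a job.
# The queue sizes are the three mentioned above; the units (MB) are assumed.
QUEUE_LIMITS_MB = [500, 1000, 2000]


def pick_queue(requested_mb):
    """Return the smallest queue limit (MB) that can hold the request.

    Returns None when the job states no memory requirement, since there
    is then nothing to match on and any queue looks equally valid.
    """
    if requested_mb is None:
        return None
    for limit in sorted(QUEUE_LIMITS_MB):
        if requested_mb <= limit:
            return limit
    raise ValueError(f"no queue can hold a {requested_mb} MB job")


if __name__ == "__main__":
    for request in (400, 501, 2000, None):
        print(request, "->", pick_queue(request))
```

With no requirement given, the sketch has nothing to go on, which is the situation described above.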
I'm thinking of adding enforcement at 125%-150% of advertised memory
level. I don't think users can complain too much about that.
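
As a rough sketch of what those enforcement levels would mean in numbers, the Python below works out the kill thresholds for the three queues at 125% and 150%. The helper names and the MB units are assumptions for illustration only, not the batch system's own configuration.

```python
# Hypothetical sketch: kill thresholds at 125% and 150% of advertised memory.
# Queue sizes are assumed to be in MB; integer maths avoids float rounding.
ADVERTISED_MB = [500, 1000, 2000]
ENFORCEMENT_PERCENT = [125, 150]


def kill_threshold_mb(advertised_mb, percent):
    """Memory use (MB) above which a job in this queue would be killed."""
    return advertised_mb * percent // 100


def would_be_killed(used_mb, advertised_mb, percent):
    """True if a job using used_mb exceeds the enforcement threshold."""
    return used_mb > kill_threshold_mb(advertised_mb, percent)


if __name__ == "__main__":
    for advertised in ADVERTISED_MB:
        thresholds = [kill_threshold_mb(advertised, p) for p in ENFORCEMENT_PERCENT]
        print(advertised, "MB queue:", thresholds)
```

Under these assumptions a 2000 MB job in the 500 MB queue is over the threshold at either level, while a job that overshoots slightly (say 600 MB) is not.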
I also only advertise single core boxes because the whole memory/core
thing seems broken (at least the way people use it is not how it's
defined).
If I start to support MPI jobs I'll have to add another queue
and cluster into the BDII, but I don't think there's a way of
stopping non-MPI jobs from using it.
Yours,
Chris.
--
Chris A. Brew ([log in to unmask]) +44 (0)1235 446326 .oO000Oo.
Particle Physics Department, Computing Group. \\ //
Bldg. R1 2.57, RAL, Chilton, Didcot, OXON. OX11 0QX \\ //
The opinions expressed above are not necessarily those \o/
of my employer, my family, my friends or in fact me. X