> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Stephen Burke
>
> Testbed Support for GridPP member institutes
> > [mailto:[log in to unmask]] On Behalf Of Andrew McNab said:
> > The definition should be able to take account of, for example,
> > scenarios like having more job slots than cores but less job slots
> > than cores*hyperthreads.
>
> But you still aren't saying what you think the definition should be ...
> from a practical POV APEL has a job time and a benchmark, and all it can
> do is multiply them.
>
This is why, sooner or later, we're going to have to move to a
'useful work accomplished' system rather than an 'effort expended'
one.
The one thing that more or less saves the current system is the
uselessness of the lcg-CE. With no proper propagation of the
properties of individual jobs, we have to assume that every job
wants the same 50 GB of scratch, 2 GB of RAM, and one CPU core's
worth of hardware. Once that's fixed we'll routinely be able to
turn on things like hyperthreading, turbo mode, and power-saving
states, and load nodes up with as many jobs as they can actually
handle to squeeze the best possible throughput out of them. Our
current system casually wastes a lot of computational resources,
and we're going to have to stop doing that. When we do, the
current time-multiplied-by-benchmark accounting is going to fall
completely to pieces.
It would be a good idea to have its replacement at least vaguely
penciled in before that happens.
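For what it's worth, the 'effort expended' scheme being criticised boils down to a single multiplication, something like the sketch below (the benchmark figure and job times are invented for illustration, not real APEL values):

```python
# Sketch of 'effort expended' accounting: credit each job with
# (elapsed time) x (per-core benchmark score). The benchmark score
# here is a made-up number, not a real HEP-SPEC06 measurement.

BENCHMARK_PER_CORE = 10.0  # hypothetical per-core benchmark score

def normalised_cpu(job_seconds, benchmark=BENCHMARK_PER_CORE):
    """Benchmark-weighted CPU seconds: all the accounting can do."""
    return job_seconds * benchmark

# One job with a dedicated core for an hour:
print(normalised_cpu(3600))        # 36000.0

# Two jobs overcommitted onto one hyperthreaded core run longer each,
# yet both are credited at the full per-core benchmark rate -- the
# scheme has no way to express that they shared the hardware:
print(2 * normalised_cpu(5400))    # 108000.0
```

The point of the sketch is that once slots no longer correspond to identical slices of hardware, the product of time and a fixed benchmark stops measuring anything meaningful.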
Ewan