On 04/04/2011 13:35, Ewan MacMahon wrote:
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes [mailto:TB-
>>
>> That was the initial idea, but the HEPSPEC06 figure depends on more than
>> the CPU model and MHz (eg the kernel version) so it would need to be at
>> least per-site, and really per subcluster. That just gets us back to the
>> CE-queue mapping, which is something Steve's script can do without having
>> to modify what ATLAS makes available via the dashboard jobsummary.
>
> The per resource mapping is going to be a worse approximation than the
> cpuid is
Clearly that's not true for many sites (eg ones that have identical
machines behind a particular CE). And given that we're talking about the
average HEPSPEC06 of the machines in a particular subcluster, averaged
over hundreds or thousands of jobs, the statistical fluctuations arising
from which machines jobs happen to land on are washed out.
However, the errors from the cpuid to HEPSPEC06 mapping are systematic
biases: they always go the same way for a particular machine whose
HEPSPEC06 is under- or over-estimated because its faster or slower
kernel etc isn't taken into account. That means that site always gets
either less or more than it deserves. Great for the winners; bad for the
losers.
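
To put rough numbers on that argument, here is a toy simulation. It is
only a sketch: the per-node HEPSPEC06 values and the lookup-table figure
are invented for illustration, not measurements from any site, and the
function name is made up. It compares crediting every job with the
subcluster average against crediting every job with a single cpuid-based
table value that ignores the kernel etc.

```python
import random

# Hypothetical subcluster: same CPU model, but the true per-core HEPSPEC06
# differs from node to node (values invented for illustration).
TRUE_HS06 = [7.2, 9.0, 10.8]
SUBCLUSTER_AVERAGE = sum(TRUE_HS06) / len(TRUE_HS06)
# Assumed cpuid lookup-table figure that ignores kernel version etc.
CPUID_TABLE_VALUE = 10.0


def relative_error(assumed, n_jobs, rng):
    """Fractional error in the total credited HEPSPEC06 when every job is
    credited `assumed`, with each job landing on a randomly chosen node."""
    true_total = sum(rng.choice(TRUE_HS06) for _ in range(n_jobs))
    return (assumed * n_jobs - true_total) / true_total


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed so runs are repeatable
    for n_jobs in (10, 1000, 100000):
        statistical = relative_error(SUBCLUSTER_AVERAGE, n_jobs, rng)
        systematic = relative_error(CPUID_TABLE_VALUE, n_jobs, rng)
        print(n_jobs, round(statistical, 4), round(systematic, 4))
```

With these made-up numbers the error from using the subcluster average
shrinks roughly as 1/sqrt(N) as the job count grows, while the error from
the table value stays near 11% however many jobs are run.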
> We did raise the possibility of putting each node's HEPSPEC06 in an
> environment variable from whence the pilot could grab it directly -
> would that be better?
Realistically, how quickly can ATLAS (and the other experiments) add
support for recording this and make it available to Steve's script? That
was the motivation for the comment "and it doesn't need ATLAS to change
anything" that was made during the discussion after Steve's slides.
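
For reference, the node-side half of the environment variable idea is
small. The sketch below assumes a variable called HEPSPEC06_PER_CORE,
which is a made-up name and not an agreed convention, and it says nothing
about how the value would get from the pilot into the dashboard, which is
the part that needs the experiments to change something.

```python
import math
import os


def node_hepspec06(env=None, var="HEPSPEC06_PER_CORE"):
    """Return the node's advertised HEPSPEC06 as a float, or None if the
    (hypothetical) variable is unset, unparseable, or not a positive
    finite number."""
    if env is None:
        env = os.environ
    raw = env.get(var)
    if raw is None:
        return None
    try:
        value = float(raw)
    except ValueError:
        return None
    if not math.isfinite(value) or value <= 0:
        return None
    return value
```

Returning None rather than a default lets the caller fall back to the
CE-queue mapping when a site hasn't set the variable.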
Cheers,
Andrew
--------------------------------------------------------------
Dr Andrew McNab, High Energy Physics, University of Manchester