JISCMail - TB-SUPPORT Archives

On 4 April 2011 14:51, Andrew McNab <[log in to unmask]> wrote:
> On 04/04/2011 13:35, Ewan MacMahon wrote:
>>>
>>> -----Original Message-----
>>> From: Testbed Support for GridPP member institutes [mailto:TB-
>>>
>>> That was the initial idea, but the HEPSPEC06 figure depends on more than
>>> the CPU model and MHz (eg the kernel version) so it would need to be at
>>> least per-site, and really per subcluster. That just gets us back to the
>>> CE-queue mapping, which is something Steve's script can do without having
>>> to modify what ATLAS makes available via the dashboard jobsummary.
>>
>> The per resource mapping is going to be a worse approximation than the
>> cpuid is
>
> Clearly that's not true for many sites (eg ones that have identical machines
> behind a particular CE) and given we're talking about average HEPSPEC06 of
> the machines in a particular subcluster averaged over hundreds or thousands
> of jobs then the statistical fluctuations arising from which machines jobs
> happen to land on are washed out.
>
> However, the errors from the cpuid to HEPSPEC06 mapping are systematic
> biases: they always go the same way for a particular machine whose HEPSPEC06
> is under- or over-estimated because its faster or slower kernel etc isn't
> taken into account. That means that site always either gets less or more
> than it deserves. Great for the winners; bad for the losers.
>

So, again... why can't we do a mapping of the tuple (SITE_ID, cpuid)
-> HEPSPEC06?
Both pieces of the input tuple are recorded by ATLAS, and the size of
the resultant lookup table is almost certainly no more than 3x the
number of sites (and therefore basically trivial).
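
A minimal sketch of the kind of lookup I mean, in Python. The site names,
cpuid strings and HEPSPEC06 values are invented for illustration, and the
job record layout (site_id, cpuid, cpu_seconds) is an assumption, not the
actual format of the dashboard jobsummary:

```python
from collections import defaultdict

# Hypothetical per-core HEPSPEC06 values keyed by (site_id, cpuid).
# Site names, cpuid strings and numbers are invented for illustration.
HS06_TABLE = {
    ("SITE-A", "cpu-model-x"): 8.0,
    ("SITE-B", "cpu-model-x"): 9.5,  # same CPU, different kernel or setup
    ("SITE-B", "cpu-model-y"): 11.0,
}


def lookup_hs06(site_id, cpuid):
    """Return the per-core HEPSPEC06 for this pair, or None if unmeasured."""
    return HS06_TABLE.get((site_id, cpuid))


def hs06_seconds_by_site(jobs):
    """Sum cpu_seconds * HEPSPEC06 per site.

    `jobs` is an iterable of (site_id, cpuid, cpu_seconds) records.
    Records with no table entry are returned separately, not guessed.
    """
    totals = defaultdict(float)
    unmapped = []
    for site_id, cpuid, cpu_seconds in jobs:
        hs06 = lookup_hs06(site_id, cpuid)
        if hs06 is None:
            unmapped.append((site_id, cpuid, cpu_seconds))
            continue
        totals[site_id] += cpu_seconds * hs06
    return dict(totals), unmapped
```

Keying on the pair rather than on cpuid alone lets the same CPU model
carry a different figure at each site, which is where the kernel and
other per-site differences would show up. Pairs that nobody has
benchmarked are reported rather than silently defaulted.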

Sam