JISCMail - TB-SUPPORT Archives

Hi,

During the discussion in Steve Lloyd's Tier-2 Hardware Allocation 
Algorithms session in Brighton

( http://www.gridpp.ac.uk/gridpp26/SLL_GridPP26.pdf )

we agreed to maintain a simple file that maps sites to HEPSPEC06 
figures. This will be used by Steve's metrics script to weight the CPU 
seconds from jobs according to the actual performance of the machine 
used. (The current version of Steve's script is based only on the number 
of jobs run: http://pprc.qmul.ac.uk/~lloyd/gridpp/metrics.html )

In practical terms, we're proposing that we maintain this file:

http://www.gridpp.ac.uk/deployment/metrics/cequeue-hepspec06percore.txt

which maps the CE-queue names ("UKI-NORTHGRID-MAN-HEP-ce01-long-pbs" 
etc) to HEPSPEC06 figures per core. The figures submitted need to be the 
weighted average for that CE-queue if there are machines of different 
performance accessible via the same queue.
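
As a rough illustration of the weighted average, here is a minimal Python 
sketch. It assumes the weighting is by number of cores of each machine 
type behind the queue, which is an assumption on our part and still open 
to discussion. The function name and the figures in the example are 
hypothetical.

```python
def weighted_hepspec06_per_core(node_types):
    """Return the core-weighted average HEPSPEC06 per core for one CE-queue.

    node_types is a sequence of (cores, hepspec06_per_core) pairs, one per
    type of machine reachable through the queue. Weighting by core count is
    an assumption made for this sketch.
    """
    node_types = list(node_types)  # allow generators to be passed in
    total_cores = sum(cores for cores, _ in node_types)
    if total_cores <= 0:
        raise ValueError("queue has no cores")
    weighted_sum = sum(cores * hs06 for cores, hs06 in node_types)
    return weighted_sum / total_cores


# Hypothetical queue: 100 cores at 8.0 and 300 cores at 10.0 HEPSPEC06/core
example = weighted_hepspec06_per_core([(100, 8.0), (300, 10.0)])
```

With those made-up numbers the queue would be reported at 9.5 HEPSPEC06 
per core, i.e. (100*8.0 + 300*10.0) / 400.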

These can then be combined with experiment CPU time figures for jobs by 
Steve's script. The ATLAS Dashboard, for instance, has the job CPU 
times, and these can be extracted according to those CE-queue names, 
which are also listed in the 'any ce' pull-down menu on that page.
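
To make the combination step concrete, here is a minimal Python sketch. 
It is not Steve's script. It assumes the mapping file holds one 
"CE-queue-name figure" pair per line separated by whitespace, with 
optional "#" comments; that layout, the function names and the numbers 
used are all assumptions for illustration.

```python
def parse_mapping(text):
    """Parse 'queue-name hepspec06-per-core' lines into a dict.

    The file layout is assumed for this sketch: whitespace-separated
    pairs, blank lines ignored, '#' starts a comment.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, value = line.split()
        mapping[name] = float(value)
    return mapping


def weight_cpu_seconds(cpu_seconds_by_queue, mapping):
    """Scale each queue's CPU seconds by its HEPSPEC06-per-core figure.

    Queues with no entry in the mapping are skipped rather than guessed.
    """
    return {
        queue: seconds * mapping[queue]
        for queue, seconds in cpu_seconds_by_queue.items()
        if queue in mapping
    }


# Hypothetical figure for the queue named in this message
sample = "UKI-NORTHGRID-MAN-HEP-ce01-long-pbs 8.5  # made-up value\n"
weighted = weight_cpu_seconds(
    {"UKI-NORTHGRID-MAN-HEP-ce01-long-pbs": 1000.0},
    parse_mapping(sample),
)
```

Skipping unlisted queues is a design choice in this sketch; the real 
script might prefer to flag them so that sites know an entry is missing.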

Does that all sound reasonable?

We still need to agree which conditions to use when calculating 
the total HEPSPEC06 per machine, especially for machines that have 
hyperthreading enabled when accepting jobs. For more background, there's 
a list of HEPSPEC06 figures for various configurations used by sites in 
the Wiki:

  http://www.gridpp.ac.uk/wiki/HEPSPEC06

Cheers,

  Andrew and Alessandra

--------------------------------------------------------------
Dr Andrew McNab, High Energy Physics, University of Manchester