Hi All,
As requested, I've knocked a script together that goes through the PBS
logs and calculates the CPU (or wall) time delivered per group in
HEPSPEChours, where "group" in this context is a Unix group, such as
atlas or atlasprd. If TB-support allows attachments, then the
file's attached here. Yeah, it's hacky, but it grew up from a
grep-awk-sed one-liner.
Running the script is fairly easy (at Glasgow ;-). For example, to get
the CPU time delivered (in HEPSPEChours and hours) by all jobs since Oct
1st 2008, I run
./account.py -f 20081001 -l 20090604
To get the walltime instead, add the -w flag.
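
For anyone curious what the sums amount to, here is a minimal sketch of the per-group calculation. It is not the attached script. It assumes the usual Torque/PBS end-of-job ("E") record layout, with space-separated key=value fields such as group=, resources_used.cput= and resources_used.walltime=. The score HEPSPEC_PER_CORE and all function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical per-core HEPSPEC score; use your site's measured value.
HEPSPEC_PER_CORE = 8.0


def hms_to_hours(hms):
    """Convert an 'HH:MM:SS' string (hours may exceed 24) to hours."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h + m / 60 + s / 3600


def parse_end_record(line):
    """Return (group, cput_hours, wall_hours) for an 'E' record, else None.

    Assumed layout: 'date time;E;jobid;key=value key=value ...'
    """
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) < 4 or parts[1] != "E":
        return None
    fields = dict(kv.split("=", 1) for kv in parts[3].split() if "=" in kv)
    try:
        return (
            fields["group"],
            hms_to_hours(fields["resources_used.cput"]),
            hms_to_hours(fields["resources_used.walltime"]),
        )
    except KeyError:
        # Record lacks one of the fields we need; skip it.
        return None


def hepspec_hours_by_group(lines, use_walltime=False):
    """Sum HEPSPEC-weighted hours per Unix group over accounting lines."""
    totals = defaultdict(float)
    for line in lines:
        record = parse_end_record(line)
        if record is None:
            continue
        group, cput, wall = record
        hours = wall if use_walltime else cput
        totals[group] += hours * HEPSPEC_PER_CORE
    return dict(totals)
```

Passing use_walltime=True plays the role of the -w flag in this sketch: the same records are read, but the walltime field is summed instead of the CPU time.
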
There's a --help option available and the code's fairly well
commented. Officially this is unsupported, so, in the words of our T2
coordinator - "if it breaks, you get to keep both parts".
The following assumptions are made:
- Your logs live at /var/spool/pbs/server_priv/accounting/ (see --help)
  and are named YYYYMMDD
- Your worker nodes are named nodeXXX (hack if they aren't)
- You have up to two sub-clusters with different HEPSPEC scores. If you
  don't have sub-clusters, and all nodes have the same HEPSPEC score,
  set hepSpec1core and hepSpec2core to be equal in the code. If you have
  more than two sub-clusters, you'll have to do deeper fiddling.
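
To make these assumptions concrete, here is a rough sketch of how date-named logs and a two-way sub-cluster split could be handled. It is an illustration, not the code in the attached script. The names hepSpec1core and hepSpec2core come from the script, but the score values, the SUBCLUSTER1_MAX_NODE boundary and both helper functions are hypothetical.

```python
import re
from datetime import datetime, timedelta

LOG_DIR = "/var/spool/pbs/server_priv/accounting/"

# Hypothetical per-core scores; set both equal if you have no sub-clusters.
hepSpec1core = 8.0
hepSpec2core = 10.0

# Hypothetical boundary: node numbers up to this belong to sub-cluster 1.
SUBCLUSTER1_MAX_NODE = 140


def log_names(first, last):
    """List the YYYYMMDD log file names from first to last, inclusive."""
    start = datetime.strptime(first, "%Y%m%d")
    end = datetime.strptime(last, "%Y%m%d")
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d")
            for i in range(days + 1)]


def score_for_host(exec_host):
    """Pick the per-core score from a nodeXXX-style exec_host value."""
    match = re.match(r"node(\d+)", exec_host)
    if match is None:
        raise ValueError("unexpected worker node name: %r" % exec_host)
    if int(match.group(1)) <= SUBCLUSTER1_MAX_NODE:
        return hepSpec1core
    return hepSpec2core
```

The full path to each log is then LOG_DIR plus the name, and a site with differently named nodes would only need to change the regular expression in score_for_host.
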
If there are any major bugs/problems, let me know...otherwise, enjoy.
Cheers,
Mike.