Another chapter in the eternal discussions about batch scheduling...
Our site runs mostly CPU-intensive LHC VO "production" jobs and
'pheno' VO "MC" jobs. The LHC VO jobs are largely part of
automated bulk workflows, while the 'pheno' jobs come mostly
from individual users (though essentially all 'pheno' work is
strongly LHC-related, of course) and arrive in a very bursty
pattern: occasional runs of something like 2,000-7,000 jobs.
The problem is that users like their jobs to have short latency,
and they dislike seeing unused CPUs; but if I set the scheduler
to let them use idle CPUs even when their "fairshare" is
exhausted, then when LHC VO jobs arrive they have to wait for
the user jobs to finish. Conversely, if I let the flood of LHC
VO jobs take over, the latency of user jobs suffers.
Obviously to have really minimal latency for everybody there
should always be some spare capacity, but that's quite
expensive, and I dislike that too.
I have been thinking of allowing groups/users to go above their
fairshare, but only for short-duration jobs, to minimize the
latency impact of over-allocations (though so many 'pheno' jobs
are long-duration that it may not be worthwhile).
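Concretely, the sort of thing I have in mind in MAUI terms
(syntax from memory, class/group names and numbers purely
illustrative, untested) is a short-walltime class plus soft/hard
processor limits, where the headroom between the soft and hard
limit would ideally only be usable by short jobs:

    # hypothetical maui.cfg fragment -- sketch only
    CLASSCFG[short]  MAX.WCLIMIT=4:00:00    # "short" queue capped at 4h
    # soft limit ~ fairshare level, hard limit allows bursting
    GROUPCFG[pheno]  FSTARGET=15 MAXPROC=200,400

I am not sure MAUI can actually tie the soft-to-hard headroom to
a given class, though, which is part of why I am asking.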
What kind of policy is compatible with LHC/WLCG/GridPP goals? I
ask because I suspect that T2 latencies of a few/several days do
not matter that much.
What kind of fairshares are other sites running?
Any sample MAUI configs to have a look at?
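For comparison, our own fairshare block currently looks roughly
like this (group names and numbers illustrative/rounded):

    # maui.cfg fairshare fragment
    FSPOLICY        DEDICATEDPS    # charge dedicated proc-seconds
    FSDEPTH         7              # seven fairshare windows...
    FSINTERVAL      24:00:00       # ...of one day each
    FSDECAY         0.80           # older windows count less
    FSWEIGHT        10
    GROUPCFG[atlas]  FSTARGET=50
    GROUPCFG[lhcb]   FSTARGET=25
    GROUPCFG[pheno]  FSTARGET=15
    GROUPCFG[ops]    FSTARGET=10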