Matt,
This is just a hunch. When you have "runnable" single jobs sitting in
the queue, are there unrunnable whole-node jobs in front of them?
Reason for asking: Maui pops from the job queue only until it hits the
first job that is unrunnable (for whatever reason). So it never looks
deeper into the queue beyond the first unrunnable job: there may be
runnable jobs further back, but Maui would never reach them. Instead, it
applies some tetris-style "backfilling" algorithm (which is broken).
The problem "may" be down to the following scenario (it's only an idea).
Say the queue is sorted as follows: W1,W2,W3,S1,S2,S3 (i.e. three
whole-node jobs, then three single jobs) and let us say your cluster has
two whole-worker-nodes (WN1,WN2) and two single-worker-nodes (SN1,SN2).
On a scheduler cycle, W1 and W2 would be dispatched to WN1 and WN2,
leaving the queue as W3,S1,S2,S3. Maui cannot schedule the next job (W3)
as no node can take it. Maui does not look deeper into the job queue as
stated above. So, even though S1, S2 and S3 "could" be scheduled, they
are not scheduled. Instead, the "backfilling" algorithm is invoked, which
is supposed to "gap fill" the other jobs. As I said, I am reliably told
that it is broken in some way. I don't know how, but it leaves queued
jobs just sitting there even when slots exist to run them.
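
To make the scenario concrete, here is a toy Python model of the
behaviour described above. It is only a sketch of my description, not
Maui's actual code: the function name schedule_cycle, the (name, kind)
job tuples and the slot counts are all made up for the example (I assume
two slots on each single-worker-node), and backfill is left out entirely.

```python
from collections import deque


def schedule_cycle(queue, free_slots):
    """Toy model of one scheduler cycle (hypothetical, not Maui code).

    queue      -- list of (job_name, kind) tuples in priority order,
                  where kind is "whole" or "single"
    free_slots -- dict mapping kind to the number of free slots

    Jobs are dispatched from the head of the queue only. The cycle
    stops at the first job that cannot run; jobs behind it are never
    examined, even if slots exist for them.
    """
    pending = deque(queue)
    free = dict(free_slots)  # copy so the caller's dict is untouched
    started = []
    while pending:
        name, kind = pending[0]
        if free.get(kind, 0) <= 0:
            break  # head of queue is unrunnable: stop looking
        free[kind] -= 1
        started.append(name)
        pending.popleft()
    return started, list(pending)


# The queue from the scenario: three whole-node jobs, three single jobs.
jobs = [("W1", "whole"), ("W2", "whole"), ("W3", "whole"),
        ("S1", "single"), ("S2", "single"), ("S3", "single")]

# WN1 and WN2 free; SN1 and SN2 with an assumed two slots each.
started, waiting = schedule_cycle(jobs, {"whole": 2, "single": 4})
print("started:", started)
print("waiting:", [name for name, _ in waiting])

# Same cycle after removing the blocked whole-node job W3 from the queue.
unblocked = [job for job in waiting if job[0] != "W3"]
started_after, waiting_after = schedule_cycle(
    unblocked, {"whole": 0, "single": 4})
print("started after removing W3:", started_after)
```

In this model W3 blocks S1, S2 and S3 even though single slots are free,
and they start as soon as W3 is taken out of the queue.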
Summary: you'll only get the single-jobs to run when there are no
unrunnable whole-node-jobs in front of them. To test, kill the
unrunnable whole-node-jobs in front of the queued single-jobs - you will
then see the single-jobs start.
Please let me know if this is the issue. I don't know of any fix yet.
I've been looking for a good excuse to fix this by rolling our own Maui.
Steve
--
Steve Jones
System Administrator             office: 220
High Energy Physics Division     tel (int): 42334
Oliver Lodge Laboratory          tel (ext): +44 (0)151 794 2334
University of Liverpool          http://www.liv.ac.uk/physics/hep/