Hello,
>>
>> Summary: you'll only get the single-jobs to run when there are no
>> unrunnable whole-node-jobs in front of them. To test, kill the
>> unrunnable whole-node-jobs in front of the queued single-jobs - you
>> will then see the single-jobs start.
>>
>> Please let me know if this is the issue. I don't know any fix, yet.
>> I've been looking for a good excuse to fix this issue, by rolling our
>> own maui.
I think this is more or less what was happening: the multicore queue
jobs get to the top, run out of space so they can't be started, and then
maui gets all lazy with the scheduling of the "stuck" jobs. After
increasing my RESERVATIONDEPTH and setting MAXIJOBS=1 for the multicore
queue, scheduling seems to be working as expected for the first time.
Whether this is due to these changes or some other factor, I can't say
(maybe my tears soaking into my keyboard invoked mercy from the dark
gods of cluster computing?). There are still many improvements I want to
try (like Stuart's suggestion of partitioning my nodes), but at least
now I have a baseline that works!
Thanks,
Matt
>>
>> Steve
>>
>
>