> At the moment there aren't plans for integrating SLURM
Why not? Torque, to be fair, is free, but it is buggy as hell, and in real life Torque+Maui doesn't really work well if one has 2k+ compute cores and wants to keep, say, 4k+ jobs in the queue as well. For example, Maui can't really handle fairshare for more than ca. 4k jobs, so if one has more in the queue it stops working and only the first part of the queue is used. LSF is expensive, and the free version of SGE only goes so far: Oracle made it commercial, so future fixes and updates are hard to come by...
Slurm is a free scheduler that, according to reports, works quite well and is in use on clusters with 100k+ cores, meaning it HAS to scale well. So why would it not even be in the plans, considering that site sizes will continue increasing with the continued operation of the LHC, and clusters of 3k-6k cores should become more and more common? I can't even fathom what it would take to have Torque address a 10k core cluster. It'd probably give up during startup :p
Mario Kadastik, PhD
Researcher
---
"Physics is like sex, sure it may have practical reasons, but that's not why we do it"
-- Richard P. Feynman