> Which version? Sometimes packages from the apt source are older than
> the official version.
Sure, /usr/share/doc/gridengine-master/changelog.Debian.gz suggests 6.2-4.
> It is said that the queue is optimised in 6.2U2, however, I haven't
> got time to test it. I'm still using 6.2U1, I don't know why sometimes
> the queue would be freezing for quite a long time, until I manually
> restart the master service. Does anyone have this experience before?
Not I. I have found SGE to be very reliable. What does qstat output at
this stage? Are there a bunch of jobs in the error state that are
blocking the queues? One thing I also do for large installations is to
reduce the scheduler wait time to 5 seconds if there are many small jobs:
$ qconf -msconf
schedule_interval 0:0:5
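To check both of the above without reading the output by eye, something
like the following rough sketch might help. The function names are my
own invention, and it assumes the default qstat column layout (state in
the fifth column, two header lines) and job names without spaces, so
treat it as a starting point only.

```shell
# Count jobs whose state column contains "E" (e.g. Eqw).
# Reads plain qstat output on stdin; assumes the default layout with
# two header lines and the state in column 5 (an assumption).
count_error_jobs() {
    awk 'NR > 2 && $5 ~ /E/ { n++ } END { print n + 0 }'
}

# Print the schedule_interval value from scheduler configuration text
# supplied on stdin (for example the output of "qconf -ssconf").
current_schedule_interval() {
    awk '$1 == "schedule_interval" { print $2 }'
}

# Typical use on the master host:
#   qstat -u '*' | count_error_jobs
#   qconf -ssconf | current_schedule_interval
```

If the first command reports anything other than 0, those jobs are
worth looking at before touching the scheduler settings.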
Also make sure the problem is not simply that your machines are all
overloaded and/or that load_thresholds is not set appropriately. For a
4 core machine the default of 1.75 means that once the load rises above
that, no more jobs will start. For a series of 4 core machines this
value should be at least 3.
Use:
$ qconf -sq all.q
to check this. Mind you, if you are having trouble with SGE, the SGE
forums on sunsource are always very helpful (I have found).
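For the load threshold check, here is a small sketch that pulls the
setting out of the queue configuration and compares a load figure
against a limit. The helper names are hypothetical. Note that the
threshold is often expressed as np_load_avg, which as far as I know is
the load average divided by the number of cores, so it is worth
checking which load value your queue actually uses before comparing
numbers.

```shell
# Print the load_thresholds value from queue configuration text on
# stdin (for example the output of "qconf -sq all.q").
queue_load_thresholds() {
    awk '$1 == "load_thresholds" { print $2 }'
}

# Succeed (exit status 0) if the load in $1 is strictly greater than
# the limit in $2. awk is used because the shell cannot compare
# floating point numbers itself.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

# Typical use (reading /proc/loadavg is Linux specific, an assumption):
#   qconf -sq all.q | queue_load_thresholds
#   load_exceeds "$(cut -d' ' -f1 /proc/loadavg)" 3 && echo "over the limit"
```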
--
Andrew Janke
(http://a.janke.googlepages.com/)
Canberra->Australia +61 (402) 700 883