On Fri, 2006-08-11 at 16:05 +0100, Ian Stokes-Rees wrote:
> I'm curious as to how 4000 jobs on a CE can kill it. Surely the CE for
> a large cluster would be expected to handle 10,000 or more jobs.
4000 (Perl?) processes continuously polling the state of the local batch
system, e.g. by invoking `qstat` every N seconds, could easily raise
the load average of a single machine to debilitating levels.
And the site BDII facility, typically installed on the CE head node, is
rather sensitive to the local system load.
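To put rough numbers on the polling load, here is a back-of-the-envelope sketch in Python. It assumes one polling process per job and one `qstat` invocation per poller per interval; the function name and the sample values for N are made up for illustration, not measured on any CE.

```python
def qstat_calls_per_second(num_pollers: int, poll_interval_s: float) -> float:
    """Aggregate qstat invocation rate if every poller runs one qstat per interval."""
    if poll_interval_s <= 0:
        raise ValueError("poll interval must be positive")
    return num_pollers / poll_interval_s


# Hypothetical values of N (seconds between polls); the real interval is not stated.
for interval in (10, 30, 60):
    rate = qstat_calls_per_second(4000, interval)
    print(f"N = {interval} s: roughly {rate:.0f} qstat invocations per second")
```

Even at one poll per minute per job, that is dozens of `qstat` forks every second on the head node, before counting the work the batch server does to answer each one.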
Cheers,
David
--
David McBride
Department of Computing, Imperial College, London