On Wed, Aug 15, 2007 at 12:46:48PM +0100, Brew, CAJ (Chris) wrote:
> Hi,
>
> Is there any policy about multithreaded jobs?
>
> I've got a few jobs on my system taking the load on their workers to >100,
> and looking at them I see many processes owned by the same user, all
> using more-or-less equal CPU time. And the CPU time for the job is going
> up at 2 x the wall time.
>
> Now the nodes do seem to be coping with the load and the CPU Time
> accounting seems to be correct but this always used to be regarded as
> "cheating".
The problem is that other jobs will run slower, which can cause proxies
to expire, and if your accounting or queue time limit is based on wall
clock time, those jobs will be penalized unfairly.
I haven't given the problem much thought yet, since I haven't noticed
any multithreaded jobs in our cluster. (Actually they don't even have
to be multithreaded; you can just run job1& job2& ...) But the best
solution is to use taskset -c N job to bind the job to a specific CPU.
numactl is also an option on Opteron systems, and probably a better one,
since it can also give preference to the memory local to that CPU.
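To make the pinning idea concrete, here is a minimal Python sketch of
what taskset -c N job does, for Linux only. The names run_pinned and
first_allowed_cpu are made up for illustration and are not part of any
batch system. It assumes the wrapper is allowed to change its own
affinity and that something else has already decided which CPU is free.

```python
import os
import subprocess
import sys


def first_allowed_cpu():
    """Return the lowest-numbered CPU this process may run on (Linux only)."""
    return min(os.sched_getaffinity(0))


def run_pinned(cmd, cpu):
    """Run cmd with its CPU affinity restricted to one core, like `taskset -c cpu cmd`."""

    def pin():
        # Executed in the child after fork and before exec.
        # A pid of 0 means "the calling process".
        os.sched_setaffinity(0, {cpu})

    # preexec_fn is acceptable for a single-threaded wrapper like this one;
    # it is not safe to use from a multithreaded parent.
    return subprocess.run(cmd, preexec_fn=pin, capture_output=True, text=True)


if __name__ == "__main__":
    cpu = first_allowed_cpu()
    # Hypothetical "job": a child Python process that reports its own affinity.
    job = [sys.executable, "-c",
           "import os; print(sorted(os.sched_getaffinity(0)))"]
    result = run_pinned(job, cpu)
    print(result.stdout.strip())
```

Since the affinity mask is inherited across fork and kept across exec,
anything the job starts itself (threads, or job1& job2& style
background processes) stays on the same CPU unless the job resets its
own affinity.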
As you can guess, what I haven't looked at is how to get the batch system
to allocate a free CPU for each job.
Which VO is running the multithreaded jobs btw?
Cheers,
Kostas