Dear Matt,

Sorry for not answering earlier; I was busy yesterday afternoon with other things.

Can you give me some more information, e.g. which LSF version are you running? How many worker nodes and cores do you have in the system? Can you send me the output of the "lshosts", "lshosts -w" and "bhosts -w" commands?

(If you send them directly to me that's fine).
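In the meantime, as a quick sanity check you could count the running jobs in the backend output yourself and compare against the reported nactive. Below is a small sketch (my own, not part of the glite tools) that parses the lrmsinfo-lsf-style output from your mail; since each job record is printed as a Python-style dict literal, ast.literal_eval can read it directly:

```python
import ast

# Sample lrmsinfo-lsf output, copied from the mail below.
sample = """\
nactive 0
nfree 0
now 1291308084
schedCycle 120
{'group': 'sgmops','jobid': '22793','user': 'sgmops019','qtime': 1291307905.0,'queue': 'normal','state':'running','maxwalltime': 9999999999.0}
"""

nactive = 0   # value reported by the backend
running = 0   # jobs we actually count in state 'running'
for line in sample.splitlines():
    line = line.strip()
    if line.startswith("nactive"):
        nactive = int(line.split()[1])
    elif line.startswith("{"):
        job = ast.literal_eval(line)  # parse the job record dict
        if job.get("state") == "running":
            running += 1

# If these disagree, the backend is miscounting, as Matt suspects.
print("reported nactive:", nactive, "counted running:", running)
```

On the output you quoted this counts one running job against a reported nactive of 0, which would confirm that the problem is in the backend command rather than in the scheduler config.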

Cheers,

Ulrich

On Thursday, December 02, 2010 05:55:27 pm Matt Doidge wrote:

> > ps: back to Matt: your mail seems to indicate that somehow the

>

> config got messed up, as indeed the line you added to lrms_backend_cmd

> needs to be present.

>

> > The other place to look, take a good critical look at the vomap section

> > of the scheduler conf file ... YAIM gets this wrong once in awhile (are

> > you using YAIM?)

>

> Heyup,

> I've squinted at the vomap section of the

> lcg-info-dynamic-scheduler.conf (it was created using yaim, along with

> most of our setup). Unless hidden whitespace matters, it looks fine.

> I think the problem lies in the lrms_backend_cmd giving up zeros for

> nactive and nfree:

>

> #/opt/glite/libexec/lrmsinfo-lsf

> nactive 0

> nfree 0

> now 1291308084

> schedCycle 120

> {'group': 'sgmops','jobid': '22793','user': 'sgmops019','qtime':

> 1291307905.0,'queue': 'normal','state':'running','maxwalltime':

> 9999999999.0}

>

> I managed to catch an ops job running here, so nactive should be

> 1... right? (not counting all the other users running jobs on this

> shared cluster).

>

> I'll try to take this up with the glite-info-dynamic-lsf developer.

>

> Thanks a lot,

> Matt

--

--------------------------------------

Dr. Ulrich Schwickerath

CERN IT/PES-PS

1211 Geneva 23

e-mail: [log in to unmask]

phone: +41 22 767 9576

mobile: +41 76 487 5602