A few jobs I submitted to RAL have stayed in the queued state for a
very long time, so I took a look at the status of PBS on
lcgce01.gridpp.rl.ac.uk. This is what I see:
(leonardi@adc0014) ~/grid/test> globus-job-run lcgce01.gridpp.rl.ac.uk /usr/bin/pbsnodes -a
lcg0001.gridpp.rl.ac.uk
     state = free
     np = 2
     speed = 0
     properties = lcgpro
     ntype = cluster
lcg0002.gridpp.rl.ac.uk
     state = free
     np = 2
     speed = 0
     properties = lcgpro
     ntype = cluster
lcg0003.gridpp.rl.ac.uk
     state = free
     np = 2
     speed = 0
     properties = lcgpro
     ntype = cluster
lcg0004.gridpp.rl.ac.uk
     state = free
     np = 2
     speed = 0
     properties = lcgpro
     ntype = cluster
lcg0005.gridpp.rl.ac.uk
     state = free
     np = 2
     speed = 0
     properties = lcgpro
     ntype = cluster
(leonardi@adc0014) ~/grid/test> globus-job-run lcgce01.gridpp.rl.ac.uk /usr/bin/qstat
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
376.lcgce01      STDIN            alice001                0 Q infinite
377.lcgce01      STDIN            alice001                0 Q long
378.lcgce01      STDIN            dteam004                0 Q short
379.lcgce01      STDIN            dteam004                0 Q short
380.lcgce01      STDIN            dteam004                0 Q short
381.lcgce01      STDIN            dteam004                0 Q short
382.lcgce01      STDIN            dteam004                0 Q short
383.lcgce01      STDIN            dteam004                0 Q short
384.lcgce01      STDIN            dteam004                0 Q short
385.lcgce01      STDIN            dteam004                0 Q short
386.lcgce01      STDIN            dteam004                0 Q short
387.lcgce01      STDIN            dteam004                0 Q short
388.lcgce01      STDIN            dteam004                0 Q short
389.lcgce01      STDIN            dteam004                0 Q short
390.lcgce01      STDIN            dteam004                0 Q short
391.lcgce01      STDIN            dteam003                0 Q short
This means that, even though all WNs are free, incoming jobs are
queued in PBS but never started.
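For anyone who wants to repeat this check without reading the listings
by eye, here is a minimal sketch in Python. It assumes the text formats
of pbsnodes -a and qstat are as pasted in this message; the function
names (parse_pbsnodes, parse_qstat, looks_stalled) and the shortened
sample data are my own, for illustration only, and not part of any PBS
or Globus tool.

```python
import re

# Shortened samples in the same layout as the listings in this message.
SAMPLE_PBSNODES = """\
lcg0001.gridpp.rl.ac.uk
     state = free
     np = 2
lcg0002.gridpp.rl.ac.uk
     state = free
     np = 2
"""

SAMPLE_QSTAT = """\
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
376.lcgce01      STDIN            alice001                0 Q infinite
378.lcgce01      STDIN            dteam004                0 Q short
"""


def parse_pbsnodes(text):
    """Return {node_name: state} from `pbsnodes -a` style text."""
    nodes = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if "=" in stripped:
            # Attribute line such as "state = free".
            key, _, value = stripped.partition("=")
            if key.strip() == "state" and current is not None:
                nodes[current] = value.strip()
        else:
            # A line without "=" is taken to be a node name.
            current = stripped
    return nodes


def parse_qstat(text):
    """Return a list of (job_id, user, state) from default qstat text."""
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        # Job lines start with "<number>.<server>"; headers do not.
        if len(fields) >= 5 and re.match(r"\d+\.", fields[0]):
            # Columns: Job id, Name, User, Time Use, S, Queue
            jobs.append((fields[0], fields[2], fields[4]))
    return jobs


def looks_stalled(nodes, jobs):
    """True if every node is free and every listed job is still queued."""
    all_free = bool(nodes) and all(s == "free" for s in nodes.values())
    all_queued = bool(jobs) and all(s == "Q" for _, _, s in jobs)
    return all_free and all_queued


if __name__ == "__main__":
    nodes = parse_pbsnodes(SAMPLE_PBSNODES)
    jobs = parse_qstat(SAMPLE_QSTAT)
    print("free nodes, nothing running:", looks_stalled(nodes, jobs))
```

In practice the two sample strings would be replaced by the captured
output of the two globus-job-run commands above. The check is only a
symptom detector: it says that the scheduler is not dispatching, not
why.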
Could the RAL site managers take a look at the CE and see what is
happening?
Thanks, ciao
Emanuele
--
/------------------- Emanuele Leonardi -------------------\
| eMail: [log in to unmask] - Tel.: +41-22-7674066 |
| IT division - Bat.31 2-012 - CERN - CH-1211 Geneva 23 |
\---------------------------------------------------------/