Hi, yes, this is a known issue, tracked as bug #6603 (resolved in CVS).
To work around it, I modified the lcg-info-dynamic-pbs script to
count the number of entries in the jobs attribute (as reported by
pbsnodes -a) instead of only checking whether the state is free.
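
As a rough illustration of the idea (this is not the actual change made to lcg-info-dynamic-pbs; the function name count_free_slots and the list of states treated as unusable are assumptions made for this sketch), free job slots can be derived from the np and jobs attributes in the "pbsnodes -a" output instead of from the state alone:

```python
import re

# Node states treated as unusable in this sketch (an assumption, adjust as needed).
UNUSABLE_STATES = ("down", "offline", "unknown")


def count_free_slots(pbsnodes_output: str) -> int:
    """Return the sum over usable nodes of (np - number of entries in jobs)."""
    free = 0
    # Node records are separated by blank lines; the first line is the hostname.
    for record in re.split(r"\n\s*\n", pbsnodes_output.strip()):
        attrs = {}
        for line in record.splitlines()[1:]:
            # Split on the first "=" only, since values such as status contain "=".
            key, sep, value = line.partition("=")
            if sep:
                attrs[key.strip()] = value.strip()
        if not attrs:
            continue
        state = attrs.get("state", "")
        if any(bad in state for bad in UNUSABLE_STATES):
            continue
        slots = int(attrs.get("np", "0"))
        # jobs looks like "0/3940.ce101..., 1/3256.ce101..."; one entry per used slot.
        running = [j for j in attrs.get("jobs", "").split(",") if j.strip()]
        free += max(slots - len(running), 0)
    return free
```

With the wn106 record quoted below (np = 2 and two entries in jobs), this counts zero free slots even though the state is still reported as free.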
Eric
Wei Xing wrote:
> Hi, all,
>
> There is a problem with our Torque job manager. On our worker nodes, I
> get the following information from "pbsnodes -a":
>
> =====================================================================================
>
> wn106.grid.ucy.ac.cy
>      state = free
>      np = 2
>      properties = lcgpro
>      ntype = cluster
>      jobs = 0/3940.ce101.grid.ucy.ac.cy, 1/3256.ce101.grid.ucy.ac.cy
>      status = arch=linux,uname=Linux wn106.grid.ucy.ac.cy 2.4.21-20.ELsmp #1 SMP Thu Sep 2 16:47:25 CDT 2004 i686,sessions=3359,nsessions=1,nusers=1,idletime=151543,totmem=3065116kb,availmem=2612552kb,physmem=1024872kb,ncpus=4,loadave=0.00,rectime=1108626381
>
>
> =======================================================================================
>
>
> As you can see, there are two jobs running on wn106, BUT the state is
> still FREE. As a result, jobs keep arriving at my CE and fill up the
> queue.
>
> ce101.grid.ucy.ac.cy:
>                                                             Req'd  Req'd   Elap
> Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
> --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
> 104.ce101.grid. lhcb001  lhcb     STDIN        7798   1  --     -- 48:00 E 45:59
> 2190.ce101.grid lhcb001  lhcb     STDIN       20952   1  --     -- 48:00 E 10:14
> 3256.ce101.grid lhcb001  lhcb     STDIN       23316   1  --     -- 48:00 E 00:00
> 3870.ce101.grid lhcb001  lhcb     STDIN       10357   1  --     -- 48:00 E 15:08
> 3872.ce101.grid lhcb001  lhcb     STDIN       11166   1  --     -- 48:00 E 15:03
> 3940.ce101.grid lhcb001  lhcb     STDIN       30751   1  --     -- 48:00 E 06:18
> 4014.ce101.grid dteam002 short    STDIN        2826   1  --     -- 00:15 E 00:00
> 4652.ce101.grid lhcb001  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4653.ce101.grid lhcb001  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4654.ce101.grid lhcb001  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4715.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4716.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4717.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4718.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4719.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4720.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4721.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4722.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4723.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4724.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4725.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4727.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4726.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4728.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4729.ce101.grid lhcb002  lhcb     STDIN          --   1  --     -- 48:00 Q    --
> 4929.ce101.grid dteam021 short    qsub.test.     --  --  --     -- 00:15 Q    --
>
> ===========================================================
>
> Does anyone have any idea about this?
>
>
> Regards
>
> Wei
>
>
> --
> ============================================================
> Wei Xing, M.Sc.
> Research Associate Tel: 00357-22892663
> Dept. of Computer Science Fax: 00357-22892701
> University of Cyprus email: [log in to unmask]
> PO Box 20537
> CY1678, Nicosia, CYPRUS
--
--------------------------------------------------------------
FEDE ERIC Mail : [log in to unmask]
CPPM Mail : [log in to unmask]
163 Av de Luminy case 902 Tel : (+33) (0)4 91 82 76 41
13288 Marseille Cedex 9 France Fax : (+33) (0)4 91 82 72 99
--------------------------------------------------------------