> What about the pbs_mom logs on one of the workers?
>
>   Steve
>

CE:
03/13/2003 11:46:49;0002;   pbs_mom;Svr;Log;Log opened
03/13/2003 11:46:49;0001;   pbs_mom;Svr;pbs_mom;im_eof, End of File from addr 138.37.50.250:1022
03/13/2003 11:46:50;0002;   pbs_mom;Svr;pbs_mom;caught signal 15
03/13/2003 11:46:50;0002;   pbs_mom;Svr;pbs_mom;Is down
03/13/2003 11:46:50;0002;   pbs_mom;Svr;Log;Log closed
03/13/2003 11:46:51;0002;   pbs_mom;Svr;Log;Log opened
03/13/2003 11:46:51;0002;   pbs_mom;Svr;restricted;hepbf4.ph.qmul.ac.uk
03/13/2003 11:46:51;0002;   pbs_mom;Svr;ideal_load;1.6
03/13/2003 11:46:51;0080;   pbs_mom;n/a;add_static;config[0] add name ideal_load value 1.6
03/13/2003 11:46:51;0002;   pbs_mom;Svr;max_load;2.1
03/13/2003 11:46:51;0080;   pbs_mom;n/a;add_static;config[0] add name max_load value 2.1
03/13/2003 11:46:51;0002;   pbs_mom;n/a;initialize;independent
03/13/2003 11:46:51;0002;   pbs_mom;Svr;pbs_mom;Is up

WN:

03/13/2003 11:46:49;0002;   pbs_mom;Svr;Log;Log opened
03/13/2003 11:46:49;0001;   pbs_mom;Svr;pbs_mom;im_eof, End of File from addr 138.37.50.250:1022
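
For comparing the CE and WN logs, a minimal Python sketch that splits
pbs_mom log lines on their semicolons may help. The field names
(timestamp, event, daemon, obj_class, obj_name, message) are my own labels
for the six columns seen in the lines above, not names taken from the PBS
documentation.

```python
from typing import List, NamedTuple


class PbsLogEntry(NamedTuple):
    # Field names are assumptions based on the column layout of the log lines.
    timestamp: str
    event: str
    daemon: str
    obj_class: str
    obj_name: str
    message: str


def parse_pbs_log_line(line: str) -> PbsLogEntry:
    """Split one log line into its six semicolon-separated fields."""
    # Split on the first five semicolons only, so the message stays whole.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        raise ValueError(f"not a PBS log line: {line!r}")
    return PbsLogEntry(*(p.strip() for p in parts))


def find_messages(lines: List[str], needle: str) -> List[PbsLogEntry]:
    """Return the entries whose message contains the given text."""
    entries = [parse_pbs_log_line(l) for l in lines if l.strip()]
    return [e for e in entries if needle in e.message]


ce_log = [
    "03/13/2003 11:46:49;0002;   pbs_mom;Svr;Log;Log opened",
    "03/13/2003 11:46:50;0002;   pbs_mom;Svr;pbs_mom;caught signal 15",
    "03/13/2003 11:46:51;0002;   pbs_mom;Svr;pbs_mom;Is up",
]

for entry in find_messages(ce_log, "caught signal"):
    print(entry.timestamp, entry.daemon, entry.message)
```

Running the same filter over the WN log would show whether the worker's
pbs_mom also received a signal or restarted at that time.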


On Thu, 13 Mar 2003, Steve Traylen wrote:

> What about the pbs_mom logs on one of the workers?
>
>   Steve
>
> On Thu, 13 Mar 2003, D.Kant wrote:
>
> > Hmm...
> >
> > Log files show this:
> >
> > 03/13/2003 11:46:51;0002; pbs_sched;Svr;Log;Log opened
> > 03/13/2003 11:46:51;0002; pbs_sched;Svr;die;caught signal 15
> > 03/13/2003 11:46:51;0002; pbs_sched;Svr;Log;Log closed
> > 03/13/2003 11:46:51;0002; pbs_sched;Svr;Log;Log opened
> > 03/13/2003 11:46:51;0002; pbs_sched;Svr;main;/usr/pbs/sbin/pbs_sched startup pid 10692
> > 03/13/2003 11:46:52;0080; pbs_sched;Svr;main;brk point 134688768
> >
> > Dave.
> >
> >
> > On Thu, 13 Mar 2003, Frederic Brochu wrote:
> >
> > >         Hello,
> > >
> > > You have certainly a problem with PBS: 6 jobs are queued and none running:
> > >
> > > [brochu@gppui04 scripts]$ globus-job-run hepbf4.ph.qmul.ac.uk /usr/pbs/bin/qstat -a
> > >
> > > hepbf4.ph.qmul.ac.uk:
> > >                                                             Req'd  Req'd   Elap
> > > Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
> > > --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
> > > 0.hepbf4.ph.qmu iteam001 S        STDIN         --   --   1    --  30:00 Q    --
> > > 1.hepbf4.ph.qmu cms001   M        STDIN         --   --   1    --  30:00 Q    --
> > > 2.hepbf4.ph.qmu cms001   L        STDIN         --   --   1    --  30:00 Q    --
> > > 3.hepbf4.ph.qmu cms001   S        STDIN         --   --   1    --  30:00 Q    --
> > > 4.hepbf4.ph.qmu iteam001 S        STDIN         --   --   1    --  30:00 Q    --
> > > 5.hepbf4.ph.qmu iteam001 L        STDIN         --   --   1    --  30:00 Q    --
> > >
> > >                 Best regards,
> > >                                         Frederic
> > >
> > > On Thu, 13 Mar 2003, D.Kant wrote:
> > >
> > > >  Hi Everyone,
> > > >
> > > >   I still don't see why I'm not green.
> > > >   Both hepbf2 and hepbf4 are reporting information which is accessible via IC RB
> > > >   http://www.hep.ph.ic.ac.uk/~dguser/diagnostics.html
> > > >
> > > > >   Details of my setup are here:
> > > >   http://hepwww.ph.qmul.ac.uk/~kant/post-install.html
> > > >
> > > >   Perhaps there are some "misconfigurations" in my site-cfg.h such as
> > > >   the TOP GIIS?
> > > >
> > > > >   If this is potentially a firewall problem, which ports MUST be open?
> > > >
> > > >  My CE reports that both itself and the WN are ready to process jobs:
> > > >  "/usr/pbs/bin/pbsnodes -a"
> > > >  hepbf4.ph.qmul.ac.uk
> > > >      state = free
> > > >      np = 2
> > > >      ntype = cluster
> > > >
> > > >  hepbf5.ph.qmul.ac.uk
> > > >      state = free
> > > >      np = 2
> > > >      ntype = cluster
> > > >
> > > >
> > > > Dave.
> > > >
> > > >
> > > > >Date: Tue, 11 Mar 2003 21:18:29 +0000 (GMT)
> > > > >From: Andrew McNab <[log in to unmask]>
> > > > >To: D.Kant <[log in to unmask]>
> > > > >Subject: Re: New map page and amber status
> > > > >
> > > > >On Tue, 11 Mar 2003, D.Kant wrote:
> > > > >
> > > > >>
> > > > > >> This is good...but clearly I'm a bit puzzled why we don't have
> > > > >> a green :-(
> > > > >>
> > > > >> What things must be "in place" in order to get communication via the
> > > > >> Gridpp RB?
> > > >
> > > > >Looks like I can't get a job submission to you to get past the Scheduled
> > > > >state. This may well be a firewall problem (EDG uses Globus's Two Phase
> > > > >Commit job submission and the firewall requirements for this are even
> > > > >worse than for default globus-job-submit.) It's worth asking on the
> > > > > >tb-support list to get advice from Steve Traylen and Dave Colling (and other
> > > > > >people) about getting things working via the RB.
> > > > >
> > > > >I think you are very close to getting it working btw.
> > > > >
> > > > >Cheers,
> > > > >
> > > > > Andrew
> > > >
> > >
> >
> > --
> > --------------------------------------------------------------
> > Department of Physics            | Dr Dave Kant
> > Queen Mary College               | TEL/FaX: +44 (0)20 7882 5054
> > Mile End Road  London  E1 4NS    | e-mail : [log in to unmask]
> > --------------------------------------------------------------
> >
>
> --
> Steve Traylen
> [log in to unmask]
> http://www.gridpp.ac.uk/
>
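
The "pbsnodes -a" output quoted earlier in the thread can be checked by
script rather than by eye. Below is a rough Python sketch, assuming the
layout shown there (an unindented node name followed by indented
"key = value" lines); the function name parse_pbsnodes is hypothetical.

```python
from typing import Dict


def parse_pbsnodes(text: str) -> Dict[str, Dict[str, str]]:
    """Map each node name to its attributes, as printed by 'pbsnodes -a'."""
    nodes: Dict[str, Dict[str, str]] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate node records
        if not line[0].isspace():
            # An unindented line starts a new node record.
            current = line.strip()
            nodes[current] = {}
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            nodes[current][key.strip()] = value.strip()
    return nodes


# Sample copied from the output quoted in this thread.
sample = """\
hepbf4.ph.qmul.ac.uk
     state = free
     np = 2
     ntype = cluster

hepbf5.ph.qmul.ac.uk
     state = free
     np = 2
     ntype = cluster
"""

nodes = parse_pbsnodes(sample)
free_nodes = [name for name, attrs in nodes.items() if attrs.get("state") == "free"]
print("free nodes:", ", ".join(free_nodes))
```

Both nodes reporting "free" while every job stays in state Q suggests
looking at the scheduler rather than at the nodes themselves.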

--
--------------------------------------------------------------
Department of Physics            | Dr Dave Kant
Queen Mary College               | TEL/FaX: +44 (0)20 7882 5054
Mile End Road  London  E1 4NS    | e-mail : [log in to unmask]
--------------------------------------------------------------