Hi Winnie
The large submission by CMS will have put sufficient stress on your
CE to cause the information system plugin to time out. Then the GRIS
falls back to a default value for running/queued jobs of 4444 - just
a plain old lie, really! (-1 would have been a rather more sensible
choice, I suspect, but the important thing was to signal something
sufficiently off-putting to the RB to ward off any more jobs.)
gStat then just reports what is published in the information system.
Once the CE gets some breathing room, the plugins run correctly
again and you start to report the true values.
At larger sites, where submissions of 1000+ jobs are not rare, this
is a known problem.
BTW, it does help to separate the site BDII (sBDII) from the CE.
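
If you want to sanity-check the published numbers yourself, a rough
Python sketch along these lines might help. It only assumes the 4444
fallback value mentioned above; the function name and the per-VO
dictionary are made up for illustration and are not part of gStat or
the information system.

```python
# Fallback value published when the information plugin times out
# (taken from the explanation above; everything else is hypothetical).
PLACEHOLDER_JOBS = 4444


def real_job_total(per_vo_counts):
    """Sum per-VO job counts, skipping entries equal to the placeholder.

    per_vo_counts: dict mapping a VO name to its published job count.
    Returns (total of believable counts, sorted list of suspect VOs).
    """
    total = 0
    suspect = []
    for vo, count in sorted(per_vo_counts.items()):
        if count == PLACEHOLDER_JOBS:
            # Almost certainly the timeout fallback, not a real queue depth.
            suspect.append(vo)
        else:
            total += count
    return total, suspect


if __name__ == "__main__":
    # Made-up sample: two VOs showing the fallback, one with a real count.
    sample = {"cms": 4444, "atlas": 4444, "lhcb": 12}
    total, suspect = real_job_total(sample)
    print("believable jobs:", total)
    print("VOs showing the placeholder:", ", ".join(suspect))
```

A VO that genuinely had exactly 4444 jobs would be misflagged, but
that is unlikely enough for a quick check.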
Cheers
Graeme
On 14 Sep 2007, at 10:59, Winnie Lacesso wrote:
> Dear *,
>
> My site's gstat page shows that 32,000 jobs arrived instantly then
> presumably ran (or otherwise exited) about midnight last night:
>
> http://goc.grid.sinica.edu.tw/gstat/UKI-SOUTHGRID-BRIS-HEP/
>
> 4000 jobs each for each VO Bristol supports.
> It is true that CMS submitted a few hundred short jobs w/in last
> couple days,
> they all ran & finished; my knowledge of reality bears this out.
> Bris CE PBS accounting logs show about 500 CMS short jobs ran
> yesterday;
> sar on CE does show a fair load peak ca.03:50-04:30 but not truly
> huge.
>
> But 32,000 jobs? False; where is gstat getting this figure?
>
> Gstat shows similar for RalPP of 100,000 jobs waiting & presumably
> run/exited
> in a very short time, very early this morning, 4000 per VO they
> support:
>
> http://goc.grid.sinica.edu.tw/gstat/UKI-SOUTHGRID-RALPP/
>
> Oxford shows right now with ca.90,000 jobs queued, again ca.4000
> per VO:
> http://goc.grid.sinica.edu.tw/gstat/UKI-SOUTHGRID-OX-HEP/
>
> I know these gstat numbers for Bristol are unreal, don't know how
> to confirm
> the other sites.
> Not all sites show this odd peak last night / this morning.
>
> Was there a submission challenge last night?
>
> If we know gstat numbers are bogus, is that data being believed by
> anyone assessing our sites? Or is gstat known to be weirdly
> unreliable?
>
> Thanks.
--
Dr Graeme Stewart - http://wiki.gridpp.ac.uk/wiki/User:Graeme_stewart
ScotGrid - http://www.scotgrid.ac.uk/ http://scotgrid.blogspot.com/