Hi,

I can take a detailed look next week (I'm still on vacation this week), but my gut feeling is that the negative response times come from flaws in the algorithm used to compute values such as "free CPUs" and "total CPUs". I've had to do similar things for the Copernican ERT algorithm, and it is *really hard* to get it right.

Here is an example. In the latest release of Torque, when a WN goes down, the node is recognizable as "down" by using a command like 'pbsnodes -a'. However, any jobs that were running on that node when it went down are still listed as running by a command like 'qstat -f'. So if you calculate running jobs solely on the basis of 'qstat -f', and free/total CPUs solely on the basis of 'pbsnodes -a', you can wind up with a negative value for FreeCPUs: some of your jobs are running on DeadCPUs, but you are subtracting those jobs from an AvailableJobSlots value that does not include DeadCPUs (unless your algorithm is really bad).

Clear?

Too bad LCG rollout rejected my last post, explaining the James Kirk / Elvis Presley theory. Maybe I will talk about that at the next LCG workshop. Fits in well with spacetime warping.

J "yes I really did try to post it" T

On Fri, 2005-01-07 at 00:47, Dimitris Zilaskos wrote:
> Maarten Litmaath, CERN wrote:
> > On Thu, 6 Jan 2005, Rod Walker wrote:
> >
> >
> >>Hi,
> >>The following CE's are showing negative gluecestateestimatedresponsetime:
> >>
> >>ce001.m45.ihep.su:2119/jobmanager-pbs-infinite -1
> >>heplnx131.pp.rl.ac.uk:2119/jobmanager-lcgpbs-lhcbL -3
> >>lunegw.lancs.ac.uk:2119/jobmanager-lcgpbs-infinite -1
> >>node001.grid.auth.gr:2119/jobmanager-lcgpbs-infinite -2147483647
> >>testbed001.phys.sinica.edu.tw:2119/jobmanager-lcgpbs-infinite -624366
> >>
> >>Actually sinica attracts all my jobs until a few are queueing and then
> >>the ERT goes sensible again.
> >
> >
> > The Greek site is seriously warping space-time, allowing a job submitted
> > today to finish some 68 years ago... :-)
>
> Major breakthrough of Grid Computing :)
>
> I suspect that this is caused by some variable wrapping when exceeding a
> particular value; we have around 15 jobs in the infinite queue at this
> moment. Perhaps someone with knowledge of how this
> gluecestateestimatedresponsetime value is calculated can help.
>
> Best regards,
>
> --
> =============================================================================
>
> Dimitris Zilaskos
>
> Department of Physics @ Aristotle University of Thessaloniki, Greece
> PGP key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc
>           http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc
> MD5sum  : de2bd8f73d545f0e4caf3096894ad83f pgp_public_key.asc
> =============================================================================
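
P.S. To make the FreeCPUs example at the top of this message concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the node and job records are made up (they are not real 'pbsnodes -a' or 'qstat -f' output), the function names are hypothetical, and it assumes one job per CPU. It is not the actual information provider code, just the arithmetic of the mismatch and one possible way to keep the two views consistent.

```python
# Hypothetical snapshot: in practice the node view would come from something
# like 'pbsnodes -a' and the job view from something like 'qstat -f'.
nodes = {
    "wn01": {"state": "free", "cpus": 2},
    "wn02": {"state": "down", "cpus": 2},  # node died with a job on it
}
running_jobs = [
    {"id": "1", "node": "wn01"},
    {"id": "2", "node": "wn01"},
    {"id": "3", "node": "wn02"},  # still reported as running
]


def available_slots(nodes):
    """Count CPUs on nodes that are not down (DeadCPUs are excluded)."""
    return sum(n["cpus"] for n in nodes.values() if n["state"] != "down")


def free_cpus_naive(nodes, jobs):
    """Subtract ALL reported running jobs from the live slots.

    Jobs stuck on down nodes are counted too, so this can go negative.
    """
    return available_slots(nodes) - len(jobs)


def free_cpus_consistent(nodes, jobs):
    """Only subtract jobs that sit on live nodes, and never go below zero."""
    live = {name for name, n in nodes.items() if n["state"] != "down"}
    jobs_on_live_nodes = [j for j in jobs if j["node"] in live]
    return max(0, available_slots(nodes) - len(jobs_on_live_nodes))


print(free_cpus_naive(nodes, running_jobs))       # 2 live slots - 3 jobs
print(free_cpus_consistent(nodes, running_jobs))  # 2 live slots - 2 jobs
```

With this made-up snapshot the naive count comes out at -1 and the consistent one at 0. The point is only that both numbers have to be derived from the same set of nodes; whether to clamp at zero or to flag the inconsistency instead is a separate design choice.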