Happy Friday!
A job yesterday produced an interesting log entry:
Oct 28 16:50:09 bse08 kernel: whizard[17144] trap divide error rip:63fd80b rsp:7fff05f30f40 error:0
From a quick scan of the torque/pbs admin manual I can't see how to ask pbs
which jobs were running on this node at that time. The Maui admin manual
mentions mprof & profiler but gives no details, and the moab manual does not
seem to cover mprof or profiler at all...
Does anyone know a quick and easy recipe for asking torque and/or maui
which jobs ran on a given WN in a given time window?
I'd be very grateful for an example!
Is it in the gLite docs somewhere? Or does one query R-GMA, or the MON MySQL
db directly? Again an example would be much appreciated!
I do have pbswebmon for that CE+WN but AFAIK pbswebmon only presents an
instant snapshot of the cluster & keeps no historical info.
Or does anyone know differently? If so, how do I get pbswebmon to show what
jobs ran on a given node in a given time window?
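
To make the kind of example I am after concrete: one avenue that might work on
the torque side is the server's accounting logs. The sketch below is untested
against a real CE and rests on assumptions: that the logs are one file per day
(named YYYYMMDD) under server_priv/accounting, that each line has the form
"date time;type;jobid;key=value ...", and that the end-of-job "E" records carry
exec_host, start and end (epoch seconds). The function names and the path in
the comments are made up for illustration.

```python
def parse_record(line):
    """Split one accounting line into (record_type, job_id, attrs), or None."""
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4:
        return None  # not in the assumed "timestamp;type;jobid;message" form
    _timestamp, rec_type, job_id, message = parts
    attrs = {}
    for token in message.split():
        if "=" in token:
            key, value = token.split("=", 1)
            attrs[key] = value
    return rec_type, job_id, attrs


def jobs_on_node(lines, node, window_start, window_end):
    """Return (job_id, user, start, end) for finished jobs that used `node`
    and whose run time overlaps [window_start, window_end] (epoch seconds)."""
    hits = []
    for line in lines:
        record = parse_record(line)
        if record is None:
            continue
        rec_type, job_id, attrs = record
        if rec_type != "E":  # only end-of-job records carry both start and end
            continue
        # exec_host is assumed to look like "bse07/0+bse08/2"
        hosts = {h.split("/")[0] for h in attrs.get("exec_host", "").split("+") if h}
        if node not in hosts:
            continue
        try:
            start, end = int(attrs["start"]), int(attrs["end"])
        except (KeyError, ValueError):
            continue
        if start <= window_end and end >= window_start:
            hits.append((job_id, attrs.get("user", "?"), start, end))
    return hits


# Hypothetical usage (the path is an assumption, adjust to the local install):
#   with open("/var/spool/pbs/server_priv/accounting/20081028") as f:
#       for hit in jobs_on_node(f, "bse08", window_start, window_end):
#           print(hit)
```

Since an "E" record would only be written when the job ends, a job that was
running at 16:50 might show up in a later day's file, so the following days
would need scanning too.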
Yours in query-mode!
wl