On Tue, May 17, 2005 at 11:14:23AM -0400, Sotomayor, Maniel wrote:
> Hello all,
>
> One of our site's CE PBS servers seems to be stalled. Many jobs have
> been stuck in the "E" state for far too long and never leave it, while
> most of the other jobs sit in the "Q" state with a message like "Not
> Running: Draining system to allow starving job to run". I'm afraid to
> kill any of the jobs. What would be the correct procedure for
> correcting this? I restarted pbs_mom on one of the working nodes that
> was executing an "E" state job and then restarted the pbs_server
> process, but that didn't help.
>
> Any help?
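In "/var/spool/pbs/sched_priv/sched_config" set:
    help_starving_jobs false ALL
and then restart the pbs_sched and pbs_server daemons. The "Draining
system" message comes from the scheduler holding back queued jobs in
favour of a starving one, so switching that option off should let the
"Q" jobs start again.
Maybe this will help.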
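
The config change above could be scripted roughly as follows. This is only a sketch: the path is the one given in this message, while the function name disable_help_starving_jobs, the SCHED_CONFIG variable and the ".bak" backup are made up for illustration. It assumes the option appears at most once in the file, with or without a trailing colon after the keyword.

```shell
# Sketch only. The path comes from the message above; the function name,
# the SCHED_CONFIG variable and the .bak backup are illustrative assumptions.
SCHED_CONFIG="${SCHED_CONFIG:-/var/spool/pbs/sched_priv/sched_config}"

disable_help_starving_jobs() {
    conf="$1"
    # Keep a copy of the current scheduler config before changing it.
    cp -p "$conf" "$conf.bak" || return 1
    # Rewrite an existing, uncommented help_starving_jobs line.
    sed 's/^[[:space:]]*help_starving_jobs[[:space:]:].*$/help_starving_jobs false ALL/' \
        "$conf.bak" > "$conf" || return 1
    # Append the setting if the file had no such line at all.
    grep -q '^help_starving_jobs false ALL$' "$conf" ||
        echo 'help_starving_jobs false ALL' >> "$conf"
}

# Example call (run as root on the PBS server host):
#   disable_help_starving_jobs "$SCHED_CONFIG"
```

The scheduler only picks up the new value once pbs_sched has been restarted; how the daemons are stopped and started (init scripts or by hand) depends on the installation, so that step is left out of the sketch.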
Cheers,
Patryk
--
Patryk Lason
phone: (+48 12) 6323355 e.107, http://www.cyfronet.pl