On Fri, 1 Sep 2006 [log in to unmask] wrote:
> [...]
> >
> > It reports that '78.ce.hpc.iit.bme.hu' detected with unexpected state '11', and
> > if we run
> > [ce] /var/spool/pbs/server_priv/jobs > checkjob 78
> > ERROR: 'checkjob' failed
> > ERROR: cannot locate job '78'
Google found these notes from a Fermilab admin for a similar problem:
-----------------------------------------------------------------------------
qsub: Invalid request MSG=job 2424790.lumber-clued0.fnal.gov in unexpected
state 'TRANSICM'
Job files for 2424790 are on d0cabsrv1, but they appear to be corrupted or
unreadable. Qdel fails with "unknown job id". Only solution is to rm job
files and restart the pbs server.
[root@d0cabsrv1 jobs]# qstat -f 2424790.lumber-clued0@d0cabsrv1
qstat: Unknown Job Id 2424790.lumber-clued0.fnal.gov
[root@d0cabsrv1 jobs]# mv 2424790* /tmp/
[root@d0cabsrv1 jobs]# service pbs_server restart
-----------------------------------------------------------------------------
Try moving the files for job 78 out of the way (they should be in
/var/spool/pbs/server_priv/jobs, going by your prompt), then restart the
pbs server.
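For what it's worth, here is a rough sketch of the same steps for job 78, wrapped in a small shell function so the move is easy to check before touching the real spool. The function name quarantine_job and the holding directory /tmp/pbs-job-78 are made up for illustration. The jobs path is taken from your prompt, and the "<id>." filename prefix is an assumption about how your server names job files, so look at an ls of the directory first.

```shell
#!/bin/sh
# Sketch only: move the on-disk files for one job out of the server's
# jobs directory. quarantine_job is a hypothetical helper, not a PBS command.
quarantine_job() {
    jobs_dir=$1   # e.g. /var/spool/pbs/server_priv/jobs (from the prompt above)
    job_id=$2     # numeric job id, e.g. 78
    dest=$3       # holding directory; files are moved, not deleted

    mkdir -p "$dest" || return 1
    # Assumption: job files start with "<id>." (as in 78.ce.hpc.iit.bme.hu).
    # Matching on the dot keeps job 780 or 7812 from being caught too.
    for f in "$jobs_dir/$job_id".*; do
        [ -e "$f" ] || continue   # glob matched nothing
        mv -- "$f" "$dest"/ || return 1
    done
}

# On the server, as root (left commented out so nothing runs by accident):
# quarantine_job /var/spool/pbs/server_priv/jobs 78 /tmp/pbs-job-78
# service pbs_server restart
```

Moving rather than deleting means the files can be put back if the restart does not clear the problem.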