Hi Lorne,
> Our jobs are hanging in the job manager. Can someone explain how to get
> debug output from the job manager (lcgpbs).
I had a look at your CE and submitted a test job straight to PBS,
bypassing the job manager:
-----------------------------------------------------------------------------
[sgmops17@wipp-ce ~]$ echo date | qsub -q ops
824825.wipp-ce.weizmann.ac.il
-----------------------------------------------------------------------------
[sgmops17@wipp-ce ~]$ qstat 824825.wipp-ce.weizmann.ac.il
qstat: Unknown Job Id 824825.wipp-ce.weizmann.ac.il
You have new mail in /var/mail/sgmops17
-----------------------------------------------------------------------------
[sgmops17@wipp-ce ~]$ mail
Mail version 8.1 6/6/93. Type ? for help.
"/var/mail/sgmops17": 22 messages 22 new
[...]
& $
Message 22:
From [log in to unmask] Thu Feb 19 15:18:23 2009
Date: Thu, 19 Feb 2009 15:18:23 +0200
From: adm <[log in to unmask]>
To: [log in to unmask]
Subject: PBS JOB 824825.wipp-ce.weizmann.ac.il
Precedence: bulk
PBS Job Id: 824825.wipp-ce.weizmann.ac.il
Job Name: STDIN
An error has occurred processing your job, see below.
Post job file processing error; job 824825.wipp-ce.weizmann.ac.il on host
eio55.weizmann.ac.il/0
Unable to copy file /var/spool/pbs/spool/824825.wipp.OU to
[log in to unmask]:/home/sgmops17/STDIN.o824825
>>> error from copy
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: POSSIBLE DNS SPOOFING DETECTED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
The RSA host key for wipp-ce.weizmann.ac.il has changed,
and the key for the according IP address 192.114.102.100
is unknown. This could either mean that
DNS SPOOFING is happening or the IP address for the host
and its host key have changed at the same time.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
6b:37:eb:48:2e:b2:2c:57:dd:30:41:bc:0e:83:4d:f3.
Please contact your system administrator.
Add correct host key in /home/sgmops17/.ssh/known_hosts to get rid of this
message.
Offending key in /etc/ssh/ssh_known_hosts:247
RSA host key for wipp-ce.weizmann.ac.il has changed and you have requested
strict checking.
Host key verification failed.
lost connection
>>> end error output
Output retained on that host in: /var/spool/pbs/undelivered/824825.wipp.OU
-----------------------------------------------------------------------------
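So the job itself ran on the WN (eio55), but PBS could not copy the
output back to the CE: the host key the WN has on record for wipp-ce no
longer matches, and ssh refuses to connect under strict checking.
You probably need to get rid of the stale wipp-ce entries on all WNs.
The error points at /etc/ssh/ssh_known_hosts line 247, and any stale
~/.ssh/known_hosts files should go as well.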
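
For what it's worth, here is a rough sketch of how the stale entry could
be dropped on a WN. The function name remove_host_key is made up for
this mail, and it assumes plain (unhashed) known_hosts entries, so treat
it as a starting point rather than a tested procedure for your site.

```shell
# Hypothetical helper (not part of PBS or OpenSSH): remove every line of a
# known_hosts file whose host field lists the given name.
# Assumes unhashed entries of the form "name1,name2 keytype key".
# Usage: remove_host_key <hostname> <known_hosts file>
remove_host_key() {
    host="$1"
    file="$2"
    # Keep a backup next to the original before rewriting it.
    cp -p "$file" "$file.bak" || return 1
    awk -v h="$host" '
        {
            # The first field is a comma-separated list of names/addresses.
            n = split($1, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == h) next   # stale entry: skip this line
            print                          # everything else is kept
        }' "$file.bak" > "$file"
}

# Example call, to be run as root on each WN (left commented out here):
# remove_host_key wipp-ce.weizmann.ac.il /etc/ssh/ssh_known_hosts
```

After removing the entry, the current key of the CE has to be added back,
and it is worth comparing it with the fingerprint quoted in the mail above
before trusting it. On OpenSSH versions that support it, ssh-keygen -R
<hostname> -f <file> does the removal in one step.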