Shalom Yan,
Have you tried running "qsub pbs_sub" as dteam001 (su - dteam001),
where pbs_sub is an executable file containing:
#!/bin/bash
date
You may find that the pbs_sub.o... and pbs_sub.e... files do not appear
in the ~dteam001 directory. If so, ssh host-based authentication is not
working between the CE and the WNs.
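If it helps, here is a rough sketch for checking whether both output files came back once the job has had time to finish. The function name check_pbs_output is just something I made up for illustration; it relies only on the usual PBS naming of <jobname>.o<jobid> and <jobname>.e<jobid>.

```shell
#!/bin/bash
# Sketch only: check_pbs_output is a hypothetical helper, not a PBS tool.
# Usage: check_pbs_output DIR JOBNAME
# Returns 0 if both JOBNAME.o* and JOBNAME.e* exist in DIR, 1 otherwise.
check_pbs_output() {
    local dir=$1 job=$2 kind missing=0
    for kind in o e; do
        # Expand the glob into the positional parameters; if nothing
        # matches, $1 stays the literal pattern and the -e test fails.
        set -- "$dir/$job.$kind"*
        if [ ! -e "$1" ]; then
            echo "missing: $dir/$job.$kind<jobid>"
            missing=1
        fi
    done
    return "$missing"
}

# Example, run as dteam001 after submitting the job:
#   check_pbs_output "$HOME" pbs_sub
```

A non-zero return only says the files are absent; it does not by itself prove that ssh is the cause, so treat it as a first hint.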
Is the output of "pbsnodes -l" okay? (It should be empty, unless some
WNs are down or not seen by the CE.)
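To make that check a bit more explicit, something like the following could be piped after "pbsnodes -l". It is only a sketch: summarize_pbsnodes is a made-up name, and it assumes each output line has the node name in the first column and its state in the second.

```shell
#!/bin/bash
# Sketch only: summarize_pbsnodes is a hypothetical helper.
# Reads the output of "pbsnodes -l" on stdin.
# Prints one line per listed node and returns 1 if any node was listed;
# prints a short "all clear" message and returns 0 if the input was empty.
summarize_pbsnodes() {
    awk '
        NF  { printf "problem node: %s (%s)\n", $1, $2; n++ }
        END { if (n) exit 1; print "all nodes look up" }
    '
}

# Example, on the CE:
#   pbsnodes -l | summarize_pbsnodes
```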
To fix ssh(d) host-based authentication, the "pbsnodes -l" output, and
the PBS system in general, make sure that:
1) on CE
a) file /etc/ssh/sshd_config has these two lines:
Subsystem sftp /usr/libexec/openssh/sftp-server
HostbasedAuthentication yes
b) file /etc/ssh/ssh_config contains these two lines:
EnableSSHKeysign yes
HostbasedAuthentication yes
in the "Host *" section.
c) file /etc/ssh/shosts.equiv contains the list of WNs, one per line.
d) file /etc/hosts.equiv contains also the list of WNs, one per line.
2) on each WN
a) file /etc/ssh/ssh_config has lines:
EnableSSHKeysign yes
HostbasedAuthentication yes
in the "Host *" section.
b) file /etc/ssh/sshd_config contains lines:
Subsystem sftp /usr/libexec/openssh/sftp-server
HostbasedAuthentication yes
c) file /etc/ssh/shosts.equiv exists and contains the list of WNs, one
per line (in case you want to be able to submit a job from a WN as
well as from the CE).
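The checks above can be sketched as a small script to run on the CE and on each WN. The function names has_directive and check_hostbased are my own invention. The matching is deliberately simple: it looks for an uncommented "Keyword value" line anywhere in the file, and it does not understand "Host" sections or the "Keyword=value" form, so please treat it as a quick sanity check rather than a real parser.

```shell
#!/bin/bash
# Sketch only: both functions are hypothetical helpers.

# has_directive FILE KEYWORD VALUE
# Succeeds if FILE contains an uncommented "KEYWORD VALUE" line.
# Keywords are matched case-insensitively, values case-sensitively.
has_directive() {
    awk -v k="$2" -v v="$3" '
        tolower($1) == tolower(k) && $2 == v { found = 1 }
        END { exit !found }
    ' "$1"
}

# check_hostbased SSH_CONFIG SSHD_CONFIG SHOSTS_EQUIV
# Reports every missing item and returns 1 if anything is missing.
check_hostbased() {
    local rc=0
    has_directive "$1" EnableSSHKeysign yes ||
        { echo "$1: 'EnableSSHKeysign yes' not found"; rc=1; }
    has_directive "$1" HostbasedAuthentication yes ||
        { echo "$1: 'HostbasedAuthentication yes' not found"; rc=1; }
    has_directive "$2" HostbasedAuthentication yes ||
        { echo "$2: 'HostbasedAuthentication yes' not found"; rc=1; }
    # -s is true only if the file exists and is not empty
    [ -s "$3" ] || { echo "$3: missing or empty"; rc=1; }
    return "$rc"
}

# Example:
#   check_hostbased /etc/ssh/ssh_config /etc/ssh/sshd_config \
#                   /etc/ssh/shosts.equiv
```

Remember that sshd has to be restarted (or told to reload its configuration) before changes to sshd_config take effect.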
Try it and let me know how it goes.
L'hit,
Dan
Yan Ben-Hammou wrote:
>Hi,
>I need help. I have a problem running a job. The creation of the proxy is OK.
>Then I try something like:
>globus-job-run lcfgng.cs.tau.ac.il /bin/ls
>I get error 74:
>GRAM Job submission failed because the job manager failed to open stderr.
>
>In /home/atlas001 (I am in the ATLAS VO), I have the file
>gram_job_mgr_23229.log
>in which the first error seems to be:
>GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED
>This comes just after it tried to open stdout and stderr via https.
>
>Then I tested something else:
>
>globus-job-run -stdout /tmp/yan.out -stderr /tmp/yan.err lcfgng.cs.tau.ac.il /bin/ls
>
>Then I get the two files yan.out and yan.err in the /tmp directory, with the
>correct output in yan.out, but my job doesn't stop and the files simply
>stay in the /tmp directory.
>
>I don't know if I was clear ....
> Thanks for your help,
> Yan
>
>