Shalom Yan,

Have you tried running "qsub pbs_sub" as dteam001 (su - dteam001),
where pbs_sub is an executable file containing:
#!/bin/bash

date

You may find that the pbs_sub.o... and pbs_sub.e... files do not show
up in the ~dteam001 directory.
That means ssh host-based authentication does not work between the
CE and the WNs.
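
A quick way to test this directly, without going through PBS, is to
try a host-based login from the CE to one WN while logged in as
dteam001. The lines below are only a sketch: the function name
check_hostbased, the SSH_CMD override and the host wn01.example.org
are made up for illustration; the -o options are standard OpenSSH
client options.

```shell
#!/bin/bash
# Sketch: test host-based ssh login from the CE to one WN.
# Run it as the pool account (for example after "su - dteam001").
# SSH_CMD is a hypothetical hook so the ssh binary can be replaced.
SSH_CMD="${SSH_CMD:-ssh}"

check_hostbased() {
    local wn="$1"
    # BatchMode=yes: never prompt, so a password fallback counts as failure.
    # PreferredAuthentications=hostbased: try host-based authentication only.
    "$SSH_CMD" -o BatchMode=yes \
               -o HostbasedAuthentication=yes \
               -o PreferredAuthentications=hostbased \
               "$wn" hostname
}

# Usage (wn01.example.org is a placeholder for one of your WNs):
#   check_hostbased wn01.example.org && echo "host-based auth OK"
```

If the command prints the WN's hostname without asking for a password,
host-based authentication works in that direction. Adding -v to the ssh
options shows which authentication methods were tried.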

Is the output of "pbsnodes -l" okay? (It should print nothing, unless
some WNs are down or not seen by the CE.)
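
If you want to script that check, something like the following could
do. It is a sketch that assumes "pbsnodes -l" prints one line per
problem node in the form "name state"; the function name
list_bad_nodes is made up.

```shell
#!/bin/bash
# Sketch: read "pbsnodes -l" style output on stdin and print only the
# node names. Assumes lines of the form "<node name> <state>".
list_bad_nodes() {
    awk 'NF >= 2 { print $1 }'
}

# Usage on the CE:
#   pbsnodes -l | list_bad_nodes
# No output means no node is reported as down or unreachable.
```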

To fix ssh(d) host-based authentication, the "pbsnodes -l" output, and
the PBS system in general, make sure that:

1) on CE

a) file /etc/ssh/sshd_config has these two lines:
    Subsystem       sftp    /usr/libexec/openssh/sftp-server
    HostbasedAuthentication yes

b) file /etc/ssh/ssh_config contains these two lines:
        EnableSSHKeysign yes
        HostbasedAuthentication yes
in the "Host *" section.

c) file /etc/ssh/shosts.equiv contains the list of WNs, one per line.

d) file /etc/hosts.equiv also contains the list of WNs, one per line.

2) on each WN

a) file /etc/ssh/ssh_config has lines:
    EnableSSHKeysign yes
    HostbasedAuthentication yes
in the "Host *" section.

b) file /etc/ssh/sshd_config contains lines:
    Subsystem       sftp    /usr/libexec/openssh/sftp-server
    HostbasedAuthentication yes

c) file /etc/ssh/shosts.equiv exists and contains the list of WNs, one
per line (in case you want to be able to submit a job from a WN as
well as from the CE).
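
To go through the checklist above on each machine, a small helper like
this one may save some typing. It is only a sketch: has_setting is a
made-up name, it only understands the "Keyword value" form (not
"Keyword=value"), and it does not check that the line sits inside the
"Host *" section.

```shell
#!/bin/bash
# Sketch: succeed if FILE has an uncommented "KEYWORD VALUE" line.
# The keyword is matched case-insensitively, the value exactly.
has_setting() {
    # $1 = file, $2 = keyword, $3 = expected value
    awk -v k="$2" -v v="$3" '
        /^[[:space:]]*#/ { next }                  # skip comment lines
        tolower($1) == tolower(k) && $2 == v { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Usage on the CE or on a WN:
#   has_setting /etc/ssh/sshd_config HostbasedAuthentication yes \
#       || echo "sshd_config: HostbasedAuthentication yes is missing"
#   has_setting /etc/ssh/ssh_config EnableSSHKeysign yes \
#       || echo "ssh_config: EnableSSHKeysign yes is missing"
#   has_setting /etc/ssh/ssh_config HostbasedAuthentication yes \
#       || echo "ssh_config: HostbasedAuthentication yes is missing"
```

Keep in mind that sshd reads sshd_config only at startup, so it has to
be restarted or reloaded after you change that file.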

Try it and let me know how it goes.

L'hit,
Dan








Yan Ben-Hammou wrote:

>Hi,
>I need help. I have a problem running a job. The creation of the proxy is ok.
>Then I try something like:
>globus-job-run lcfgng.cs.tau.ac.il /bin/ls
>i obtain the error 74 :
>GRAM Job submission failed because the job manager failed to open stderr.
>
>and in the /home/atlas001 (i am in the ATLAS VO), i have the file
>gram_job_mgr_23229.log
>in which the first error seems to be :
>GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED
>this comes just after it tried to open the stdout and stderr via https.
>
>then i test something else :
>
>globus-job-run -stdout /tmp/yan.out -stderr /tmp/yan.err
>lcfgng.cs.tau.ac.il /bin/ls
>
>then i obtain the 2 files yan.out and .err in the /tmp directory with the
>good answer in the yan.out but my job doesn't stop and the files simply
>stay in the /tmp directory.
>
>i don't know if i was clear ....
>	thanks for your help
>		Yan
>  
>