It looks as if PBS is not working correctly on your cluster. Try to set up
and start PBS before you run /opt/edg/sbin/globus-initialization.sh.
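
As a rough sketch of what "PBS is working" could mean on a WN, something
like the following can be run before the initialization script. It assumes
the PBS home is /var/spool/pbs, as in the paths from your log; the function
names check_pbs_files and check_pbs_server are made up for illustration.

```shell
#!/bin/sh
# Pre-flight check for a Torque/PBS worker node (illustrative sketch only).
# Assumption: PBS home is /var/spool/pbs, taken from the paths in the log.

check_pbs_files() {
    # $1: PBS home directory (hypothetical parameter, optional)
    pbs_home="${1:-/var/spool/pbs}"
    rc_files=0
    # server_name must exist and be non-empty, otherwise the clients
    # report "No default server name."
    if [ -s "$pbs_home/server_name" ]; then
        echo "OK: server_name = $(cat "$pbs_home/server_name")"
    else
        echo "MISSING: $pbs_home/server_name"
        rc_files=1
    fi
    # pbs_mom needs its private directory to start.
    if [ -d "$pbs_home/mom_priv" ]; then
        echo "OK: $pbs_home/mom_priv exists"
    else
        echo "MISSING: $pbs_home/mom_priv"
        rc_files=1
    fi
    return $rc_files
}

check_pbs_server() {
    # qstat -B asks for the batch server status; it fails when no
    # server can be reached.
    if qstat -B >/dev/null 2>&1; then
        echo "OK: qstat can reach the PBS server"
    else
        echo "FAIL: qstat cannot reach the PBS server"
        return 1
    fi
}
```

With the functions loaded, one way to use them is
"check_pbs_files && check_pbs_server && /opt/edg/sbin/globus-initialization.sh",
so that the initialization script only runs once both checks pass.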
Andreas
On Tue, 15 Feb 2005, Sotomayor, Maniel wrote:
> Hello,
>
> I'm trying to configure my WN farm with Torque support as described in the documentation, without success. I'm also getting inconsistent errors while running the configure scripts.
> Here is what is logged after creating the job manager config file.
>
> VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
>
> Setting up condor gram reporter in MDS
> ----------------------------------------
> configure: error: Cannot locate condor_q
> loading cache /dev/null
> checking for condor_q... no
> Error locating condor commands, aborting!
> Setting up lsf gram reporter in MDS
> ----------------------------------------
> configure: error: Cannot locate lsload
> loading cache /dev/null
> checking for lsload... no
> Error locating LSF commands, aborting!
> configure: warning: Cannot locate mpirun
> loading cache ./config.cache
> checking for mpirun... (cached) no
> creating ./config.status
> creating fork.pm
> configure: warning: Cannot locate mpirun
> loading cache /dev/null
> checking for mpirun... no
> checking for qdel... /usr/bin/qdel
> checking for qstat... /usr/bin/qstat
> checking for qsub... /usr/bin/qsub
> checking for ssh... /usr/bin/ssh
> updating cache /dev/null
> creating ./config.status
> creating /opt/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm
> No default server name.
> qstat: cannot connect to server (null) (errno=15034)
> configure: error: Cannot locate condor_submit
> loading cache /dev/null
> checking for condor_submit... no
> Error locating condor commands, aborting!
> configure: warning: Using default of /etc for LSF_ENVDIR
> configure: error: LSF configuration /etc/lsf.conf not found.
> loading cache /dev/null
> Error locating LSF commands, aborting!
> loading cache ./config.cache
> creating ./config.status
> creating grid-cert-request-config
> creating grid-security-config
> Configuring config_crl ...
> Configuring config_replica_manager ...
> Configuring config_edgusers ...
> Configuring config_users ...
> Configuring config_rgma ...
> Configuring config_workload_manager_env ...
> Configuration Complete
> Configuring config_torque_client ...
> /opt/lcg/yaim/scripts/configure_WN_torque: line 13: /var/spool/pbs/server_name: No such file or directory
> No default server name.
> /usr/bin/pbsnodes: cannot connect to server , error=15034
> /opt/lcg/yaim/scripts/configure_WN_torque: line 60: /var/spool/pbs/mom_priv/config: No such file or directory
> Stopping pbs_mom: [FAILED]
> Starting pbs_mom: pbs_mom unable to go home: No such file or directory
> [FAILED]
> Configuration Complete
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> I think that this error gives a hint:
> >>> No default server name.
> >>> qstat: cannot connect to server (null) (errno=15034)
>
> Any ideas?
> I've been stuck on this for 2-3 days. This error appears on some WNs but not on all of them. On other WNs different errors appear, even though all of them use the same site-info.def config file.
>
> Any help, please?
>
> ./MS
>
--
Andreas Unterkircher
IT Department
CERN
CH-1211 Geneva 23
http://cern.ch/openlab