Yaim installation/configuration of WN's doesn't seem to be working

The yaim installation/configuration of the worker nodes at BNL-LCG2 failed not only for the R-GMA component but also
for Torque. I ran the config_torque_client script by hand, but it looks like none of the pbs-related software that
should already have been installed is actually there. The complaints are mainly about missing pbs utilities such as
pbsnodes, or config files such as /var/spool/pbs/mom_priv/config. In fact, the only file that exists in /var/spool/pbs
after the yaim installation is server_name. Running "rpm -qa | grep -i pbs" on any of the worker nodes yields
nothing, while the same command on the CE shows four pbs-related RPMs:

[root@lcg-ce01 yaim]# rpm -qa | grep pbs
edg-pbs-utils-1.0.8-1_sl3
vdt_globus_jobmanager_pbs-VDT1.2.0rh9-1
lcg-pbs-utils-1.0.0-1
lcg-info-dynamic-pbs-1.0.5-1_sl3
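
For anyone who wants to repeat the file check on their own worker nodes, here is a minimal sketch. The three paths are the ones the script complains about in the output further down; the function name missing_paths is just something made up for illustration, and the list is probably not everything a working Torque client needs.

```shell
#!/bin/sh
# Sketch only: print every path given as an argument that does not exist.
# missing_paths is a hypothetical helper name, not part of yaim or Torque.
missing_paths() {
    for p in "$@"; do
        [ -e "$p" ] || echo "$p"
    done
}

# Paths taken from the config_torque_client error messages in this post.
missing_paths \
    /usr/bin/pbsnodes \
    /var/spool/pbs/mom_priv/config \
    /etc/rc.d/init.d/pbs_mom
```

On a node where the Torque client pieces are in place, this should print nothing.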

Has anybody else experienced this problem with yaim when installing the WNs? Also, which pbs-related RPMs need
to be installed, and where can they be downloaded from? Thanks.

Edward Nicolescu
BNL-LCG2

PS: Below you will find the output from a "verbose" run of config_torque_client


[root@lcg-wn01 yaim]# /tmp/config_torque_client
+ export CE_HOST=lcg-ce01.usatlas.bnl.gov
+ CE_HOST=lcg-ce01.usatlas.bnl.gov
+ export SE_HOST=lcg-se01.usatlas.bnl.gov
+ SE_HOST=lcg-se01.usatlas.bnl.gov
+ export INSTALL_ROOT=/opt
+ INSTALL_ROOT=/opt
++ eval echo '$CE_HOST'
+++ echo lcg-ce01.usatlas.bnl.gov
+ '[' xlcg-ce01.usatlas.bnl.gov = x ']'
++ eval echo '$SE_HOST'
+++ echo lcg-se01.usatlas.bnl.gov
+ '[' xlcg-se01.usatlas.bnl.gov = x ']'
+ INSTALL_ROOT=/opt
+ echo lcg-ce01.usatlas.bnl.gov
++ grep pbs_mom /etc/services
+ '[' 'xpbs_mom     15002/tcp' = x ']'
++ grep pbs_remom /etc/services
++ grep tcp
+ '[' 'xpbs_remom     15003/tcp' = x ']'
++ grep pbs_remom /etc/services
++ grep udp
+ '[' 'xpbs_remom     15003/udp' = x ']'
+ cat
+ cat
+ '[' -f /etc/ssh/ssh_known_hosts ']'
+ grep -v lcg-ce01.usatlas.bnl.gov /etc/ssh/ssh_known_hosts
+ /usr/bin/ssh-keyscan -t rsa lcg-ce01.usatlas.bnl.gov
+ '[' 0 = 0 ']'
+ mv /etc/ssh/ssh_known_hosts.tmp /etc/ssh/ssh_known_hosts
+ '[' -f /etc/ssh/ssh_known_hosts ']'
+ grep -v lcg-se01.usatlas.bnl.gov /etc/ssh/ssh_known_hosts
+ /usr/bin/ssh-keyscan -t rsa lcg-se01.usatlas.bnl.gov
+ '[' 0 = 0 ']'
+ mv /etc/ssh/ssh_known_hosts.tmp /etc/ssh/ssh_known_hosts
+ /opt/edg/sbin/edg-pbs-knownhosts
Can't exec "/usr/bin/pbsnodes": No such file or directory at /opt/edg/sbin/edg-pbs-knownhosts line 32.
Could note open /usr/bin/pbsnodes -a pipe: No such file or directory
+ cat
/tmp/config_torque_client: line 64: /var/spool/pbs/mom_priv/config: No such file or directory
+ /sbin/chkconfig pbs_mom on
error reading information on service pbs_mom: No such file or directory
+ /etc/rc.d/init.d/pbs_mom stop
/tmp/config_torque_client: line 74: /etc/rc.d/init.d/pbs_mom: No such file or directory
+ sleep 1
+ /etc/rc.d/init.d/pbs_mom start
/tmp/config_torque_client: line 76: /etc/rc.d/init.d/pbs_mom: No such file or directory
+ '[' '!' -f /var/spool/cron/root ']'
+ grep edg-pbs-knownhosts /var/spool/cron/root
+ cat
+ chmod +x /etc/cron.d/edg-pbs-knownhosts
+ grep mom_logs /var/spool/cron/root
+ cat
+ chmod +x /etc/cron.d/mom_logs