On Wed, Sep 28, 2005 at 11:34:19AM +0100 or thereabouts, Morag Burgon-Lyon wrote:
> Hi,
>
> Steve is away on holiday this week, so I'm holding the fort at Scotgrid Edinburgh. I added in two new worker nodes (wn0 and wn4) yesterday by altering the wn-list.conf and running the yaim script with WN_torque on the new worker nodes, and CE_torque on the ce. I amended the processor numbers, copied across the maui config and restarted maui, pbs_server and pbs_mom on ce.
>
> However, the queues haven't been filling back up and new jobs appear briefly and then disappear. Also qsub doesn't work from any node (including the existing nodes that worked fine before adding the new ones, such as wn2):
Hi Morag,
The problem sounds like unchallenged (passwordless) ssh from the WN to the CE not working.
See:
http://goc.grid.sinica.edu.tw/gocwiki/ssh_problem_from_WN_to_CE
To test, log in to one of your new WNs and run:
# su - dteam050
dteam050> ssh yource.ed.ac.uk
Use the full hostname. Does it work unchallenged?
Also, if this is the cause, there will be some files in
/var/spool/pbs/undelivered on the affected WNs.
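The two checks above could be scripted roughly as follows. This is only a sketch: the function names check_ssh and count_undelivered are made up for illustration, yource.ed.ac.uk is the same placeholder as above, and it assumes an OpenSSH client, where BatchMode=yes makes ssh fail rather than prompt.

```shell
#!/bin/sh
# Sketch only: CE_HOST is a placeholder, substitute the full hostname of your CE.
CE_HOST="${CE_HOST:-yource.ed.ac.uk}"
# Spool directory mentioned above; override if your Torque install differs.
SPOOL_DIR="${SPOOL_DIR:-/var/spool/pbs/undelivered}"

# Returns 0 only if ssh to the given host works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password,
# passphrase, or host key confirmation.
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Prints the number of entries in the given directory (0 if it does not exist).
count_undelivered() {
    if [ -d "$1" ]; then
        ls -A "$1" | wc -l | tr -d ' '
    else
        echo 0
    fi
}
```

Run it as a pool account (e.g. after su - dteam050) on each new WN: check_ssh "$CE_HOST" should return 0, and count_undelivered "$SPOOL_DIR" should print 0 on a healthy node.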
Steve
>
> [dteam001@wn2 dteam001]$ qsub qsubtest.sh
>
> qsub: Bad UID for job execution
>
> [dteam001@wn2 dteam001]$
>
> I've compared the /etc/passwd files for ce and the old worker nodes, and dteam001 has the same uid and gid in both, however I noticed that alice001 was different (ce looked wrong as it has a uid of 10000). Also, the UIDs and GIDs in the users.conf file are different to the ones in /etc/passwd on all nodes and ce. The upgrade was done using lcg-yaim-2.6.0-9.
>
> Any suggestions?
>
> Thanks,
>
> Mòrag
>
--
Steve Traylen
http://www.gridpp.ac.uk/