On Wed, Jun 06, 2007 at 10:32:26AM +0300, Yiannis Ioannou wrote:
> Hello Nikola,
>
> From the previous posts to the list, I assume that you have used the
> following format for your computing element:
>
> ============================
> 15057:opssgm02:1520,1500:opssgm,ops:ops:sgm:
> 15058:opssgm03:1520,1500:opssgm,ops:ops:sgm:
> 15059:opssgm04:1520,1500:opssgm,ops:ops:sgm:
> 15060:opssgm05:1520,1500:opssgm,ops:ops:sgm:
> =============================
>
> If you have configured the computing element with a user configuration file
> containing entries like the above, it will not work. The reason is that the
> created users (opssgm001...) will have opssgm as their primary group, but
> for job submission to work their primary group must be ops. Only the
> primary group is checked against the queue's acl_groups attribute on the
> PBS server.
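
A minimal sketch of how to read the primary group off such an entry, assuming (as the explanation above implies) that the first name in the comma-separated group field becomes the account's primary group. The function name primary_group is made up for illustration.

```shell
# Hypothetical helper: print the primary group implied by one users.conf entry.
# Assumption: field 4 is the comma-separated group list, and its first name
# becomes the primary group of the created account.
primary_group() {
    printf '%s\n' "$1" | cut -d: -f4 | cut -d, -f1
}

# Entry in the format quoted above
primary_group '15060:opssgm05:1520,1500:opssgm,ops:ops:sgm:'
```

On an account that already exists, `id -gn opssgm001` reports the primary group actually in effect.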
>
> You can check this on the computing element:
>
> ->qmgr -c "p s" |grep acl
> set queue ops acl_group_enable = True
> set queue ops acl_groups = ops
>
> The above means that only users with primary group ops have access to the
> ops queue.
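
This check can also be scripted. A rough sketch, assuming the `qmgr -c "p s"` output has the `set queue <name> acl_groups = <group>` shape shown above, with `+=` lines for any further groups; queue_acl_groups is a hypothetical helper name.

```shell
# Hypothetical helper: list the groups allowed on a queue.
# Reads `qmgr -c "p s"` output on stdin; $1 is the queue name.
# Assumption: each group is the last word of a line of the form
#   set queue <name> acl_groups = <group>   (or "+=" for further groups)
queue_acl_groups() {
    awk -v q="$1" '$1 == "set" && $2 == "queue" && $3 == q && $4 == "acl_groups" { print $NF }'
}

# Usage on the computing element:
#   qmgr -c "p s" | queue_acl_groups ops
```

If the primary group of the pool account is not in the printed list, submission to that queue will be refused.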
>
> To resolve this issue, you have two options.
>
> First, make ops the primary group of the opssgm pool accounts:
> =====================================
> 15060:opssgm05:1500,1520:ops,opssgm:ops:sgm:
> =====================================
>
> Or second, add the opssgm group to the queue's acl_groups:
> ->qmgr -c "set queue ops acl_groups+=opssgm"
>
> as described in the Torque manual:
> http://www.clusterresources.com/torquedocs21/4.1queueconfig.shtml
Or third, which is the correct ("by the book") one: set up the
OPS_GROUP_ENABLE parameter correctly, along with all the other
VO_GROUP_ENABLE parameters, in site-info.def.
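
For illustration only, such a fragment might look like the sketch below. The values are assumptions, not taken from a working site; the exact list of groups (or VOMS FQANs) to put in each *_GROUP_ENABLE variable depends on your YAIM version, so check its documentation.

```shell
# site-info.def fragment (sketch; all values here are assumptions)
QUEUES="ops see"          # queue names as mentioned in this thread
OPS_GROUP_ENABLE="ops"    # who may use the ops queue
SEE_GROUP_ENABLE="see"    # who may use the see queue
```

After editing, the computing element has to be reconfigured with YAIM for the change to reach the queue ACLs.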
>
>
> best regards,
> Yiannis
>
>
> On 6/5/07, Maarten Litmaath wrote:
> >Nikolaos Vidiadakis wrote:
> >
> >> Good evening to all,
> >>
> >> I reconfigured everything but the problem remains. I added the
> >> appropriate pool accounts for the sgm and prd users in users.conf (as I
> >> described previously...) but the problem remains. Currently, the site
> >> suffers from Unspecified Grid Manager errors, but other queues (like
> >> see) work ok. At first, it seemed like an SSH CE - WN problem, but even
> >> after deleting and re-creating /etc/ssh/ssh_known_hosts the problem is
> >> still there.
> >
> >Did you try this:
> >
> > su - opssgm001 # whatever the account is called
> > echo date | qsub -q your_queue_for_ops
> >
> >Check this Wiki page:
> >
> > http://goc.grid.sinica.edu.tw/gocwiki/Unspecified_gridmanager_error
> >
--
Kyriakos Ginis
Software Engineering Laboratory
National Technical University of Athens