Hi Gareth,

On 05/02/14 13:43, Gareth Roy wrote:
> This may be a dumb suggestion but just throwing out ideas… if you’ve
> already checked this then feel free to ignore. If your local tests are
> passing and the infrastructure looks right then it _could_ be a problem
> with the Argus policy for glexec (I’ve just seen the same error reported
> at another site when we were trying to get glexec up and running).
>
> If you do a “pap-admin lp” on your Argus server you should get a list of
> all the currently viable policies. If you check in the section that is
> headed:
>
> resource "http://authz-interop.org/xacml/resource/resource-type/wn" {
>      obligation "http://glite.org/xacml/obligation/local-environment-map" {
>      }
>
> You’re looking for something that looks like:
>
>          rule permit { pfqan="/ops/Role=pilot/Capability=NULL" }
>          rule permit { pfqan="/ops/Role=pilot" }
>          rule permit { pfqan="/ops/Role=NULL/Capability=NULL" }
>          rule permit { pfqan="/ops" }
>
> Another example and complete Argus instructions can be found here
> https://www.gridpp.ac.uk/wiki/Argus_Server

Our Argus policy for this is:
resource "http://authz-interop.org/xacml/resource/resource-type/wn" {
     obligation "http://glite.org/xacml/obligation/local-environment-map" {
     }

     action "http://glite.org/xacml/action/execute" {
         rule permit { pfqan="/atlas/Role=pilot" }
         rule permit { pfqan="/atlas/Role=lcgadmin" }
         rule permit { pfqan="/atlas/Role=production" }
         rule permit { pfqan="/atlas" }
         rule permit { pfqan="/ops/Role=pilot" }
         rule permit { pfqan="/ops/Role=lcgadmin" }
         rule permit { pfqan="/ops" }
         rule permit { vo="dteam" }
     }
}

This mostly matches the wiki, but I don't have the 'Capability=NULL'
part. What does that part relate to, and should I add it?
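
In case it helps with the comparison, here is a minimal sketch in Python (not something I have run against the real server) that pulls the pfqan values out of policy text like the above and reports which of a list of wanted FQANs have no rule. The function names and the WANTED list are purely illustrative, and POLICY is just a trimmed copy of the ops rules from our policy. It does a plain string comparison only and assumes nothing about how Argus itself treats Capability=NULL.

```python
import re

# Matches lines such as: rule permit { pfqan="/ops/Role=pilot" }
RULE_RE = re.compile(r'rule\s+permit\s*\{\s*pfqan\s*=\s*"([^"]+)"\s*\}')


def permitted_pfqans(policy_text):
    """Return the set of pfqan strings that have a permit rule."""
    return set(RULE_RE.findall(policy_text))


def missing_pfqans(policy_text, wanted):
    """Return the wanted FQANs with no exactly matching permit rule, in order."""
    present = permitted_pfqans(policy_text)
    return [fqan for fqan in wanted if fqan not in present]


# Trimmed copy of the policy text (hypothetical input; in practice this
# would be the saved output of "pap-admin lp").
POLICY = '''
action "http://glite.org/xacml/action/execute" {
    rule permit { pfqan="/ops/Role=pilot" }
    rule permit { pfqan="/ops/Role=lcgadmin" }
    rule permit { pfqan="/ops" }
    rule permit { vo="dteam" }
}
'''

# The four forms from the suggested rules (illustrative list).
WANTED = [
    "/ops/Role=pilot/Capability=NULL",
    "/ops/Role=pilot",
    "/ops/Role=NULL/Capability=NULL",
    "/ops",
]

if __name__ == "__main__":
    for fqan in missing_pfqans(POLICY, WANTED):
        print("no rule for", fqan)
```

Rules that use vo=... rather than pfqan=... are deliberately ignored by the pattern, so this only compares the FQAN-based rules.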

> I note you were looking for NOT AUTHORIZED in the logs and not seeing
> anything but I’ve had Argus fail _slightly_ differently in the case of
> glexec. If you look in the audit log you might see things that instead
> of Permit are all Nulls, or NotApplicable rather than seeing a NOT
> AUTHORIZED in the process.log or in /var/log/messages. It’s not helpful
> but indicative of the authz process failing to find something.

So on the Argus server everything is showing as Permit at the moment, 
even around the time when the Nagios tests are failing.

When I ran a manual test with my own cert as a member of the dteam VO,
Argus would not allow me to run glexec until I added the rule above
permitting vo="dteam" -- so I think the WNs and the Argus server are
working in at least *some* cases.

> p.s Another thought is I see in your last set of emails you made sure
> pilops mappings were right on the worker nodes, did the same thing
> happen on the Argus server? It uses the grid map files to know which
> pool accounts to map DN’s to so they need to be available on the Argus
> server as well… something that always bites me :)

I did need to update the worker nodes to have the same
/etc/yaim/users.conf and groups.conf as the Argus and Cream servers - is
that what you mean here? The /etc/grid-security/grid-mapfile is present
on the WNs, Argus and Cream, and is identical on all of them.
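
For reference, a check that the copies are identical can be sketched as below. This is only a sketch: it assumes the grid-mapfile from each host has already been copied to one machine, and the function names are made up for illustration. It simply groups the files by SHA-256 digest.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def group_by_digest(paths):
    """Map each digest to the list of paths whose contents produce it."""
    groups = {}
    for path in paths:
        groups.setdefault(file_digest(path), []).append(str(path))
    return groups


def all_identical(paths):
    """True if every file has the same contents (or the list is empty)."""
    return len(group_by_digest(paths)) <= 1
```

If all_identical() returns False, the output of group_by_digest() shows which hosts' copies differ from the rest.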

Thanks!
Matt