On 04/15/2016 04:28 PM, Winnie Lacesso wrote:
> Is it common to NFS-mount /etc/grid-security/vomsdir?
The /etc/grid-security/vomsdir directory is referred to when security
decisions are made. It contains LSC (list of certificates) files, which
are used to verify that a given VOMS server is trusted. Since we use
ARGUS for centralised worker-node security, we don't need to share it.
Even when we used local worker-node security, we did not share the
vomsdir, although it should be OK to do so. We still use local node
security on DPM, I think, as it is not rigged up to work with ARGUS (I
don't think DPM _can_ be rigged up to use ARGUS, but I'm not sure about
that). In any case, we don't share vomsdir on DPM either, but I would
have thought it should be possible. It would also keep things
consistent, so perhaps it's a good idea.
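(For reference, an LSC file is just the DN chain of the VOMS server's
certificate, one DN per line (host cert first, then its issuing CA),
kept in a per-VO subdirectory and named after the server. A sketch with
a made-up VO, hostname and DNs:

# cat /etc/grid-security/vomsdir/somevo/voms.example.org.lsc
/DC=org/DC=example/CN=voms.example.org
/DC=org/DC=example/CN=Example CA

If the chain presented by the VOMS server doesn't match the DNs in the
file, the attribute verification fails.)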
> Apparently updating the voms package (doesn't happen often, but) wants to
> write to /etc/grid-security/vomsdir
Yes, I can see the voms package claims it:
# rpm -ql voms.x86_64
/etc/grid-security
/etc/grid-security/vomsdir
/usr/lib64/libvomsapi.so.1
/usr/lib64/libvomsapi.so.1.0.0
/usr/share/doc/voms-2.0.12
/usr/share/doc/voms-2.0.12/AUTHORS
/usr/share/doc/voms-2.0.12/LICENSE
/usr/share/voms
/usr/share/voms/vomses.template
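You can also ask rpm the other way round, i.e. which installed
package(s) claim that directory; on our nodes that should at least list
voms, though the exact output depends on what else is installed:

# rpm -qf /etc/grid-security/vomsdir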
> My colleague says he's seen the voms pkg update FAIL due to this.
> So when updating the voms pkg, /etc/grid-security/vomsdir has to be
> unmounted, the voms pkg updated, then remounted.
>
> What would happen to new jobs that arrive on WN in that time? Fail if
> /etc/grid-security/vomsdir is, erm, an empty mountpoint?
>
> Or, is it: "don't update voms pkg when WN running jobs - must be drained."
What follows is my vague opinion on how things work - I could be miles
off but here goes.
As far as I know, jobs are authenticated, authorised and mapped on the
condor head-node (the same applies for torque) prior to hitting the
worker-node. They arrive at the worker-node already mapped to the right
user. You can verify this by seeing that the condor_shadow processes on
the headnode run as the proper user, e.g. (output from something like
ps -ef | grep condor_shadow):
prdatl26 30722 19300 0 Apr17 ? 00:00:00 condor_shadow
So there is limited use for vomsdir on the worker node, especially when
using ARGUS. Indeed, at our site, the worker nodes are badly configured
with only a partial set of vomsdir entries (just the LHC experiments),
yet they still work fine for all VOs (the bad config is an artefact of
our puppet setup, which I must clean up some day!!!)
So, in summary and storage notwithstanding, my theory is: (a) only
glexec uses voms on the worker nodes; (b) if using ARGUS you don't need
a correct vomsdir on the worker nodes; and (c) when sharing vomsdir,
jobs arriving while it is unavailable don't matter, because they arrive
already running as the right user. So the only concern is a job already
running that tries to switch user via glexec while vomsdir is
unmounted. In that case, my theory is that the job fails verification
and dies. So my workaround would be: on each node, one at a time,
quickly unmount vomsdir, quickly update the voms package, quickly
remount vomsdir!
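As a rough sketch of that per-node sequence (assuming the mount is
defined in /etc/fstab so a bare mount works, and yum as the package
tool; adjust to taste):

# umount /etc/grid-security/vomsdir \
    && yum -y update voms \
    && mount /etc/grid-security/vomsdir

The && chaining means the update only runs if the unmount succeeded and
the remount only after a successful update; umount will also refuse
with "device is busy" if anything (e.g. an in-flight glexec call) still
has the directory open, which is probably the safe failure mode anyway.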
Cheers,
Ste
--
Steve Jones [log in to unmask]
Grid System Administrator office: 220
High Energy Physics Division tel (int): 43396
Oliver Lodge Laboratory tel (ext): +44 (0)151 794 3396
University of Liverpool http://www.liv.ac.uk/physics/hep/