Dear All,

It will be worth touching upon this discussion during our ops meeting this week, so I am forwarding this for background reading.

Regards,
Jeremy




Begin forwarded message:

From: Peter Solagna <[log in to unmask]>
Subject: [Noc-managers] Discussion in WLCG regarding the use of information system
Date: 29 May 2016 17:35:37 BST
To: NGI Operations Centre managers <[log in to unmask]>

Dear colleagues,

for the benefit of those who did not attend the OMB meeting, I would
like to share some more information about a topic that is currently
being discussed in the WLCG.

LHC VOs are not using the BDII in their application workflows, and some
sites reported in a survey that the configuration of the information
system of the storage elements is particularly problematic and
requires more effort than for other services to keep it properly
configured [1].

WLCG Operations are considering telling their sites that publishing
the SE in the BDII is no longer mandatory from the WLCG point of
view. For the moment this is not triggering any changes in the current
EGI policies.

In any case, this would apply *only* to SEs that support
only LHC VOs and that are WLCG Tier-1 or Tier-2.

EGI policies require that the SE is published in the BDII, first of
all because it is required by the monitoring infrastructure. Therefore,
if a site decides to stop publishing the SE in the BDII, it must
mark the SE as "local" in the GOCDB, thereby removing it
from the EGI production infrastructure.

At the moment we do not have an estimate of the number of SEs that
could be affected by the WLCG proposal, but considering that
it must be limited to storage elements supporting only LHC VOs,
I would expect the numbers to be small.

There are a number of consequences that come with the removal of a
storage element from EGI:
- The site will not get support from EGI to control the CA version,
either in upgrade campaigns or in security vulnerability checks. This
affects the WLCG VOs, who will continue to use the SE after it has
been removed from EGI.
- EGI helpdesk support (2nd level support) will find it more difficult
to follow up on tickets regarding problems with non-monitored services.
- For the NGI and the infrastructure, it reduces the total capacity
available and makes it more difficult to justify investments.

[1] The reported information system problems affect only
dCache in a specific configuration.

What are the next steps:

This proposal, which is internal to WLCG and limited to a specific set
of storage elements, has not yet been officially approved by WLCG. EGI
Operations has already raised concerns about the implications directly
with WLCG, and of course we reported the constraints of EGI policies.
In mid-June there will be another WLCG meeting where this will be
discussed further.

Currently no changes to EGI policies are planned. If that happens,
any change will have to go through OMB, as usual, following, for
example, a request from the NGIs.

My opinion would be to suggest that NGIs discourage their site managers
from removing any SE from the infrastructure as long as it is used
in production, regardless of the VOs that are using it. I believe that
the theoretical gain is very small compared to the loss.

From a formal point of view there are no EGI operational policies that
prevent a site manager from removing a service from production
(following the right process), but I think that NGIs, if they consider
it appropriate, can advise sites whether or not to follow the WLCG
guidelines. My understanding is that WLCG will allow sites to turn off
the information system without mandating it.

Apologies for the long email.

Please let us know if there are any questions.

Thanks
Regards
Peter

--
Peter Solagna
EGI.eu - Senior Operations Manager
email: [log in to unmask]
skype: peter.solagna.egi
Mobile: +31(0)630373070