Hi Chris,
Thanks for the response. In fact I had put in a GGUS ticket to the relevant
people ("GridView/Availabilities") and the response was:
"The CMS availability report is generated using Gridview numbers and not ACE
numbers (This is mentioned in the report itself). Gridview looks only at the
LCG CEs and not the CREAM CEs as ACE does. This is the reason for the
perceived discrepancy. The behaviour is correct."
It was after reading this I noticed the top of the availability report
clearly states it uses the Gridview availability.
I think the phrase in their response, "The behaviour is correct", may be
technically true, but it is not quite what is wanted.
I had picked our CMS availabilities for October as they clearly showed the
effect.
Regards
Gareth
-----Original Message-----
From: Testbed Support for GridPP member institutes
[mailto:[log in to unmask]] On Behalf Of Chris Brew
Sent: 29 November 2011 09:51
To: [log in to unmask]
Subject: Re: retiring lcg ce
Hi,
CMS certainly takes the CreamCEs into account but (currently, I think) still
has the logic
lcg-CE AND CreamCE AND SRMv2 instead of (lcg-CE OR CreamCE) AND SRMv2
So if you have an lcg-CE in GOCDB it has to be available, but as soon as you
remove it, it ceases to be a problem.
However the powers-that-be were quite surprised by that logic and are happy
to correct anyone hit by it until it's fixed.
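The AND/OR distinction above can be sketched as a small truth-table check. This is a hypothetical illustration of the two formulas quoted in the mail, not CMS's actual availability code; the function names are made up:

```python
# Hypothetical sketch of the two availability formulas discussed above.
# Each argument is True if that service's tests are passing, False if
# the tests fail or the service is registered but unavailable.

def current_logic(lcg_ce, cream_ce, srmv2):
    # Reported current CMS logic: every registered CE type must pass.
    return lcg_ce and cream_ce and srmv2

def intended_logic(lcg_ce, cream_ce, srmv2):
    # Intended logic: any one working CE, plus the SRM.
    return (lcg_ce or cream_ce) and srmv2

# A site draining its lcg-CE (still in GOCDB but failing tests) with a
# healthy CREAM CE and SRM:
current_logic(False, True, True)   # False - site counted as unavailable
intended_logic(False, True, True)  # True  - site counted as available
```

This is why removing the lcg-CE from GOCDB clears the problem: once the lcg-CE term is no longer part of the calculation, only the remaining services count.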
From my experience, I would remove it from GOCDB and the site-bdii at the
point you start draining it. The services running jobs on it should still be
able to find it, but the SAM tests and new jobs won't.
Yours,
Chris.
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Gareth Smith
> Sent: 29 November 2011 09:34
> To: [log in to unmask]
> Subject: Re: retiring lcg ce
>
> Hi,
>
> Just thought to add some of our experience from the Tier1 regarding the
> issues of availability when retiring the lcg CEs.
>
> Catalin (who has done the work) put the lcg CEs into a downtime in the
> GOC DB while they were drained out. He then removed them from the GOC
> DB.
>
> For the Tier1 the WLCG management board gets reports of our
> availabilities for the OPS VO and all four of the LHC experiment VOs.
> However, the two sets of availabilities are calculated differently.
> - The OPS VO availabilities use the newer "ACE" calculations which take
> into account CREAM CEs.
> - The LHC VOs use the older GridView availability which does not know
> about CREAM CEs.
> (This is stated at the top of the availability plots, though I had never
> noticed it until it was pointed out recently.)
> The upshot was that when we decommissioned our last LCG CE we took a
> big hit on our availabilities for the LHC VOs, but our OPS availability
> stayed up. I had a look at the CERN availabilities stored on their
> repository at:
> https://espace.cern.ch/WLCG-document-
> repository/ReliabilityAvailability/Forms/AllItems.aspx?RootFolder=%2FWL
> CG-document-repository%2FReliabilityAvailability&
> For Tier-2s this only seems to hold the OPS VO availabilities which use
> the ACE calculations. I don't know if they are the only ones you
> (Tier2s) need to worry about.
>
> One other note: the problem cleared up when the lcg CE was deleted from
> the GOC DB. The (older) 'Gridview' availability then had no lcg CEs to
> take into account and, as far as I can see, bases our availability on
> the SRM availability alone.
>
> Finally: once a service has been deleted from the GOC DB, its display
> drops all references to it from its history as well, i.e. you no longer
> see the downtimes declared for that service in the past. I could not
> see a way in the GOC DB to retire a service as of a particular date.
>
> Regards
> Gareth