Ah... I can see John's already replied.
It could be some (obscure perhaps?) monitor check that was not reported
to us via a ticket. Suggest either:
a) monitors that detect problems should send alerts/tickets, or
b) sites should be informed of any monitor used for rel/av tests that
needs to be checked manually on a regular basis.
In any case, we were up and running - the estimates are likely to be
false ones.
Cheers,
Ste
On 2016-12-07 22:02, sjones wrote:
> Hi Jeremy,
>
> On 2016-12-07 16:52, Jeremy Coles wrote:
>> Please could: RHUL, Glasgow and Liverpool send me some brief text on
>> the difficulties encountered during the month.
>
> There were no significant difficulties in November at Liverpool
> regarding availability and reliability prior to a power cut in late
> November that wiped out our ARC/Condor CE, but we recovered within 6
> hours or so. Thus we did practically a full month of work at near 100%
> reliability & availability.
>
> My figures suggest we delivered 6.3 million HS06 hours of work for LHCb in
> Nov, which is indicative of (say) 99% uptime. It would not be possible
> if we were up only 33% of the time. I'll check this in the morning but
> this looks like a serious measurement error to me.
>
> To do that, I need to know: what is the basis of the 33% measurement,
> and who is responsible for making it?
>
> Cheers,
>
> Ste