Good. Well done.
From: Testbed Support for GridPP member institutes [mailto:[log in to unmask]] On Behalf Of Kashif Mohammad
Sent: 21 November 2012 16:56
To: [log in to unmask]
Subject: Re: Possible problem with Nagios
Hi John
I was keeping an eye on that and it didn’t update during the GOCDB problem. Fortunately we were using the Imperial top BDII in the Nagios configuration, so the Tier 1 outage didn’t have much effect as far as monitoring is concerned.
The only problem I have seen is that, apart from the RAL WMSs, the Glasgow WMSs were also in an error state, probably because they may be using the Tier 1 top BDII. So a lot of Nagios jobs stayed in a waiting state before I changed
the configuration to use just the Imperial WMS.
Cheers
Kashif
From: Testbed Support for GridPP member institutes [mailto:[log in to unmask]] On Behalf Of John Gordon
Sent: 21 November 2012 14:59
To: [log in to unmask]
Subject: Possible problem with Nagios
When GOCDB was operating from its failover site this morning, it looks like clients downloading information only got partial data. This resulted in many services not being monitored. GOCDB at RAL is now back
online and the full set of services is available for download. SAM has updated and everything is fine there.
I have looked at the UK MyEGI and while I see N/A bands recently, I don't see any unexpected red that would affect availability. We may have been lucky and the UK Nagios did not update in the bad window. If it
did, maybe Kashif could force an update instead of waiting six hours.
John
--
Scanned by iCritical.