Greetings all,

Please see below for why most sites' CREAM-CEs are showing CRITICAL for APEL
not publishing.

Guess/expect: he needs a grid certificate in browser to view gridppnagios 
page quoted, & to join tb-support - confirmed?

If the latter is not true, could someone "in charge of" (?) tb-support 
contact him with help to join tb-support?

---------- Forwarded message ----------
Date: Tue, 4 Nov 2014 12:50:20 +0000
From: [log in to unmask]
To: [log in to unmask], [log in to unmask]
Cc: [log in to unmask], [log in to unmask]
Subject: RE: APEL problem upstream?

Hi Winnie,

The data containing the results of the Apel.Sync test is sent via the message
broker. The results are calculated by some older software which has a message
broker hard-coded into it. That message broker went down, so the results of
the Apel.Sync tests were not updated, leading to the warnings. Our newer
software library looks at the BDII for its list of brokers, so it is able to
retry several message brokers. We aim to rewrite this test software using our
newer library.
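
As a rough illustration of the difference described above (this is not the
APEL code, only a minimal sketch), the retry behaviour could look something
like the following. The function names, the exception class and the broker
host names are all hypothetical; in practice the broker list would be
discovered at run time, for example from the BDII, and `send` would be
whatever transport the library really uses.

```python
from typing import Callable, Iterable


class AllBrokersFailed(Exception):
    """Raised when no broker in the candidate list accepted the message."""


def send_with_failover(
    brokers: Iterable[str],
    send: Callable[[str, str], None],
    message: str,
) -> str:
    """Try each broker in turn; return the first one that accepts the message.

    `brokers` stands in for a list discovered at run time; `send` is a
    hypothetical transport function assumed to raise ConnectionError when
    a broker is unreachable.
    """
    errors = []
    for broker in brokers:
        try:
            send(broker, message)
            return broker
        except ConnectionError as exc:
            # Remember the failure and move on to the next candidate.
            errors.append(f"{broker}: {exc}")
    raise AllBrokersFailed("; ".join(errors) or "no brokers supplied")


def fake_send(broker: str, message: str) -> None:
    # Simulated transport for the example: the first broker is down.
    if broker == "broker-a.example.org":
        raise ConnectionError("connection refused")
```

With a single hard-coded broker the list has one entry, so one outage is
enough to stop the results being delivered; with several candidates the
message only fails to go out if every broker is unreachable.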

We should be more vigilant to problems with this message broker in the future
and proactively change its configuration. However, I am not sure that the
behaviour of this test - where a lack of data from the central repository
triggers an alarm at the sites - is correct. I will try to raise this.

The affected message broker is now up again. The test will re-run at around
4-5pm today and send updated Apel.Sync test data. I am unable to view the
link that you included - perhaps I should be able to? Finally, my previous
attempts to
join the TB-Support list have failed. Please could you forward this response
to the list so that other sites are updated?

Many thanks and I apologise for the inconvenience,
Stuart Pullinger
APEL Team Leader

-----Original Message-----
From: Winnie Lacesso [mailto:[log in to unmask]] 
Sent: 04 November 2014 08:59
To: Coveney, Adrian (STFC,RAL,SC)
Cc: Meredith, David (STFC,DL,SC); Pullinger, Stuart (STFC,RAL,SC)
Subject: RE: APEL problem upstream?


Bonjour!

Thanks for replying! If you look at this page :

https://gridppnagios.physics.ox.ac.uk/nagios/cgi-bin/status.cgi?hostgroup=node-APEL&style=overview

most UK CREAM-CEs show the exact same CRITICAL apel error - not all.

--