I can't speak to the UK, but ...

When it comes to monitoring, all I want is:
a) something that emails me automatically when something goes wrong and
b) that has a link for further information in it.
Basically nagios. Don't make me check a webpage; it never ever works, and I am speaking from dire experience here. And don't include a generic link either, where I then have to guess which of the n settings I have to check/change to figure out where the error comes from. CMS is as guilty of that as Atlas.
Try running tests on a site that is not a member of the experiment (i.e. a T3) and see if this site can understand the error and you'll do just fine.
Bonus points for a site being able to initiate a test (to check something has been fixed), but that's really a bonus.

Cheers,
Daniela

On 17 September 2013 14:01, Alessandra Forti <[log in to unmask]> wrote:

> I sent this to Jeremy thinking he would put it on the agenda but he told me
> he wasn't there either.
>
>
> -------- Original Message --------
> Subject: Re: Ops meeting @ 11am
> Date: Tue, 17 Sep 2013 10:01:05 +0100
> From: Alessandra Forti <[log in to unmask]> <[log in to unmask]>
> CC: Jeremy Coles <[log in to unmask]> <[log in to unmask]>
>
> Hi Jeremy,
>
> as there is the engineer to repair the central switch this morning I
> don't know if I can make it to the meeting or if I can be reliably there.
>
> SL6:
>
> * Bristol postponed
> * Glasgow and Lancaster are now in test with atlas queues
> * Manchester has brought forward the upgrade 2 weeks and we have
> declared a week downtime from the 30th of September until the 7th of
> October.
> * Birmingham is done.
>
> * There are problems with the java voms-proxy-info again affecting atlas
> jobs on sites that limit the memory to 3GB (few UK sites are doing
> that). Atlas is thinking of replacing voms-proxy-info with arcproxy. I'm
> giving a talk at the ADC meeting later today to decide what to do.
> https://ggus.eu/ws/ticket_info.php?ticket=97230
>
> Monitoring:
>
> I started a discussion about nagios on the sites monitoring
> consolidation list. Only Jeff Templon replied. We need a UK point of
> view. If sites show no interest I don't blame the monitoring people for
> going their way. If we don't speak they are right to take these decisions
> almost without consultation.
>
> cheers
> alessandra
>
>
> On 17/09/2013 09:38, Jeremy Coles wrote:
> > Dear All
> >
> > The agenda for today's ops meeting is available at http://indico.cern.ch/conferenceDisplay.py?confId=273350. The plan is to review the GDB updates from last week and check again on the SL6 status (especially to bring out any issues or concerns).
> >
> > Pete has kindly agreed to chair this week - though if Pete is unable to connect from RAL, please could someone else from the core ops team take control. As Matt mentioned in the tickets email, there will not be an ops meeting next week due to GridPP31 (https://www.gridpp.ac.uk/gridpp31/).
> >
> > For minutes the list is Mark=6 Wahid=8 Daniela=7 Kashif=7 Matt=7 Chris=7 Alessandra=7 Pete=7 Rob=7 Ewan=7 Brian=7.
> >
> > regards,
> > Jeremy
>
>
> --
> Facts aren't facts if they come from the wrong people. (Paul Krugman)


--
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/