UKHEPGRID Archives

UKHEPGRID@JISCMAIL.AC.UK


Subject:

Minutes of the 405th GridPP PMB meeting

From:

David Britton <[log in to unmask]>

Reply-To:

David Britton <[log in to unmask]>

Date:

Mon, 15 Nov 2010 10:19:09 +0000

Content-Type:

multipart/mixed

Parts/Attachments:

text/plain (67 lines) , 101108.txt (251 lines)

Dear All,

Please find attached the GridPP Project Management Board Meeting minutes
for the 405th meeting.

   The latest minutes can be found each week in:

http://www.gridpp.ac.uk/php/pmb/minutes.php?latest

as well as being listed with other minutes at:

http://www.gridpp.ac.uk/php/pmb/minutes.php

Cheers, Dave.

-- 
________________________________________________________________________
Prof. David Britton                          GridPP Project Leader
Rm 480, Kelvin Building                      Telephone: +44 141 330 5454
School of Physics and Astronomy              Telefax: +44-141-330 5881
University of Glasgow                 EMail: [log in to unmask]
G12 8QQ, UK
________________________________________________________________________

GridPP PMB Minutes 405 (08.11.10)
=================================

Present: Dave Britton (Chair and minutes), Sarah Pearce, Tony Doyle,
Jeremy Coles, Andrew Sansum, Steve Lloyd, Roger Jones, Glenn Patrick,
John Gordon, Dave Colling.

Apologies: Dave Kelsey, Pete Clarke, Neil Geddes, Tony Cass, Robin
Middleton

1. GridPP26
============

DB reported that a provisional booking had been made at a hotel in Hove
for Mon 28th March to Thu 31st March. Rooms and breakfast were
£50/night, including doubles and twins for single occupancy.
Furthermore, conference rooms for the PMB and Storage meetings had been
offered at £50/day. The main meeting would be at the University of
Sussex, though there were issues about transport (may need to organise
some buses) and about power and wireless access at the conference
venue. Places for the conference dinner were being looked into. The
hotel was a possibility, but a restaurant may be better.

2. Installed Capacity
=======================

In advance of an MB discussion at CERN tomorrow, JG had raised the
issue of installed capacity reporting by the Tier-2s in the UK. JC had
circulated a spreadsheet comparing the gstat values; the actual
capacity from the latest quarterly report; the Tier2-GridPP MoU
numbers; and the wLCG pledge. Although there were some issues, the big
picture is fine: the per-Tier-2 gstat values satisfied the wLCG pledge,
except for 5% under from ScotGrid. This was not a real shortfall, as
the installed capacity is there, but was related to reporting. At the
individual site level, there were various discrepancies with the
GridPP MoU, either due to kit currently being installed or due to
DPM/Storm reporting issues.

TD noted that transcription errors were an issue - we need something
that automatically pulls figures out of the quarterly reports. JC did
have something, but the quarterly report format changes as extra
columns are added for specific issues each quarter.

3. Quarterly Reporting Status [SP]
==================================

After a flurry of activity this morning (presumably in response to the
agenda item!) all reports had been received except those from RM and
JG.

4. EPSRC call [SP/JG]
=====================

JG had circulated the EPSRC call, which was directed at EPSRC-funded
subjects so was not directly applicable to GridPP. Nevertheless, there
was some scope in the dissemination/outreach area for some kind of
joint proposal. Neasan had talked to Catherine (EGI) and would talk
with the NGS. There may be some scope for minor GridPP involvement
here.

5. Security Statement [DB]
==========================

JC had raised this issue last week; the following statement was
iterated upon over the last few days:

The GridPP PMB encourages sites to make decisions on security related
matters in accordance with their own site's security policy and the
common wLCG/EGI/GridPP security policy
(https://wiki.egi.eu/wiki/SPG:Documents), taking into account the
advice received from their own site security team, the information
provided by the GridPP security team, and in consultation with other
sites. The PMB acknowledges that security responses may differ from
site-to-site, reflecting different institutional policies, grid
architectures, configurations and installed packages.

Ultimately, each site must weigh the risk vs benefit of continuing to
provide a service. The risk analysis must consider the severity of the
threat; the time-frame of the exposure; the risk that lack of response
would cause to other sites, services and the infrastructure; and the
ability of a site to monitor and respond.

GridPP encourages sites to consider each incident objectively and does
not wish to influence, in either direction, the outcome of that
consideration. That is, GridPP does not encourage sites to take on a
higher level of risk than they feel comfortable with in order to
preserve service, nor does GridPP encourage sites to shut down service
prematurely to eliminate small risks.

STANDING ITEMS
==============

SI-1 Tier-1 Manager's Report
-----------------------------

Fabric:

1) FY09 procurements:
   - SL09 tranche completed acceptance test, very few problems
     encountered. It is likely that we will formally accept the
     hardware this week.

2) FY10 procurements
   - Disk tender - orders placed. Delivery late November.
   - CPU tender - orders placed. Delivery late November and December.
   - Various small system purchases being made.

3) Robotics
   An intervention was made on the tape robot on 2nd November to
   address an overheating problem. Unfortunately this was only
   partially successful and a further intervention will be required.

Service:

Overall a better week operationally than recent weeks.

1) Summary of operational issues is at:
   https://www.gridpp.ac.uk/wiki/Tier1_Operations_Report_2010-11-03

2) CASTOR
   On Monday (1st November) the Atlas SRMs crashed repeatedly. This
   was triggered by the use of a particular SRM command that checked
   the status of files recalled from tape. (There had been a change in
   the Atlas software that exposed this problem.) On Tuesday morning
   the Atlas SRMs were upgraded to fix the problem. So far this is
   looking good.

   There is now an SIR of the previous week's problems on the LHCB
   instance:
   https://www.gridpp.ac.uk/wiki/RAL_Tier1_Incident_20101026_LHCb_SRM_Bad_TURL_and_Outage

   A change will be scheduled to move all the disk servers to 64 bit
   in order to fix the checksum problem on the LHCB instance (pending
   since the upgrade). Our plan is to do the LHCB disk servers on
   Wednesday then open negotiations with ATLAS and CMS to schedule
   work on theirs.

   We plan to upgrade the LHCB SRMs (capacity not SRM release) in
   order to address possible performance issues. We are working on a
   schedule to carry out this work. Issues around the exact
   configuration to be deployed remain to be agreed but we hope to get
   them in this week before LHCB reprocessing starts.

   No problems have emerged from the Gen instance upgrade nor do we
   believe it is likely that recent problems with the LHCB SRMs relate
   to the upgrade. We have therefore concluded that it will be safe to
   proceed with upgrades to CMS and ATLAS. The schedule is now:

   # Upgrade CMS - Tuesday to Thursday 16-18 November.
   # Upgrade ATLAS - Monday to Wednesday 6-8 December.

SI-2 ATLAS weekly review & plans
---------------------------------

RJ reported that a data loss at Lancaster was worrying as it looked
very like an earlier incident at Glasgow. AS asked whether it related
to the generation of controllers that they were concerned about at
RAL. Andrew will follow up with Peter or Matt. Because of SRM worries
ATLAS didn't move to PD2P last week - the plan is to do so today.

SI-3 CMS weekly review & plans
-------------------------------

DC reported that CMS was fine; there had been some issues with Tier-1
SAM test due to load. The CASTOR upgrade had been moved back by a week
or so.

SI-4 LHCb weekly review & plans
--------------------------------

GP reported as follows:

1) RAL Tier 1. Reasonable running over last week (since problems of
   previous weekend) although load has been somewhat lower.
   Investigations continue, but a lot of SRM hits appear to come from
   FTS. Plan is to upgrade LHCb SRM machines to increase performance.
   LHCb reprocessing due to start mid-November - so aim to
   upgrade/test before this.

2) UK Tier 2. Some problems with shared area at Bristol and
   Birmingham. Issue with queue length parameters at UCL causing jobs
   to be killed.

SI-5 Production Manager's Report
---------------------------------

JC reported as follows:

1) There have been problems with the WMSes in the UK over the last
   week and this has been reflected in the Nagios test results. The
   underlying problem is not really understood at the moment (see for
   example the RAL ticket
   https://gus.fzk.de/ws/ticket_info.php?ticket=63912 and the Glasgow
   ticket https://gus.fzk.de/ws/ticket_info.php?ticket=63931). Jobs
   enter the waiting state and never complete. This has affected the
   SL test jobs too.

2) An estimate from Alastair Dewhurst suggests that there is of order
   33TB of "dark data" in ATLAS LOCALGROUPDISK. The current policy is
   to have 20% of a T2 disk allocated to this spacetoken. There is
   currently no deletion policy for this area, which is of concern to
   many sites - but ultimately an ATLAS problem!

3) We currently have a problem with our ROD-COD communication as our
   regional operations list is unsubscribed from the COD list due to
   email bounce problems (this has arisen due to the change from a
   CERN based list to an egi.eu one).

4) A number of GridPP sites are being picked up by Pakiti as having
   nodes still vulnerable to a recently announced vulnerability.

SI-6 LCG Management Board Report
---------------------------------

No meeting since last week.

SI-7 Dissemination Report
--------------------------

SP noted CHEP news item on GridPP website, and a note was being
written on lessons to be drawn from the stand and conference. SL noted
that the CERN@school VO had been established.

REVIEW OF ACTIONS
=================

398.12 TD/DB to make renewed efforts to engage someone at Glasgow to
tackle GridMon and to have access transferred in order to ensure the
instances were up-to-date and running ok - DB would insist on a
meeting with Mark Leese for a handover. To be done by the end of
GridPP3. It was decided that this action had been done by setting up
the meeting. Progress would now be monitored in the normal way
(quarterly reports).

402.1 JC/JG to address the issue of ticket workflow in the UK in
relation to NGS/NGI, to clarify that the support process is: tickets
were ending in dead ends. JC was meeting this pm to discuss.

402.2 JC/JG to provide status report on EGI/NGI Service Level
Agreements in the context of GridPP agreeing with the level of service
provided, ensuring that it is as GridPP requires. JC and JG were
meeting tomorrow to discuss. Some of this might have input into the
GridPP4 MoU.

404.4 DB to provide a draft statement for the Minutes which should
assist sites in dealing with expectations on them in relation to risk
strategies and work required. DB had done this.

ACTIONS AS OF 08.11.10
======================

384.6 TD/JC to take the lead on the 'GridPP to NGI' document that
addresses the forward-moving technical and other issues from a GridPP
perspective. JC was gathering info. It was noted that the recipient
was likely to be Dave Wallom. Deadline of late November for
discussion. This should be on the F2F Agenda for 9th December meeting.

397.1 AS to provide a high-level summary of the Disaster and Business
Continuity Plan for input to the next OC meeting - by November 15th
latest - and also provide a web link to further more detailed
documents.

398.6 DC to provide updated LondonGrid MoU. DC reported that the
meeting had happened, the LondonGrid MoU had been discussed, DC would
incorporate comments.

398.7 DK to check that all is up-to-date in terms of GridPP Security
Policies - email DB. If there are any issues, DK to let DB know. DK
reported that the GridPP Security Policy phase was ongoing at present,
however other policies had been approved by LCG. DK advised that EGI
formal signoff was awaited, then the GridPP pages would be updated.

398.10 RJ/Graeme Stewart to provide urls of the place(s) where info is
located re ATLAS site tests and measurements (so that sites understand
what they're being measured on).

398.13 DB to consider how to evolve the User Board into a useful
meeting in the future, DB to initiate in the timeframe between now and
GridPP4. This should be on the F2F Agenda for 9th December meeting.

400.2 JC to confirm that priorities have been documented for the major
experiments for recovering files from disk servers.

400.4 SL to co-ordinate changing the current GridPP MoU towards an MoU
for GridPP4.

402.1 JC/JG to address the issue of ticket workflow in the UK in
relation to NGS/NGI, to clarify that the support process is: tickets
were ending in dead ends.

402.2 JC/JG to provide status report on EGI/NGI Service Level
Agreements in the context of GridPP agreeing with the level of service
provided, ensuring that it is as GridPP requires.

403.2 RJ to broadcast the move to ATLAS adaptive data placement at
RAL, specifically for PD2P only, via ATLAS and GridPP standard
channels.

404.1 DB to send round requests for papers from the PMB for the
forthcoming OC meeting.

404.2 SP to circulate requirements relating to the OC meeting, for
discussion at the PMB on 15th November.

404.3 JC/JG to document the process for setting up a new VO in the UK
and make it available in the appropriate places.

The next PMB would take place on Monday 15th November at 12:55 pm.
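
Note on item 2 (Installed Capacity): TD asked for something that pulls
figures out of the quarterly reports automatically, and JC remarked
that the report format changes as extra columns are added. One way to
cope with both might be to read the columns by header name rather than
by position. The following is only a minimal Python sketch, assuming
the report can be exported as CSV and that each row carries a share of
the pledge. The column names (tier2, installed_tb, pledge_tb), the
function names and the sample figures are all hypothetical and do not
come from the actual reports or from JC's spreadsheet.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names: the real quarterly-report layout is not
# described in the minutes.
REQUIRED = ("tier2", "installed_tb", "pledge_tb")


def load_totals(report_text):
    """Sum installed and pledged capacity per Tier-2.

    Columns are selected by header name, so extra columns added in a
    given quarter are simply ignored.
    """
    reader = csv.DictReader(io.StringIO(report_text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"report is missing columns: {missing}")
    totals = defaultdict(lambda: {"installed_tb": 0.0, "pledge_tb": 0.0})
    for row in reader:
        entry = totals[row["tier2"]]
        entry["installed_tb"] += float(row["installed_tb"])
        entry["pledge_tb"] += float(row["pledge_tb"])
    return dict(totals)


def shortfalls(totals):
    """Return {tier2: percent under pledge} where installed < pledge."""
    under = {}
    for name, t in totals.items():
        if t["installed_tb"] < t["pledge_tb"]:
            gap = t["pledge_tb"] - t["installed_tb"]
            under[name] = round(100.0 * gap / t["pledge_tb"], 1)
    return under


# Made-up figures for illustration only. The "notes" column stands in
# for an extra column added in one particular quarter.
SAMPLE = """site,tier2,notes,installed_tb,pledge_tb
site-1,T2-A,new kit,100,80
site-2,T2-A,,50,60
site-3,T2-B,,95,100
"""

if __name__ == "__main__":
    print(shortfalls(load_totals(SAMPLE)))
```

Selecting by header name means a new column does not shift the figures
being read, and a renamed or dropped column fails loudly with an error
instead of silently producing wrong totals.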
