
UKHEPGRID@JISCMAIL.AC.UK


Subject: Minutes of the 273rd GridPP PMB meeting
From: Tony Doyle <[log in to unmask]>
Reply-To: Tony Doyle <[log in to unmask]>
Date: Thu, 20 Sep 2007 11:58:06 +0100
Content-Type: MULTIPART/MIXED
Parts/Attachments: TEXT/PLAIN (20 lines), 070917.txt (1 lines)

Dear All,

     Please find attached the latest weekly GridPP Project Management 
Board Meeting minutes. The latest minutes can be found each week in:

http://www.gridpp.ac.uk/php/pmb/minutes.php?latest

as well as being listed with other minutes at:

http://www.gridpp.ac.uk/php/pmb/minutes.php

Cheers, Tony
________________________________________________________________________
Prof. A T Doyle, FInstP FRSE                       GridPP Project Leader
Rm 478, Kelvin Building                      Telephone: +44-141-330 5899
Dept of Physics and Astronomy                  Telefax: +44-141-330 5881
University of Glasgow                   EMail: [log in to unmask]
G12 8QQ, UK                 Web: http://ppewww.physics.gla.ac.uk/~doyle/
________________________________________________________________________


GridPP PMB Minutes 273 - 17th September 2007
============================================

Present: Tony Doyle, Sarah Pearce, Roger Jones, Stephen Burke, David Britton, Dave Newbold, Steve Lloyd, Tony Cass, Robin Middleton, John Gordon, Andrew Sansum, Norman McCubbin, Suzanne Scott (Minutes)

Apologies: David Kelsey, Jeremy Coles, Peter Clarke, Glenn Patrick, Neil Geddes

1. CASTOR Report to STFC
=========================

It was reported that STFC had requested a CASTOR status report prior to the OC. This related to the idea of a 'Plan B' and who would pick up the tab in the relationship with CERN. It was asked if the status of a 'Plan B' could be expanded by AS - a document had been circulated some time ago. It was agreed that wording was crucial - the tenor of the report had to be accurate. It was suggested that the question was 'funding' orientated, with the assumption that if there was a problem it was solvable by extra funding. TD noted that this was not realistic.

AS asked how much effort should be included towards ensuring that CASTOR ran ok. NM noted that from the ATLAS point of view at RAL, things were more positive with CASTOR now than they had been around three months ago. TD advised that this was not apparent from the document as it stood, which dealt with earlier problems. DN noted that the CMS picture was also not all bad.

It was agreed that the PMB would not respond immediately to this request. The preliminary response should be that all documents are currently being prepared for the OC and statements would be available from October 1st. The content of the preliminary response should state that the situation is a rapidly-changing one and we would prefer to provide the most up-to-date information come the time. AS agreed that he could put an Appendix in the Tier-1 Report which addressed CASTOR specifically, detailing the situation as at end Sep/beg Oct. AS asked what the slant of the document should be?

JG noted that the major point to emphasise was the upgrade, delivered to CMS and ATLAS, but with problems outstanding. TD suggested that the progress over the summer should be provided and where things stood by the end of September - it was important to provide reassurance to STFC and the OC that progress was being made and problems addressed successfully. TD noted that this time the Experiments had been directly invited separately, viz RJ, DN, Raja Nandakumar, and Dave Evans. NM noted that by that time we would have an official statement.

DB advised that Kim Dollimore (STFC) had asked for an update on CASTOR and software issues, and that a response was requested within the next week. NM noted that the text should be signed-off by the Experiments. DB noted that a brief response stating a changing situation and putting a detailed response into abeyance until end September might not be acceptable to STFC. DN noted that there existed a normal route for information to be disseminated and he was reluctant to commit himself at this stage. JG advised a reassuring statement. DB countered that it was problematic to provide a bland reassuring statement at this stage if things were likely to become worse - rather, he would suggest saying that there are some problems currently but these are being worked on, and that there had been a marked improvement.

TD noted that for CMS things were ok, for ATLAS less so, and LHCb had not really started. DN noted that CMS have problems at present but have made great progress, it was better to say that things were working, and wait a week or so to ensure that this was demonstrably working in CSA'07. DB summarised that there are, have been, and will be problems with CASTOR, the question was whether or not things were improving. DN advised two metrics: 1) a trajectory to run the organisation; 2) the OC were there to review the project via milestones, and at present we were not meeting them, as noted in the Project Map.

It was agreed that DB would send an email to Kim Dollimore in the light of the above discussion. [note: e-mail sent shortly after the meeting]

2. MoU/SLA Status
==================

TD noted that we have to ensure the elements of this are all present - what needs to be added, especially in Ops? TD and SL had reviewed page 1 - it was asked whether GOC should be part of the MoU. On p2, as it was, it was noted that the WLCG tag doesn't affect GridPP, which refers to LCG instead as it incorporates everything that is required. JG agreed to send a phrase to TD in relation to WLCG. For p3, it was noted that the MoU would be in force until 2011.

For No3 (Deployment Board), it was agreed that the DB would be the organisation through which this document passes. The Tier-1 and Tier-2 representatives needed to be checked, there were five of them. The phrase 'Tier Centre' was queried - this was a combination of Tier-1 and Tier-2, but needs to be more clearly defined as the phrase doesn't mean a real organisation. It was suggested that the following wording might be suitable: '... each Tier shall nominate ...'.

It was agreed that No4 (Hardware Support) was fine. For No5 (Resources) the plan was to have a table at the end which would define these. For No6 (Availability), the service level agreement exceptions were the VO boxes.

For No7 (Monitoring), the APEL accounting will monitor in regard to software. JG will provide suitable wording to TD - it was understood that the sites couldn't be told what they should be monitoring, a wider statement was required. DB noted that the wording should state that the Tier Centres agree to provide monitoring information - this is a more general statement. TD agreed - appropriate wording would note that the Tiers simply agree to monitor information.

For No8 (Target Shares), TD noted that we need to ensure that site-specific target shares and storage add-up to what is intended and aggregated across the UK.

For No9 (Software), TD noted that the DB needs to be at the centre of this - a set of grid software releases must be deployed. TD noted that the Tiers should agree to implement and update software. The wording as it stood was fine.

For No10 (Network Connectivity) it was noted that the GridMon boxes may not be supported later on. In 10.2 it was agreed to omit the word 'software'. There was no change to No11 (Security). For No12 (Management), the words 'and technical' before meetings (in 12.1) could be added, but this wasn't a problem area. The reference to the Deployment Board should be removed from 12.3 as it is a duplication. There was no change to No13 (Extension). In No14 (Termination) there were minor corrections.

Regarding Appendix A, the SLA was not about individual services, but rather related to high-level services. The wording had been taken from the CERN wording, but could be changed if other wording was felt more appropriate. It was agreed to be as specific as possible with regard to the Experiments, and refer to them explicitly where appropriate.

Regarding Appendix B, a statement was required regarding when Operations are available and what they will do. Did this relate to the GridPP MoU or the EGEE/LCG environment? There was also no information included relating to helpdesks or call centres - it was felt better not to list them, just advise that they are available.

Regarding Appendix C and Staffing, it was noted that the Tier-1 will provide high-level service, broken down into units, described by whom and on what basis the services will be available. There were 2 FTE incorporated into the Incident Response Unit (IRU). All statements here can be changed via the Deployment Board. It was noted that Appendices had not been included within the body of the document but were highlighted in yellow. NM and TD would confer regarding wording - relating to the Hardware Support Staff.

On p19 there was a statement of provision over the three-year term - this needs to be indicative. TD advised that for this process it was preferable to keep the Tier-1 and Tier-2 together. Later on, the Appendices can be separated. It was felt preferable to change 'Appendix' to 'Annexe'. It was agreed that when the MoU is to be circulated, it will be submitted as two documents: the MoU and MoU Annexe. On p19 regarding Hardware, it was asked when within the year should resources be available to meet the MoU? Following discussion the PMB agreed to refer to April.

It was agreed that NM would assist with the wording of the Appendices (Annexes). JG would provide updates regarding Operations. TD would circulate the finished version of the MoU mid-week to the existing Tier-1 and Tier-2 Boards.

NM left the meeting at this point.

3. Preparations for the OC
===========================

It was reported that documents need to be prepared by Monday 1st October, in order to submit all docs including Exec Summary by Thursday 4th October. Update was as follows:

#115 Executive Summary [PMB] (inc. summary of available performance metrics) - ongoing.
#116 ProjectMap Report [DB] (up to 07Q2) - ongoing.
#117 Resource Report [DB] (transition) - ongoing.
#118 LCG Status Report [TC] - ongoing.
#119 EGEE Report [RM/JG] - RM reported that a report on EGEEIII and EGEE I plus a status report on SA1 were now underway.
#120 Deployment Report [DK] - ongoing.
#121 MSN Report [RM] - ongoing.
#122 Applications Report [RJ] - RJ reported that after speaking with DB in Nottingham, he would be doing a continuity report (based on the Quarterly Report sent to Dave) and this would be included in the credibility gap document.
#123 User Board Report [GP] - TD reported that a UB version had been circulated by GP, but inputs were required. DN and RJ to update their sections.
#124 Tier 1/A Report [AS] - ongoing.
#125 Tier 2 Report [SL] - ongoing.
#126 Dissemination Report [SP] - ongoing.
----requested----
#127 GridPP3 Plan [DB] - ongoing.
#128 Credibility Gap [DB] - RJ noted that he had relevant inputs relating to the credibility gap and shortfalls - these would be available shortly.
#129 Disaster/Scenario Planning (inc. OPN network example) [TD] - input had been received from DB and a contacts list was currently being updated. TD asked that SB take the token for this document during this week - input from SB was relevant to the applications interfaces side. SB said he would review the current version.

4. Review of Risk Register
===========================

TD noted that the Risk Register required to be reviewed. The Project Map on the webpage was used. DB noted that CASTOR needed to be explicitly flagged. The proposal was for a new assessment of R5 applications and a red to flag CASTOR - this would need added first and then other reds on the list would be dealt with.

DB noted that risks 1-4 did not require much discussion. R3 however, with respect to minimal contingency - a longer term view was required and a flag to reflect the reduced contingency: likelihood is 4, impact 2 = 8 (not too critical yet).

R5 on CASTOR - this should go under applications and problems with the Grid: across the board is 3 (over the next 6 months), impact is 3 = 12. If it definitely fails during the next six months, we wouldn't be pursuing it at this point. TD noted the problem with R5 was about gLite. DB agreed - R5 and R10 should reflect CASTOR and gLite.

Were there any changes to R9 re Scientific Linux? RM asked where we capture the dependability of SL4? This involved a different interpretation of risk.

Did the PMB want to change the assessment of gLite in R10? DN advised that it was still quite a high risk: 3 and 2. TD noted that the impact had gone down, as evidenced by the service challenges. DN agreed, but highlighted problems with FTS and SRMs, although the risk had been ameliorated by way of backup plans - they had responded to the risk.

DB summarised that there had been nothing contentious down to R15. R15 related to the risk going up regarding maintenance of software/documentation and design/shared knowledge. DB noted that transition planning had been done and the risk assessed, but complacency was to be avoided. TD advised that if it remains a problem, increase it: likelihood 4, impact 3. DB noted that in relation to middleware, there were funding concerns, and this needs to be rationalised. R15 for MSN is 2 and 3 = 6.

R28 referring to work in other countries, was passed-over for the moment. DB noted that for R40, a lack of future funding would reflect R15 - it was agreed to leave the application ones at 5 and reduce the middleware to 3 and 2, however if the PMB were not happy about the current status, leave it at 3 and 3 - the latter was agreed.

5. AOCB
========

Regarding grants, it was reported that 4 remained to be processed despite the funding being required this month.

TD reported that there had been a formal handover within STFC from Deborah Millar to Trish Mullins - this had now happened, and emails should be sent to Trish Mullins from now on. The PMB pages refer to Trish, as she is now the Programme Manager. Deborah Millar was thanked for all of her support of GridPP over the years.

RM reported that the EGEE III proposal was now in its final form and was due to be submitted on Thursday.

STANDING ITEMS
==============

SI-1 Dissemination Officer's Report
------------------------------------

SP reported that the AHM had gone well. DB reported that he had attended but that there had been confusion regarding his talk and timings, through lack of communication; it had been fairly quiet generally. SP agreed that there was a general lack of interest in attending and numbers were down at the stand. Next year there would be a different location (Edinburgh) which might help attendance.

SP reported that the BA day had gone well - DB had manned the demo stand on the Wed afternoon - there were 4 or 5 people present from GridPP and they had been fully occupied individually talking to people for over 2 hours. DB reported that on Friday he had given a talk (within the LHC section) to about 400 people. He had been contacted by BBC York but had been unable to re-establish contact with them.

RJ reported that the usability session at the AHM had been interesting and had worked well - there were still things we could do within these sessions. He had been disappointed at the lack of engagement at the AHM, there had been a noticeable lack of input.

SP reported that Neasan O'Neill was working on a news item re the AHM and BA festival. SP was doing a GridTalk proposal for Thursday as prep for EGEE '07.

SI-2 Tier-1 Manager's Report
-----------------------------

No report this week, see CASTOR discussion above.

SI-3 Production Manager's Report
---------------------------------

SB reported that Edinburgh were having difficulties with GPFS and SL4 upgrade. Greig Cowan had suggested a storage workshop, this had come up again at CHEP.

SI-4 LCG Management Board Report
---------------------------------

TD reported that he had been preparing a Disaster Scenario Planning Report and had been unable to attend the meeting, but the Minutes were available at https://cern.ch/twiki/bin/view/LCG/MbMeetingsMinutes

SI-5 Documentation Officer's Report
------------------------------------

There were no items.

REVIEW OF ACTIONS
=================

261.13 DK to progress receipt of ScotGrid feedback. This was now done; info was to be updated on the website. TD would forward to SL the email he had received that had gone from JG to GS.
269.4 GP to circulate an email once the situation with LHCb banning sites who have migrated to SL4, was resolved. GP confirmed following the meeting that this was now done. An email had been received from Joel Closier - this was circulated to the PMB.
272.2 SL & TD to draft an initial document on MoU/SLA for circulation (not a PMB Doc). Done, item closed.
272.7 SL to remind RJ and DN that agreement is awaited from London and NorthGrid regarding extension of Tier-2 MoUs for 7-month period. This was now done but there was no outcome. Can RJ sign-up to the agreement? SL is awaiting his reply.

AMBLESIDE ACTIONS
=================

DTeam/PMB
(1. Deployment Board to meet formally in GridPP3 as described.) Ongoing.
2. Regarding site availability, SL to plot his data and JC to highlight it via wiki. This should now be a PMB action - transferred to PMB listing.
3. Regarding installation of software (and Condor), PMB to draft a letter to Cambridge giving a list of problems and ask about resolutions. TD to do this - he would do this after the OC meeting.
(4. JJ to document pros/cons of resilient dCache.) Ongoing.

Discussion Session:
Action: Mingchao Ma to send extant security policies to SL for discussion at the Tier-2 Board. These need to be re-approved prior to being available as links for users on the website. SL to discuss security policies with DK - DK needs to clarify the status of these.

ACTIONS AS AT 17.09.07
======================

250.4 RJ, DN, GP, TD to meet to integrate experiment requirements of Tier-2s going to Tier-1 - sites are aware of requirements but discussion still has to take place. It was noted that this issue is not high priority. A meeting is to take place with Barney Garrett.
252.3 RM has now received inputs for his one-page summary regarding the transition of each of the existing Middleware areas from GridPP2 to GridPP2+ to GridPP3 - this to go to DB. This was to be done by Friday GridPP2+ 8th June but is still ongoing. This is now urgent.
261.4 DB to look through the input in detail in relation to GGUS problems. DB currently working on grants issues and quarterly reporting - this would be dealt with as soon as possible.
263.2 JG to further investigate the lack of ability to pass job requirements to the batch system and report-back (Tier-2 review issue). JG will raise this through the GDB.
267.3 SP to begin organising metrics for GridPP3, beginning with update and review of existing milestones and metrics, plus review of WLCG requirements. SP to co-ordinate with DB, AS and JC. It was agreed that the high-level view should be prepared for the OC relating to what has been agreed, and how we are working towards this - SP to present a few slides.
268.1 RJ to prepare a one-page table for ATLAS (regarding Tier-3 resources) that could be used as a template for all the Experiments. Following this, action on GP, RJ, and DN to come up with a short proposal. It was noted that RJ had drafted something but this was not yet completed.
271.2 Re CERN-RAL OPN link breakage, RJ to provide an analysis of what the consequences would be to Experiments for a one-day break, a three-day break, a five-day break, etc. The outcome of these need to be assessed for disaster scenario planning.
272.1 SL to prepare a Tier-2 Report for the OC.
272.3 PMB to email TD notes/suggestions in relation to disaster scenario planning.
272.4 AS to check the current Tier-1 disaster recovery plan and circulate the existing version to the PMB.
272.5 DB to prepare a 'credibility gap' document.
272.6 SL & TD to discuss the MoU during this week and provide a draft version by next week to go to the Tier-1 and Tier-2, prior to submission to OC.
272.8 TD to supply a letter of support for GridTalk, on behalf of GridPP. This to go to SP. [Done following the meeting].
273.1 JG to send a suitable phrase regarding WLCG to TD for inclusion in the MoU. [Done following the meeting].
273.2 JG to send suitable wording to TD regarding Monitoring of Hardware Resources at sites (MoU).
273.3 NM and TD to confer regarding the wording of Hardware Support Staff (MoU).
273.4 NM to assist with wording of Appendices (MoU Annexes).
273.5 JG to provide updates regarding Operations (MoU).
273.6 TD to circulate finished version of MoU mid-week prior to the OC.
273.7 Action from Ambleside: regarding site availability, SL to plot his data and JC to highlight it via wiki.
273.8 DK to clarify the status of Security Policy documents.
273.9 TD to draft letter to Cambridge regarding Condor deployment problems and proposed resolutions.

INACTIVE CATEGORY
=================

247.2 RJ to get further information from ATLAS regarding use of Grid for testing of PANDA, and report-back.
251.1 TD to raise the issue of memory vs CPU cost at the MB [in order to work out what the requirement was between 1GB and 2GB memory per core].
253.1 AS has commenced work on the report on data integrity at Tier-1, in relation to implementation of checksums. Ongoing, AS hopes to complete this by end August.
261.5 JC and dTeam to carry out a survey on sites' experiences of GGUS, when possible to organise. This was pending but would be addressed after the holiday period. It was noted that a Questionnaire was required.
271.1 PMB to examine the issue of fibre breakage and outages, CERN-RAL OPN link, in one year's time, when actual data on breakages is available. Due date would be September '08.
271.3 Re CERN-RAL OPN link breakage and backup generally, PC to oversee the issue and collate info so that the PMB have something to revisit in one year's time. Due date September '08.

Next week's PMB would take place at 1.00 pm on Monday 24th September. It was noted that it was the September weekend holiday in Glasgow - SP would take Minutes and TD would participate from home.
