UKHEPGRID@JISCMAIL.AC.UK


Subject:

Minutes of the 273rd GridPP PMB meeting

From:

Tony Doyle <[log in to unmask]>

Reply-To:

Tony Doyle <[log in to unmask]>

Date:

Thu, 20 Sep 2007 11:58:06 +0100

Content-Type:

MULTIPART/MIXED

Parts/Attachments:

TEXT/PLAIN (20 lines) , 070917.txt (1 lines)

Dear All,

     Please find attached the latest weekly GridPP Project Management
Board Meeting minutes. The latest minutes can be found each week in:

http://www.gridpp.ac.uk/php/pmb/minutes.php?latest

as well as being listed with other minutes at:

http://www.gridpp.ac.uk/php/pmb/minutes.php

Cheers, Tony
________________________________________________________________________
Prof. A T Doyle, FInstP FRSE GridPP Project Leader
Rm 478, Kelvin Building Telephone: +44-141-330 5899
Dept of Physics and Astronomy Telefax: +44-141-330 5881
University of Glasgow EMail: [log in to unmask]
G12 8QQ, UK Web: http://ppewww.physics.gla.ac.uk/~doyle/
________________________________________________________________________


GridPP PMB Minutes 273 - 17th September 2007
============================================
Present: Tony Doyle, Sarah Pearce, Roger Jones, Stephen Burke, David Britton,
Dave Newbold, Steve Lloyd, Tony Cass, Robin Middleton, John Gordon,
Andrew Sansum, Norman McCubbin, Suzanne Scott (Minutes)

Apologies: David Kelsey, Jeremy Coles, Peter Clarke, Glenn Patrick,
Neil Geddes

1. CASTOR Report to STFC
=========================
It was reported that STFC had requested a CASTOR status report prior to
the OC. This related to the idea of a 'Plan B' and who would pick up the
tab in the relationship with CERN. It was asked if the status of a 'Plan
B' could be expanded by AS - a document had been circulated some time ago.
It was agreed that wording was crucial - the tenor of the report had to be
accurate. It was suggested that the question was 'funding' orientated,
with the assumption that if there was a problem it was solvable by extra
funding. TD noted that this was not realistic. AS asked how much effort
should be included towards ensuring that CASTOR ran ok. NM noted that
from the ATLAS point of view at RAL, things were more positive with CASTOR
now than they had been around three months ago. TD advised that this was
not apparent from the document as it stood, which dealt with earlier
problems. DN noted that the CMS picture was also not all bad. It was
agreed that the PMB would not respond immediately to this request. The
preliminary response should be that all documents are currently being
prepared for the OC and statements would be available from October 1st.
The content of the preliminary response should state that the situation is
a rapidly-changing one and we would prefer to provide the most up-to-date
information come the time. AS agreed that he could put an Appendix in the
Tier-1 Report which addressed CASTOR specifically, detailing the situation
as at end Sep/beg Oct. AS asked what the slant of the document should be.
JG noted that the major point to emphasise was the upgrade, delivered to
CMS and ATLAS, but with problems outstanding. TD suggested that the
progress over the summer should be provided and where things stood by the
end of September - it was important to provide reassurance to STFC and the
OC that progress was being made and problems addressed successfully. TD
noted that this time the Experiments had been directly invited separately,
viz RJ, DN, Raja Nandakumar, and Dave Evans. NM noted that by that time
we would have an official statement. DB advised that Kim Dollimore (STFC)
had asked for an update on CASTOR and software issues, and that a response
was requested within the next week. NM noted that the text should be
signed-off by the Experiments. DB noted that a brief response stating a
changing situation and putting a detailed response into abeyance until end
September might not be acceptable to STFC. DN noted that there existed a
normal route for information to be disseminated and he was reluctant to
commit himself at this stage. JG advised a reassuring statement. DB
countered that it was problematic to provide a bland reassuring statement
at this stage if things were likely to become worse - rather, he would
suggest saying that there are some problems currently but these are being
worked on, and that there had been a marked improvement. TD noted that
for CMS things were ok, for ATLAS less so, and LHCb had not really
started. DN noted that CMS have problems at present but have made great
progress; it was better to say that things were working, and wait a week
or so to ensure that this was demonstrably working in CSA'07. DB
summarised that there are, have been, and will be problems with CASTOR,
the question was whether or not things were improving. DN advised two
metrics: 1) a trajectory to run the organisation; 2) the OC were there to
review the project via milestones, and at present we were not meeting
them, as noted in the Project Map. It was agreed that DB would send an
email to Kim Dollimore in the light of the above discussion.
[note: e-mail sent shortly after the meeting]

2. MoU/SLA Status
==================
TD noted that we have to ensure the elements of this are all present -
what needs to be added, especially in Ops? TD and SL had reviewed page 1
- it was asked whether GOC should be part of the MoU. On p2, as it was,
it was noted that the WLCG tag doesn't affect GridPP, which refers to LCG
instead as it incorporates everything that is required. JG agreed to send
a phrase to TD in relation to WLCG. For p3, it was noted that the MoU
would be in force until 2011. For No3 (Deployment Board), it was agreed
that the DB would be the organisation through which this document passes.
The Tier-1 and Tier-2 representatives needed to be checked, there were
five of them. The phrase 'Tier Centre' was queried - this was a
combination of Tier-1 and Tier-2, but needs to be more clearly defined as
the phrase doesn't mean a real organisation. It was suggested that the
following wording might be suitable: '... each Tier shall nominate ...'.
It was agreed that No4 (Hardware Support) was fine. For No5 (Resources)
the plan was to have a table at the end which would define these. For No6
(Availability), the service level agreement exceptions were the VO boxes.
For No7 (Monitoring), the APEL accounting will monitor in regard to
software. JG will provide suitable wording to TD - it was understood that
the sites couldn't be told what they should be monitoring, a wider
statement was required. DB noted that the wording should state that the
Tier Centres agree to provide monitoring information - this is a more
general statement. TD agreed - appropriate wording would note that the
Tiers simply agree to monitor information. For No8 (Target Shares), TD
noted that we need to ensure that site-specific target shares and storage
add-up to what is intended and aggregated across the UK. For No9
(Software), TD noted that the DB needs to be at the centre of this - a set
of grid software releases must be deployed. TD noted that the Tiers
should agree to implement and update software. The wording as it stood
was fine. For No10 (Network Connectivity) it was noted that the GridMon
boxes may not be supported later on. In 10.2 it was agreed to omit the
word 'software'. There was no change to No11 (Security). For No12
(Management), the words 'and technical' before meetings (in 12.1) could be
added, but this wasn't a problem area. The reference to the Deployment
Board should be removed from 12.3 as it is a duplication. There was no
change to No13 (Extension). In No14 (Termination) there were minor
corrections.

Regarding Appendix A, the SLA was not about individual services, but
rather related to high-level services. The wording had been taken from
the CERN wording, but could be changed if other wording was felt more
appropriate. It was agreed to be as specific as possible with regard to
the Experiments, and refer to them explicitly where appropriate.
Regarding Appendix B, a statement was required regarding when Operations
are available and what they will do. Did this relate to the GridPP MoU or
the EGEE/LCG environment? There was also no information included relating
to helpdesks or call centres - it was felt better not to list them, just
advise that they are available. Regarding Appendix C and Staffing, it was
noted that the Tier-1 will provide high-level service, broken down into
units, described by whom and on what basis the services will be available.
There were 2 FTE incorporated into the Incident Response Unit (IRU). All
statements here can be changed via the Deployment Board. It was noted
that Appendices had not been included within the body of the document but
were highlighted in yellow. NM and TD would confer regarding wording -
relating to the Hardware Support Staff. On p19 there was a statement of
provision over the three-year term - this needs to be indicative. TD
advised that for this process it was preferable to keep the Tier-1 and
Tier-2 together. Later on, the Appendices can be separated. It was felt
preferable to change 'Appendix' to 'Annexe'. It was agreed that when the
MoU is to be circulated, it will be submitted as two documents: the MoU
and MoU Annexe. On p19 regarding Hardware, it was asked when within the
year should resources be available to meet the MoU? Following discussion
the PMB agreed to refer to April. It was agreed that NM would assist with
the wording of the Appendices (Annexes). JG would provide updates
regarding Operations. TD would circulate the finished version of the MoU
mid-week to the existing Tier-1 and Tier-2 Boards. NM left the meeting at
this point.

3. Preparations for the OC
===========================
It was reported that documents need to be prepared by Monday 1st October,
in order to submit all docs including Exec Summary by Thursday 4th October.
Update was as follows:

#115 Executive Summary [PMB] (inc. summary of available performance metrics)
- ongoing.
#116 ProjectMap Report [DB] (up to 07Q2) - ongoing.
#117 Resource Report [DB] (transition) - ongoing.
#118 LCG Status Report [TC] - ongoing.
#119 EGEE Report [RM/JG] -
RM reported that a report on EGEE III and EGEE I plus a status report on
SA1 were now underway.
#120 Deployment Report [DK] - ongoing.
#121 MSN Report [RM] - ongoing.
#122 Applications Report [RJ] -
RJ reported that after speaking with DB in Nottingham, he would be doing a
continuity report (based on the Quarterly Report sent to Dave) and this
would be included in the credibility gap document.
#123 User Board Report [GP] -
TD reported that a UB version had been circulated by GP, but inputs were
required. DN and RJ to update their sections.
#124 Tier 1/A Report [AS] - ongoing.
#125 Tier 2 Report [SL] - ongoing.
#126 Dissemination Report [SP] - ongoing.
----requested----
#127 GridPP3 Plan [DB] - ongoing.
#128 Credibility Gap [DB] -
RJ noted that he had relevant inputs relating to the credibility gap and
shortfalls - these would be available shortly.

#129 Disaster/Scenario Planning (inc. OPN network example) [TD] -
input had been received from DB and a contacts list was currently being
updated. TD asked that SB take the token for this document during this
week - input from SB was relevant to the applications interfaces side. SB
said he would review the current version.

4. Review of Risk Register
===========================
TD noted that the Risk Register required to be reviewed. The Project Map
on the webpage was used. DB noted that CASTOR needed to be explicitly
flagged. The proposal was for a new assessment of R5 applications and a
red to flag CASTOR - this would need to be added first and then other reds on
the list would be dealt with. DB noted that risks 1-4 did not require
much discussion. R3 however, with respect to minimal contingency - a
longer term view was required and a flag to reflect the reduced
contingency: likelihood is 4, impact 2 = 8 (not too critical yet). R5 on
CASTOR - this should go under applications and problems with the Grid:
across the board is 3 (over the next 6 months), impact is 3 = 12. If it
definitely fails during the next six months, we wouldn't be pursuing it at
this point. TD noted the problem with R5 was about gLite. DB agreed - R5
and R10 should reflect CASTOR and gLite. Were there any changes to R9 re
Scientific Linux? RM asked where we capture the dependability of SL4?
This involved a different interpretation of risk. Did the PMB want to
change the assessment of gLite in R10? DN advised that it was still quite
a high risk: 3 and 2. TD noted that the impact had gone down, as
evidenced by the service challenges. DN agreed, but highlighted problems
with FTS and SRMs, although the risk had been ameliorated by way of backup
plans - they had responded to the risk. DB summarised that there had been
nothing contentious down to R15. R15 related to the risk going up
regarding maintenance of software/documentation and design/shared
knowledge. DB noted that transition planning had been done and the risk
assessed, but complacency was to be avoided. TD advised that if it
remains a problem, increase it: likelihood 4, impact 3. DB noted that in
relation to middleware, there were funding concerns, and this needs to be
rationalised. R15 for MSN is 2 and 3 = 6. R28 referring to work in other
countries, was passed-over for the moment. DB noted that for R40, a lack
of future funding would reflect R15 - it was agreed to leave the
application ones at 5 and reduce the middleware to 3 and 2, however if the
PMB were not happy about the current status, leave it at 3 and 3 - the
latter was agreed.
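
The scores quoted above (likelihood 4, impact 2 = 8 for R3; 2 and 3 = 6 for
R15 MSN) are consistent with a simple likelihood x impact product. A minimal
sketch of that arithmetic follows, assuming the product rule holds for the
register as a whole; the Risk class and its field names are hypothetical and
are not taken from the GridPP Project Map. The R5 figure is recorded above as
3 and 3 = 12, which does not match this product, so it is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One risk register entry (hypothetical structure, for illustration)."""
    label: str
    likelihood: int
    impact: int

    @property
    def score(self) -> int:
        # Assumed scoring rule: likelihood multiplied by impact.
        return self.likelihood * self.impact


# Only the two entries whose stated totals match the product rule.
risks = [
    Risk("R3 (reduced contingency)", likelihood=4, impact=2),
    Risk("R15 (MSN)", likelihood=2, impact=3),
]

# List the entries from highest to lowest score.
for risk in sorted(risks, key=lambda r: r.score, reverse=True):
    print(f"{risk.label}: {risk.likelihood} x {risk.impact} = {risk.score}")
```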

5. AOCB
========
Regarding grants, it was reported that 4 remained to be processed despite
the funding being required this month.

TD reported that there had been a formal handover within STFC from Deborah
Millar to Trish Mullins - this had now happened, and emails should be sent
to Trish Mullins from now on. The PMB pages refer to Trish, as she is now
the Programme Manager. Deborah Millar was thanked for all of her support
of GridPP over the years.

RM reported that the EGEE III proposal was now in its final form and was
due to be submitted on Thursday.

STANDING ITEMS
==============

SI-1 Dissemination Officer's Report
------------------------------------
SP reported that the AHM had gone well. DB reported that he had attended
but that there had been confusion regarding his talk and timings, through
lack of communication; it had been fairly quiet generally. SP agreed that
there was a general lack of interest in attending and numbers were down at
the stand. Next year there would be a different location (Edinburgh)
which might help attendance. SP reported that the BA day had gone well -
DB had manned the demo stand on the Wed afternoon - there were 4 or 5
people present from GridPP and they had been fully occupied individually
talking to people for over 2 hours. DB reported that on Friday he had
given a talk (within the LHC section) to about 400 people. He had been
contacted by BBC York but had been unable to re-establish contact with
them. RJ reported that the usability session at the AHM had been
interesting and had worked well - there were still things we could do
within these sessions. He had been disappointed at the lack of engagement
at the AHM, there had been a noticeable lack of input. SP reported that
Neasan O'Neill was working on a news item re the AHM and BA festival. SP
was doing a GridTalk proposal for Thursday as prep for EGEE '07.

SI-2 Tier-1 Manager's Report
-----------------------------
No report this week, see CASTOR discussion above.

SI-3 Production Manager's Report
---------------------------------
SB reported that Edinburgh were having difficulties with GPFS and SL4
upgrade.

Greig Cowan had suggested a storage workshop; this had come up again at CHEP.

SI-4 LCG Management Board Report
---------------------------------
TD reported that he had been preparing a Disaster Scenario Planning Report
and had been unable to attend the meeting, but the Minutes were available
at
https://cern.ch/twiki/bin/view/LCG/MbMeetingsMinutes

SI-5 Documentation Officer's Report
------------------------------------
There were no items.

REVIEW OF ACTIONS
=================
261.13 DK to progress receipt of ScotGrid feedback. This was now done;
info was to be updated on the website. TD would forward to SL the email
he had received that had gone from JG to GS.

269.4 GP to circulate an email once the situation with LHCb banning sites
that have migrated to SL4 was resolved. GP confirmed following the
meeting that this was now done. An email had been received from Joel
Closier - this was circulated to the PMB.

272.2 SL & TD to draft an initial document on MoU/SLA for circulation (not
a PMB Doc). Done, item closed.

272.7 SL to remind RJ and DN that agreement is awaited from London and
NorthGrid regarding extension of Tier-2 MoUs for 7-month period. This was
now done but there was no outcome. Can RJ sign-up to the agreement? SL
is awaiting his reply.

AMBLESIDE ACTIONS
=================
DTeam/PMB
(1. Deployment Board to meet formally in GridPP3 as described.) Ongoing.

2. Regarding site availability, SL to plot his data and JC to highlight it
via wiki. This should now be a PMB action - transferred to PMB
listing.

3. Regarding installation of software (and Condor), PMB to draft a letter
to Cambridge giving a list of problems and ask about resolutions. TD
to do this - he would do this after the OC meeting.

(4. JJ to document pros/cons of resilient dCache.) Ongoing.

Discussion Session:
Action: Mingchao Ma to send extant security policies to SL for discussion
at the Tier-2 Board. These need to be re-approved prior to being available
as links for users on the website. SL to discuss security policies with DK -
DK needs to clarify the status of these.

ACTIONS AS AT 17.09.07
======================
250.4 RJ, DN, GP, TD to meet to integrate experiment requirements of
Tier-2s going to Tier-1 - sites are aware of requirements but discussion
still has to take place. It was noted that this issue is not high
priority. A meeting is to take place with Barney Garrett.

252.3 RM has now received inputs for his one-page summary regarding the
transition of each of the existing Middleware areas from GridPP2 to
GridPP2+ to GridPP3 - this to go to DB. This was to be done by Friday
8th June but is still ongoing. This is now urgent.

261.4 DB to look through the input in detail in relation to GGUS problems.
DB currently working on grants issues and quarterly reporting - this would
be dealt with as soon as possible.

263.2 JG to further investigate the lack of ability to pass job
requirements to the batch system and report-back (Tier-2 review issue).
JG will raise this through the GDB.

267.3 SP to begin organising metrics for GridPP3, beginning with update
and review of existing milestones and metrics, plus review of WLCG
requirements. SP to co-ordinate with DB, AS and JC. It was agreed that
the high-level view should be prepared for the OC relating to what has
been agreed, and how we are working towards this - SP to present a few
slides.

268.1 RJ to prepare a one-page table for ATLAS (regarding Tier-3
resources) that could be used as a template for all the Experiments.
Following this, action on GP, RJ, and DN to come up with a short proposal.
It was noted that RJ had drafted something but this was not yet completed.

271.2 Re CERN-RAL OPN link breakage, RJ to provide an analysis of what the
consequences would be to Experiments for a one-day break, a three-day
break, a five-day break, etc. The outcomes of these need to be assessed
for disaster scenario planning.

272.1 SL to prepare a Tier-2 Report for the OC.

272.3 PMB to email TD notes/suggestions in relation to disaster scenario
planning.

272.4 AS to check the current Tier-1 disaster recovery plan and circulate
the existing version to the PMB.

272.5 DB to prepare a 'credibility gap' document.

272.6 SL & TD to discuss the MoU during this week and provide a draft
version by next week to go to the Tier-1 and Tier-2, prior to submission
to OC.

272.8 TD to supply a letter of support for GridTalk, on behalf of GridPP.
This to go to SP.
[Done following the meeting].

273.1 JG to send a suitable phrase regarding WLCG to TD for inclusion in
the MoU.
[Done following the meeting].

273.2 JG to send suitable wording to TD regarding Monitoring of Hardware
Resources at sites (MoU).

273.3 NM and TD to confer regarding the wording of Hardware Support Staff
(MoU).

273.4 NM to assist with wording of Appendices (MoU Annexes).

273.5 JG to provide updates regarding Operations (MoU).

273.6 TD to circulate finished version of MoU mid-week prior to the OC.

273.7 Action from Ambleside: regarding site availability, SL to plot his
data and JC to highlight it via wiki.

273.8 DK to clarify the status of Security Policy documents.

273.9 TD to draft letter to Cambridge regarding Condor deployment problems
and proposed resolutions.

INACTIVE CATEGORY
=================
247.2 RJ to get further information from ATLAS regarding use of Grid for
testing of PANDA, and report-back.

251.1 TD to raise the issue of memory vs CPU cost at the MB [in order to
work out what the requirement was between 1GB and 2GB memory per core].

253.1 AS has commenced work on the report on data integrity at Tier-1, in
relation to implementation of checksums. Ongoing, AS hopes to complete
this by end August.

261.5 JC and dTeam to carry out a survey on sites' experiences of GGUS,
when possible to organise. This was pending but would be addressed after
the holiday period. It was noted that a Questionnaire was required.

271.1 PMB to examine the issue of fibre breakage and outages, CERN-RAL OPN
link, in one year's time, when actual data on breakages is available.
Due date would be September '08.

271.3 Re CERN-RAL OPN link breakage and backup generally, PC to oversee
the issue and collate info so that the PMB have something to revisit in
one year's time. Due date September '08.

Next week's PMB would take place at 1.00 pm on Monday 24th September.
It was noted that it was the September weekend holiday in Glasgow - SP
would take Minutes and TD would participate from home.
