Dear All,
Please find attached the weekly GridPP Project Management
Board Meeting minutes. The latest minutes can be found each week in:
http://www.gridpp.ac.uk/php/pmb/minutes.php?latest
as well as being listed with other minutes at:
http://www.gridpp.ac.uk/php/pmb/minutes.php
Cheers, Tony
________________________________________________________________________
Tony Doyle, GridPP Project Leader Telephone: +44-141-330 5899
Rm 478, Kelvin Building Telefax: +44-141-330 5881
Dept of Physics and Astronomy
University of Glasgow Web: http://ppewww.ph.gla.ac.uk/~doyle
G12 8QQ, UK Video - IP: 194.36.1.33
________________________________________________________________________
GridPP PMB Minutes 216 - 5th June 2006
======================================
Present: Tony Doyle, Sarah Pearce, Roger Jones, Stephen Burke, David
Britton, David Newbold, Steve Lloyd, Robin Middleton, John Gordon, Jeremy
Coles, Peter Clarke, Suzanne Scott (Clerk)
Apologies: David Kelsey, Tony Cass
1. Review of appendix documents [all]
=====================================
See documents at:
http://www.gridpp.ac.uk/docs/gridpp3/
TD expressed his thanks to everyone for their work on the Appendix
Documents. A brief update was requested:
13.1 Experiment Hardware and Service Requirements Planning Document [DN]
DN reported that this was not a planning document but rather a
re-expression of hardware and service requirements. Comments/input would
be welcomed. It was agreed that this document should contain the 5-6
pages already extant in the proposal - it could be decided later what to
finally remove. It was agreed that DN would look at the Tier-1 and Tier-2
documents and modify his input - duplication was not a problem at this
stage. It was reported that there was not much left to complete: service
was well defined in all of the current documents, which included baseline
services and availability requirements, so information could be cut and
pasted as required. DN will prepare a half-page summary
for the CMS experiment and will request a half-page from RJ on ATLAS.
13.2 Middleware Support Planning Document [RM]
RM reported that the Middleware document is based on inputs from managers,
in turn based on a questionnaire relating to the template section. Input
had not yet been 'married-up' with the spreadsheet. A long-term number of
23 is being used to cover deployment, middleware and applications support.
It was asked whether security needed to be viewed as a priority. More
work was required regarding the rationalisation of language etc. TD
advised that the document should express all of the input requirements
from all areas. Final decisions could be made at the face-to-face
meeting.
13.3 Deployment Support Planning Document [DK]
DK had submitted his apologies and was not present. JC reported that this
document is a back-up to the one-page summary for the main document which
describes the full activity of the deployment team and operations. It was
agreed that JC would provide further input. It was also agreed that two
separate documents for middleware and deployment were preferable. It was
felt that 23 was a large number for middleware support, but it was
reported that the spreadsheet explains the breakdown within MSN support,
application interfaces, Tier-2 Co-ordinators and Operations Management,
support posts and service support. It was agreed that the text should be
amended to reflect this breakdown. RM will circulate the spreadsheet
concerned.
13.4 Tier-1 Planning Document [RAS]
It was reported that Andrew Sansum had originally been working to the
final oversight document deadline of 23 June, therefore the deadline of
next week would probably slip. It was noted that this document might be
late, but a deadline of 14 June was hoped-for.
13.5 Tier-2 Planning Document [SL]
SL reported that a 'skeleton' document was available but that some
information was out-of-date. The figures needed to be updated with
information from the last few months. It was agreed that the extant
figures could stand as they relate to the requirements specified. It was
noted that there were different ways to distribute the resources in
Tier-2, which would give different answers. It was felt that an idea of
procedures was required in relation to how allocations are determined -
these need to be quantified. It was agreed to revert to the CB and
provide them with the first draft of allocations to regional centres. It
was understood that work remained to be done on Manpower Tables, and how
Tier-2 can handle service requirements from experiments: service levels
from system managers need to be described.
13.6 Management Planning Document [DB]
DB reported that version 2 had been circulated. This document proposes
a structure defining the roles of the PMB and providing a table with role
and person identification. It was understood that key positions are
largely identical to what exists now. There was a discussion relating to
the Deployment Board, and opinions were sought. Regarding the User Board,
a Chair can be appointed for the duration of the project, and this may be
useful in terms of continuity. It was proposed that experiments should be
formally represented on the Collaboration Board, also end users, but it
was understood that this would depend upon the size of the experiments.
There was a brief discussion on Project Management and definition of risks
'up front'. The document will also provide a section on 'roles' and it
was understood that ideas need to be 'firmed-up' regarding Boards. The
PMB terms of reference need to be reviewed and updated.
13.7 From Production to Exploitation Context Document [JG]
JG reported that he had received comments regarding the 'Bigger Picture'
document and that work was ongoing. Middleware support issues were
outstanding as were issues relating to UK GridPP infrastructure and
European infrastructure. It was noted that by 2008 Middleware will not be
in a developed and stable state. OMII-Europe were in touch with EGEE.
The document was coming together.
13.8 GridPP2 Resource Utilisation Document [SL]
SL reported that figures were final now until the end of the month, at
which time the next quarter's information would be available. This
document would then cover an 18-month period from the beginning of Jan 05
to end June 06.
13.9 Dissemination Report [SP]
It was asked whether this should be a separate document; it was agreed
that it should. It was also agreed to cut the document down to keep it short
and succinct. It was reported that all news items and press releases were
to be listed, in similar format to the 8-page document previously produced
for GridPP2. It was agreed to restrict the Report to eight pages in
length.
2. GridPP3 Document Planning [TD]
=================================
It was noted that the aim would be to review a version by next Monday's
PMB meeting (12.06.06). It was understood that the last date for release
to the Collaboration Board is 19th June, but that the CB wish to provide
their input before then. A deadline of this Friday (09.06.06) was agreed
following which, over the weekend, the document could be amalgamated. It
was agreed that all areas of work should be finalised this week with the
following week allocated to looking at the proposal. If all documents
could be received by Friday, this would enable a first version to be
discussed by the PMB on Monday.
It was reported that input had been received from Jon Butterworth (PPRP)
and that he was happy 'in principle' with the approach taken in the
GridPP2 proposal. It was agreed that TD should speak to Charlotte Jamieson
at PPRP regarding what is required for the final document for GridPP3.
(It was clear that a full breakdown of individual grants via Jes
submission would not be possible).
3. AOCB
=======
Regarding GridPP 16, it was noted that there had been responses to the
circulated email. It was reported that talks had been defined and User
Board and Deployment Board information was to be provided. There could be
between 60 and 100 delegates present on the day. Viglen had requested
time to speak on multicore technology trends and it was agreed that a
short 20-minute slot could be made available.
There was no other business.
Standing Items
==============
SI-1 News Items
------------------
SP reported that three news items were posted last week: a report on the
PPARC brokering meeting, completion of the Birmingham School competition,
and an item on RAL upgrade to gLite 3. It was reported that a news item
was being drafted regarding undergraduates at Cambridge who were running
LHCb software on the Grid. It was noted that the brochure for policy
makers had now gone to the printers, and a bid had been submitted for a
place on the e-science stand at SuperComputing 06.
SI-2 Production Manager's Report
-----------------------------------
1) Only Cambridge responded positively to the possibility of joining
GILDA. There was a general question from both the deployment team and
UKI meeting: Is this the way GridPP really wants to interact with
industry? It seemed others would only consider joining if there was a
clear financial benefit. It was understood that financial incentives
did exist with the PIPSS and Mini-PIPSS schemes as well as funding from
PPARC. It was agreed that GILDA was only one idea for an initial
testbed and that others were available.
2) Our Tier-2 sites have started deploying gLite 3.0.0. Our strategy was
to rollout to one site in each Tier-2 last week and then include others
this week (having learnt more about the problems). Our timeline is to
have sites upgraded by the end of June; however the many meetings this
month (Tier-2 workshop, EGEE operations workshop and GridPP16) may
impact this schedule. Sites that have done the basic upgrade
encountered only a few problems mostly related to procedures
(documentation being a little unclear). Sites may be upgraded by
mid-July, but probably not before due to meeting constraints.
3) The ops VO will be enabled at the same time as gLite 3.0.0. This is a
new VO mentioned at a previous PMB and required for future monitoring.
The SFTs will switch from the dteam to ops VO at the start of July.
4) The deployment & storage groups are looking at the question of
CPU:disk [kSI2k:TB] requirements. The ATLAS figure of 2:1 is unlikely to be
possible for most sites. It was agreed that this was not currently an
issue but would become more important as time went on.
5) We are still seeing a large number of jobs from ATLAS (condorg) and
LHCb. Around 3,000 jobs are running at the moment for the UK, with an
average of 4,500 CPUs for job slots. It was agreed that RJ would
circulate a summary of what is planned for the ATLAS SC4 tests.
6) There is an ongoing issue with CMS software installation at the Tier-1
- various sources for the problem are being investigated.
7) The RAL production FTS server was upgraded to version 1.5 today.
8) There is a WLCG GDB meeting this week at CERN
(http://agenda.cern.ch/fullAgenda.php?ida=a057707). The main discussion
topics are going to be on: storage (interfaces and accounting), gLite
3.0 progress, VOBoxes and Tier-0, 1 and 2 relationships.
SI-3 LCG Management Board Report
-----------------------------------
See
https://twiki.cern.ch/twiki/bin/view/LCG/MbMeetingsMinutes
JG reported that there were two main issues: firstly, the accounting
summary - there was no real comment here except that the May figures were
required and a further report would be provided to the PMB when available.
Tier-2's disk storage was not straightforward and it was agreed that the
current webpage would need to be updated to include a breakdown by VO.
Allocated space was also an issue, and it was agreed that dedicated space
for experiments is required.
Secondly, regarding SAMD service, it was noted that if core services were
down then the site was down. Results for May (for Rutherford) were at 68%.
Most of the time, other sites are at 75 and 80% - it was understood that
our results are below average. It was reported that replica management
tests were failing and that time-out on dcache deployment at the Tier-1
affects this.
At the Management Board it was asked whether we were publishing end-points
for FTS servers; it was confirmed that we are. Metrics and tests were being
investigated
at the present time. It was agreed that JG would circulate information to
the PMB.
SI-4 Documentation Officer's Report
--------------------------------------
There was nothing to report.
Review of Actions
-----------------
197.1 It was noted that the estimate of July 2006 hardware for T2s was
still in progress.
203.2 It was agreed that SP would review the DN lists on the GridPP
website.
205.1 It was noted that identification of UB reps was ongoing.
205.2 Regarding performance metrics, it was suggested that information
from JG could be useful. It was agreed that JG would forward the document
relating to the recording mechanism for Tier-2 service levels.
205.3 It was noted that the document on procedures of the Grid
Vulnerability Group was still in progress.
205.6 Work on convergence with NGS was ongoing.
205.8 It was reported that Neil Geddes had attended the meeting, and
information was awaited.
209.3 It was agreed that JG would attend to the matter of EGEE matching
funders today.
210.2 It was agreed that TD would draft a paper this week relating to the
IEEE conference.
Actions as at 05.06.06
----------------------
197.1 SL to determine realistic estimate of July 2006 hardware for T2s, as
part of the update on T2 accounting.
203.2 SP to review DN lists on the GridPP website and look at how these
can be updated more regularly.
205.1 DN to identify the UB reps and to initiate discussions on training
courses.
205.2 JC to update performance metrics over as long a period as possible.
205.3 DK to prepare short document on "Procedures of the Grid
Vulnerability Group".
205.6 All - reminder - to work (further) on convergence steps with NGS.
205.8 SP to report back to the PMB on interoperability project.
209.3 JG to prepare a list of EGEE matching funders [and TD to forward to
Deborah Miller and Janet Seed].
210.2 TD to organise taking part in the IEEE 2nd International conference
on e-science 4-6th December.
216.1 RJ to circulate a summary of what is planned for the ATLAS SC4
tests.
216.2 JG to circulate information to the PMB on metrics and tests relating
to FTS servers.