Dear All,
Please find attached the weekly GridPP Project Management
Board Meeting minutes. The latest minutes can be found each week in:
http://www.gridpp.ac.uk/php/pmb/minutes.php?latest
as well as being listed with other minutes at:
http://www.gridpp.ac.uk/php/pmb/minutes.php
Cheers, Tony
________________________________________________________________________
Tony Doyle, GridPP Project Leader      Telephone: +44-141-330 5899
Rm 478, Kelvin Building                Telefax: +44-141-330 5881
Dept of Physics and Astronomy          EMail: [log in to unmask]
University of Glasgow                  Web: http://ppewww.ph.gla.ac.uk/~doyle
G12 8QQ, UK                            Video - IP: 194.36.1.33
________________________________________________________________________
GridPP PMB Minutes 227 - 4 September 2006
=========================================
Present: Tony Doyle, Stephen Burke, Dave Newbold, Peter Clarke, Jeremy
Coles, Glenn Patrick, Robin Middleton, John Gordon, David Britton, David
Kelsey, Steve Lloyd, Neil Geddes, Suzanne Scott (Minutes)
Apologies: Sarah Pearce, Roger Jones, Andrew Sansum, Tony Cass
1. Referees' Comments
======================
TD reported that two Referees' Reports had been received, one more
straightforward than the other. The PMB were asked to note especially the
comment that timescales and milestones were 'hard to find', and that PPRP
might think that this is an issue - was a metaplan required? Did anyone
receive added comments for Institution input? No-one had received any
comments. It was noted that in terms of programme plan, an outline of
timescales and milestones would probably have to be provided by the
meeting on 8 November. It was acknowledged that an attempt had been made
in the original draft to add in high-level milestones per section, but
this had proved both problematic and unworkable. It would be possible at
this point to provide context and emphasise that, once the project is
actually up and running, clear milestones and timescales are easier to
define, as these effectively guide progress. It was agreed that TD would
make a statement referring back to the proposal pointing out that we are
fully in control of project management.
SL noted that in the last point regarding resources, where the question
was did we have enough manpower, Tier-2 is underfunded. TD agreed that
this statement should be made: that Tier-2 is underfunded and that the
resources are relatively inaccurate. We will be required to respond on
the day to the points raised, and TD suggested that we highlight them
with a few short slides giving our replies. This was agreed, and
the PMB went through each point in turn as follows:
a. is the science high quality?
------------------------------
PC suggested that the points made are generally 'true', and that we should
modify our response accordingly - however the comments were irrelevant in
the context of LHC startup. It was agreed to approach the Referee's
position but with a focus on the overall project goal. TD noted that GP
had suggested classifying the discussion along three lines or levels (see
email of 1st Sept) and we therefore need a set of slides relating to these
responses. TD agreed to write a set of draft slides this evening. DB was
not available. Responses were requested before Wednesday. The slides
were not to dominate the proposal but show that we have looked at the
Referees' comments.
TD suggested that general statements could be made by way of introduction
to the slides:
- we are at the forefront of Grid development
- we do participate with GGF re standardisation which we could adopt
- re OMII: UK versus EU: there is convergence between OMII Europe and EGEE.
We evaluated OMII and had no convergence at that time
- if we looked at this five years ago we might have gone in a different
direction - but it is too risky now to change goal
b. objectives of the proposal?
-----------------------------
It would be enough to re-quote the first sentence: the proposal objectives
are clearly stated. PPARC will market-test in the next few years, but
this is not relevant at present. We agree with the comment but not its
relevance today - also, companies may not exist at that time. NG's
comments could be used here to summarise the response.
c. management and programme plan
--------------------------------
There were no comments for this section. The key statement is that the
Management structure detailed in Section 10 is well-developed and
fit-for-purpose. This was agreed.
d. novelty, originality, and relevance
--------------------------------------
It was noted that the high-level points were not converging with
astronomers - and they are not converging with us. It was agreed that if
we were given astronomers' requirements then they can be incorporated.
This could be put forward as an offer, eg to astro applications. It was
noted that AstroGrid has not dealt with large volumes of data. We are
open to them if there is anything they wish to do and we are very willing
to work with them. It was confirmed that all UK experiments should be
able to use the Grid. It was agreed that TD would forward this to Andy
Lawrence and Nic Walton regarding AstroGrid - and that a joint
'aspirational' response could be made on the day.
e. relationship with other work in UK and abroad
------------------------------------------------
LIGO and GT4 could be emphasised here - a short section would suffice as
there are no real issues of concern. We agree with this statement and
lead some of this work; we are also working at a higher level on file
transfer services.
f. reliable methods and techniques proposed
-------------------------------------------
NG noted that we already are semi-professional middleware providers -
perhaps this was not well-enough explained. They are talking about
middleware development. The first statement regarding 'in-house
development' we agree with, but we cannot outsource hardware. HP were
already looking at this with their partners LCG - to run at 100% is
challenging for any service provider to achieve. TD noted that the
statement by the Referee was not referring to hardware but to commercial
resource provision - we were a step ahead so do not need to discuss the
outsourcing of hardware. PPARC should try and get a level of cost.
There was a discussion regarding the Referee's intentions.
g. industrial relevance and potential for exploitation
------------------------------------------------------
Was this weak ground? The proposal did not focus on technology transfer
but rather linked with EGEE - this area may need improvement but it was
not proposed to spend a lot of time on it; it would be preferable to use
the space for a half-post request regarding Industrial Liaison - this
represented good value at that level.
h. viability
------------
We already have something that scales and works - this is a large-scale
'yes' but regarding quality of service, 'no'. We could re-converge
regarding standards and the LHC experiments will fill in the gaps. The
scalability of the gLite stack and large energy costs were discussed. It was noted
that we had examined the risks and discussed these with PPARC. It was
agreed to refer the meeting to the relevant section. This area had been
identified and an approach does exist.
i. planning
-----------
It was noted that the wording 'may have started' is quite weak. The
inclusion of 'particle physics' within a future EGI goes without saying.
j. service level
----------------
This was a difficult issue - the service was expected to operate at 98%.
The current status of Tier-1 achievement was 76% against a target of 88%.
It was asked whether the long term would offer enhanced levels of
availability. This depends on definitions; the current measures in place
do not allow for 98% availability. It was agreed that if we can provide
the same level of availability as anyone else does, then that is
acceptable. If new ways of working and new tools were to be developed
then we could increase availability to 98%.
There appeared to be a focus on replica management tools, looked on as
generic broad services, agreed at some level with SRB. This works at
a lower level - we have compared both: alternatives like SRB have been
considered by CMS and BaBar - communities do try and pool their efforts.
We cannot agree with the statement at the end: 'splitting the community
and preventing solutions emerging' - this is not true and we can emphasise
this in this section.
k. large number of Tier-2 sites
-------------------------------
DB commented that reducing the number of Tier-2 sites reduces the leverage
we get. It is not clear that this saves us any money really. We could
also certainly rationalise staff and hardware to fewer sites but the net
cost would be the same and we would probably simply lever less kit and
effort. We would be more likely to have to pay explicit running costs if
the Tier-2 resources were larger lumps in fewer places. Service level
might be better but one of the key points about a grid is to address
reliability with redundancy so more sites can help as well as hinder.
SL noted that if we had to lump all the Tier-2 kit in 4-5 places we would
have to invest in some serious buildings and infrastructure to put it in.
In addition the utility costs of these would be completely exposed.
There were no comments to be made regarding 'storage management' and
'workload management'.
l. information and monitoring
-----------------------------
There was further development of RGMA for 'maintenance mode' or
'bug-fixing' - this needs to be re-stated. We have already gone through
the process where we are focussing on 'bug-fixing' - the middleware
appendix has possibly not been read? A timescale was needed for the move
into 'maintenance mode'.
It was agreed that we need to re-state the proposal in slides,
re-presenting it - it was not a development project.
m. past effectiveness of applicants
-----------------------------------
'exceptionally successful ... etc' - it was agreed to use this full quote
for a response.
There was no comment on 'sustainability of research team'.
n. quantify continuity and risk of not meeting it
-------------------------------------------------
It was agreed that at the first meeting at least a 7-month extension
should be approved. Following this first meeting, risk is high if they do
not respond in this manner to the GridPP2 continuation proposal. TD
advised that we also need to do a re-costing because of the regrading of
pay at all the University sites. This was appended to the proposal in a
generic statement - there is high-level agreement between the Research
Councils and the Universities to allow them to carry out this process.
o. suitability of departments
-----------------------------
The focus was on Tier-2s - and this is back to SL's point (in k above);
there were no questions regarding Tier-1.
p. cost effectiveness, value for money
--------------------------------------
TD asked if there were any points. All calculations were done by the
experiments to ensure that GridPP2 meets its obligations - this was not
based on matched funding for EGEE but the project has got close to the end
point of contract and there are issues of both staff retention and
hardware. The next three years will see grid expansion: full detail is
given in the tables; we are not introducing any new information here.
The Reviewer asked about the high-level statements at the beginning of the
proposal - we are well-integrated into the community now; back-up slides
may be required here.
q. other notes
--------------
It was understood that de-bugging on the grid is difficult; we have not
attempted to answer the whole document, we agree with the Reviewer. The
growth in staff costs for Tier-2 is because staff are undersupplied at
present - Tier-2 is grossly understaffed; furthermore, the increase in
manpower is modest in comparison to the increase in resources.
It was asserted that the answers, generically, to many of the questions
raised are to be found in the appendices.
It was asked whether we have to respond to all questions. Answers to the
majority of them are detailed within the context document; issues of
training are also to be found there.
r. resources
------------
Again re-visit Tier-2 comments (above).
s. overall scientific assessment
--------------------------------
The proposal is fundable and should be funded. A quick decision is
required.
It was agreed that the points for the meeting should be summarised in
slides for back-up purposes. TD would prepare these.
A 45-minute presentation was expected, with time for
questions. It was understood that we need to provide some background and
context. The next PMB, to take place on Monday 11th September at 1.00 pm,
would review the meeting on Wednesday 6th.
The final slides, including prepared answers to questions, are available at:
http://www.gridpp.ac.uk/talks/GridPP3_PPRP6Sep06_final.ppt
It was agreed that DN's new title would be: 'Associate User Board Chair'
and DN's appointment would last until December of 2006.
STANDING ITEMS
==============
SI-1 News Items and Meeting Dates
----------------------------------
SP reported that we had received some coverage in Science Grid This Week.
This week's main story is one of ours: http://www.interactions.org/sgtw/
And the Google Earth monitor was image of the week last week:
http://www.interactions.org/sgtw/2006/0823/
SI-2 Production Manager's Report
--------------------------------
JC reported the following items:
1) The T2 involvement in CASTOR testing showed some reasonable rates (best
rates were RAL PPD >400 Mb/s, Edinburgh 200 Mb/s, Birmingham >400
Mb/s). We did not attempt simultaneous transfers due to the need to
progress other transfer work. A summary of the results can be found
here:
http://www.gridpp.ac.uk/wiki/RAL_Tier1_CASTOR_SRM_tests_T1toT2#Results
2) Last week the preparation for the next phase of site transfer tests was
completed with each reference site (RAL PPD, Edinburgh and Birmingham)
exchanging at rates between 310 and 380 Mb/s. Involvement of the
individual sites started over the weekend. Generally sites have been
responsive but a few have yet to supply a named individual to run the
tests. Results are here:
http://www.gridpp.ac.uk/wiki/SC4_24hr_Individual_Read-Write_Tests
but the summary transfer figures are wrong due to an incorrect
calculation in the script (this is now corrected for transfers since
Friday).
3) There is a two-day WLCG Grid Deployment Board this week at BNL:
http://agenda.cern.ch/fullAgenda.php?ida=a057709
The focus is security (policy and operational) and storage (accounting
and implementations).
4) At the last DTEAM meeting there was renewed focus on security and the
desire to work more closely with the new NGS/RAL eScience security
officer. The lack of a GridPP security officer was noted again as a
problem. Following various discussions the job description for the
GridPP security officer is being revised with the intention of
advertising the position within the next 3 weeks.
5) ATLAS DDM transfers will resume this week starting with T0-T1 transfers
of dedicated (permanent) datasets followed by T1-T2 replications. All
subscriptions will now be done by experts at CERN with issues on the
site side being dealt with by a few ATLAS people in the UK plus the
deployment team and site admins.
6) 600 new disks will be arriving at RAL tomorrow for the hardware swap
out. They will be installed from next week. This will be followed by
approximately 4 weeks of testing. Meanwhile the next hardware procurement
for disk has been finalized.
SI-3 LCG Management Board Report
--------------------------------
See https://twiki.cern.ch/twiki/bin/view/LCG/MbMeetingsMinutes
There was nothing to report.
SI-4 Documentation Officer's Report
-----------------------------------
SB reported that the UIG had a meeting last week and will have another one
next week; the aim is still to have a demonstration web site ready for the
EGEE conference (which will have a parallel UIG session on the Wednesday
afternoon under the heading of NA2/3/4). The initial use cases are being
written by Andrea Sciaba, John White and SB.
Action 220.3 is still ongoing and SB would try to have something for next week.
ACTIONS AS AT 04.09.06
======================
220.3 SB to report on the Workload Management System Documentation.
225.1 TD to email the Tier-1 Board regarding DN's extension; also to talk
with DK regarding Sept meeting agenda.
225.3 DN to provide summary spreadsheet of T1-T2 networking based on
GridPP3 figures; the Tier-1 and Tier-2 future resources planning remains
an issue until the Tier-1/A Board meeting on 22 September.
226.1 TD to raise the problems of LHC experiments' disk space at the
Tier-1 Board.
227.1 TD to contact Andy Lawrence and Nic Walton regarding AstroGrid.