Dear All,
Please find attached the weekly GridPP Project Management
Board Meeting minutes. The latest minutes can be found each week in:
http://www.gridpp.ac.uk/php/pmb/minutes.php?latest
as well as being listed with other minutes at:
http://www.gridpp.ac.uk/php/pmb/minutes.php
Cheers, Tony
________________________________________________________________________
Prof. A T Doyle, FInstP FRSE GridPP Project Leader
Rm 478, Kelvin Building Telephone: +44-141-330 5899
Dept of Physics and Astronomy Telefax: +44-141-330 5881
University of Glasgow EMail: [log in to unmask]
G12 8QQ, UK Web: http://ppewww.ph.gla.ac.uk/~doyle
________________________________________________________________________
GridPP PMB Minutes 247 - 19th February 2007
===========================================
Present: Tony Doyle, Sarah Pearce, Roger Jones, Stephen Burke, David Britton,
David Kelsey, Dave Newbold, Steve Lloyd, Tony Cass, Robin Middleton,
John Gordon, Jeremy Coles, Peter Clarke
Apologies: Andrew Sansum, Glenn Patrick, Neil Geddes
Yingqin Zheng was continuing to observe for the Pegasus project.
Prior to commencement of the meeting, TD gave feedback regarding the
formal status of the PPRP recommendations, which had been considered by
the Science Committee. An award has been approved by the Science
Committee, but requires further ratification at the PPARC Council on 7th
March, following which formal letters will be received with regard to
posts. PPARC will provide 'letters of comfort' for GridPP2 continuation
posts if required. A breakdown by work package areas is likely to be
provided in the first instance.
1. GridPP18 Agenda
===================
It was agreed that Day One should incorporate the Experiments and MSN.
SL will give a 10-minute introduction. TD will discuss GridPP status.
DB will discuss GridPP3. JC will provide an overview of the Tier-2
Co-ordinators' inputs.
Other requirements are to incorporate talks from the Experiments including
CMS developments, ATLAS, Ganga, LHCb, UKQCD, BaBar Grid developments,
DZero, ZEUS Grid analysis.
Further items included MSN: R-GMA/networking/metadata and Storage
Management/Storage Accounting/Storage classes. Regarding SRM2, it was
considered that not enough time was available for a session, but a later
slot was available for Storage Management issues and an accounting overview.
Security developments and policies should also be considered as part of
MoU planning for GridPP3.
Day Two would include Glasgow developments and would go through 4 x
Tier-2s and Tier-1. A shorter session would consider LCG and Grid
Operations; this would comprise Grid Security Policy and an LCG deployment
overview.
The final session would comprise overview summaries or a large discussion
session. It was agreed that no more overviews were required and that it
would be preferable to discuss UK site readiness: 'How ready are we for
LHC startup?'.
Panels would be required to include a representative from CERN and have an
experimental as well as a site perspective. It was noted that
Clustervision were sponsoring the event.
Regarding Security and Grid vulnerability work, it was agreed that a talk
would be useful, and this could be added to session 5 on Worldwide Grid
Security Policy (from a user point of view).
It was agreed that TD would draft a version of the Agenda and put this
into the GridPP18 space by this evening. SS would circulate a table
showing timings and locations. [both were done following the meeting].
It was noted that the DTeam/User Board meeting arrangements were in place
and that topics for discussion should now be raised.
The programme will end with a talk on the National Grid Infrastructure for
Science by NG.
2. Site Readiness Review
=========================
It was noted that DB had issued v0.2 of the site readiness review
questionnaire, but iterations were still required to ensure it was both
relevant and comprehensive. Some feedback had already been received, but
further comments to DB were welcome. It was hoped to issue the
questionnaire to the Tier-2s by the end of February. TD noted that some
of the questions would be difficult for SouthGrid to answer quickly since
their visits are planned early. JG noted that detailed numbers were not
required, only an overview of state of readiness regarding service and
operations implications etc. It was noted that separation between site
and user has implications. RM noted that a virtual Tier-2 level focus is
required to probe what the Tier-2 role is in that area - this is difficult
to address in the site questionnaire. TD noted that a set of pre-defined
questions would be helpful. TD noted that a question should be added
regarding management interfacing with Tier-2 Board Management and
Technical support. DB noted that the timescale for input was today and
tomorrow; it was hoped the questionnaire could be finalised over the next
few days.
3. ATLAS Grid Tests
====================
SL reported that tests had been discussed at the DTeam meeting. The
average performance had increased and work was ongoing. Problems were
being encountered at individual site level. JG noted that a
reality/quality check was required to ascertain what should be stressed.
There was a discussion concerning the workload management service and the
resource broker/SL3 & 4 regarding switching and deploying. JC would
iterate. SL would provide a revised summary or additional comments; a
distilled version to be available by the end of the week that would
incorporate feedback.
4. Use of Grid
===============
RJ had been approached by the Grid Tools for Services Co-ordinator at
ATLAS regarding testing of PANDA in the UK. He had asked for a spec of
what was required. If there were no security issues, could the PMB
approve this use? Were there 'in principle' objections regarding Grid
middleware? There was a question of identity/generic input into the VO
box. There were no 'in principle' objections. TD asked what this would
involve in terms of sites. RJ noted that he had asked for a technical
specification but more information was needed. It was agreed that RJ
would get further information and request this on behalf of the PMB. It
was noted that JSPG were discussing security policy.
STANDING ITEMS
==============
SI-1 Dissemination Officer's Report
------------------------------------
SP reported that abstracts had been received for the EGEE user forum:
around 8 from the UK, of around 180 in total. It was noted that the
submission template was not easily filled in for non-applications areas -
RM would report this back to EGEE. A PPARC Parliament breakfast meeting
regarding the LHC was due to take place and Steve Lloyd would attend for
GridPP. Publicity materials would be taken along for this. SP noted that
she was working with Queen Mary University to invite Alan Sugar to open
the QMUL cluster in May. International Science Grid This Week will carry
a report on the EUGridPMA meeting, and may carry a UK picture: 'Man and
Motherboard'. SP was awaiting information on the new storage accounting
system, to be provided once it was working. She was also awaiting
information on the
CMS event from DN. There is a proposal to give out magic cubes at
Masterclasses - the design will be circulated for comment.
SI-2 Tier-1 Manager's Report
-----------------------------
AS presented the following report in absentia:
Hardware:
1) Supplier One delivery
Supplier is deploying into CASTOR. Some is already in and remainder should
follow shortly (discussions are ongoing with experiments about layout).
Garbage collection problems continue but are no longer a blocking item on
deploying the new hardware.
2) Supplier Two Delivery (I) - this was being prepared for deployment.
3) Supplier Two Delivery (II) - acceptance tests are continuing and should
finish by the 9th March.
4) Tape Purchase - this arrived and is available on demand.
5) Tape drive purchase - delivery is scheduled a little later than
expected: it is due at Q Associates on 9th March, with arrival at RAL
soon after.
6) Tape drive servers - delivery schedule is uncertain but is expected
early March.
Service:
Replication of the BDII was delayed initially pending a firewall update,
and subsequently a decision was made not to deploy this major change on
Friday. It is now scheduled for Monday morning at which time the existing
BDII will be phased out and two new systems will come online.
In the absence of experts we were unable to meet last week to
discuss the SL4 rollout. However AS has informed CMS that we will
definitely not be able to provide SL4 within 1 month.
Job CPU efficiency for January fell to 64%. This appears to be dominated
by LHCb, which recorded 38% efficiency while using a large share of total
resources. LHCb believe that this is caused by performance and reliability
problems in RAL's dCache - we are investigating. Testing of the dCache 1.7
upgrade has been completed successfully and it is planned to deploy this
ASAP - it may help resolve this issue (although the issue is not
specifically addressed in the revision history).
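As a rough illustration of how one VO with a large share of resources can
dominate the overall figure, the sketch below computes a wall-clock-weighted
average efficiency. It assumes efficiency means CPU time divided by
wall-clock time. Only the 64% and 38% figures come from the report above;
the equal split of wall-clock hours, the 90% figure for the other VOs and
the function name are hypothetical values chosen for illustration.

```python
def overall_efficiency(usage):
    """Return the wall-clock-weighted CPU efficiency.

    usage maps a VO name to (wall_clock_hours, cpu_efficiency),
    where cpu_efficiency is a fraction between 0 and 1.
    """
    total_wall = sum(wall for wall, _ in usage.values())
    if total_wall == 0:
        raise ValueError("no wall-clock time recorded")
    # CPU hours actually used by each VO = wall-clock hours * efficiency.
    total_cpu = sum(wall * eff for wall, eff in usage.values())
    return total_cpu / total_wall


# Hypothetical split: LHCb uses half of the wall-clock time at 38%,
# and all other VOs together run at an assumed 90%.
example_usage = {
    "lhcb": (500.0, 0.38),
    "others": (500.0, 0.90),
}

print(f"{overall_efficiency(example_usage):.2f}")
```

Under these assumed shares the weighted average works out at 0.64, which
shows how the other VOs could be running efficiently while the site-wide
figure still falls sharply.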
SI-3 Production Manager's Report
---------------------------------
JC provided the following report:
1) LHCb began submitting jobs again from last Wednesday so utilisation
returned to previous levels (not all work is monitored here
<https://gfe03.hep.ph.ic.ac.uk:4175/cgi-bin/load>).
2) Enabling camont.gridpp.ac.uk and total.vo.gridpp.ac.uk is proving
difficult due to lack of support in the middleware for DNS style VO
names. This is despite the EGEE requirement that new VO names have a
DNS format. In particular current production implementations of YAIM do
not support DNS naming. SB noted that there had been a test version
(3.0.1) of YAIM but that it had yet to make it out of certification -
the impression was that there was a good chance that it would pass the
testing this time. In theory it can be done by hand, but enabling VOs
is difficult because it needs changes in several areas, and as far as
SB knew, there was no longer any explicit documentation for manual
installation; in effect YAIM was the documentation. It was noted that
this will be discussed by the DTeam and that some sites should be
running ok. Talks were taking place with TOTAL tomorrow.
3) There has been an increase in the number of problems seen publishing
accounting records to the APEL database. Tickets are being raised where
we see problems. The storage accounting looks more stable
<http://goc02.grid-support.ac.uk/accountingDisplay/view.php?queryType=storage>
but we need to get every site to check the accuracy of figures being
published.
4) Some sites are no longer filling in the weekly ROC reports. It turns out
that this is not helped by the fact that reports are not editable over
the weekend. We will raise the matter within EGEE operations once
again. In reports available this week it seems BDII problems/timeouts
still contribute to a large number of site-observed errors. The current
EGEE operations response is to request information on top-level BDII
configurations from across the countries where they run and look at how
they are loaded. It was noted that technical knowledge and policy are
two separate issues. TD noted that we have two servers that can take
the load, but that the default position should be to point to BDII at
RAL. The local HEPsysman should be doing local user installation.
There was a discussion about installation being done by support staff
at CERN.
5) Many sites are unhappy about the prospect of deploying SL4 workarounds
using a tarball distribution or rpms not supported under YAIM. Some
sites intend to wait for a stable gLite version which also runs on
64-bit platforms.
6) Meetings: The Tier-2 board met last Friday via VRVS - see
http://www.gridpp.ac.uk/tier2/Tier2_16-02-07.txt.
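
To make the DNS-style VO naming in item 2 concrete, the following is a
minimal sketch of a check that distinguishes names such as
camont.gridpp.ac.uk from single-word VO names. The function name and the
pattern are assumptions for illustration only: the pattern follows general
hostname label syntax and is not taken from the EGEE requirement or from
YAIM.

```python
import re

# One DNS label: lowercase letters, digits and hyphens,
# neither starting nor ending with a hyphen (assumed rule).
_LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"

# A DNS-style name here means two or more labels separated by dots.
_DNS_STYLE = re.compile(rf"{_LABEL}(?:\.{_LABEL})+")


def is_dns_style_vo_name(name: str) -> bool:
    """Return True if name looks like a DNS-format VO name."""
    return len(name) <= 253 and _DNS_STYLE.fullmatch(name) is not None


for vo in ("camont.gridpp.ac.uk", "total.vo.gridpp.ac.uk", "atlas"):
    print(vo, is_dns_style_vo_name(vo))
```

A check of this kind only tests the form of the name; it says nothing
about whether the middleware components at a site can handle it.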
SI-4 LCG Management Board Report
---------------------------------
JG noted that sites would be reporting on allocations in the VO. The
league tables are a definitive source of information that can feed the
accounting and the Quarterly Reports. There would be no Management Board
meeting tomorrow.
SI-5 Documentation Officer's Report
------------------------------------
SB noted that with respect to Documentation, the old address for the CERN
wiki now gives a redirection link. SB noted that he will amend any
out-of-date links but that they may exist within the system for some time.
He is currently carrying out a complete review of the GridPP user web
pages to check for broken links and other changes.
REVIEW OF ACTIONS
=================
236.6 GP to summarise and circulate the LHCb model as a basis for discussion.
Ongoing.
245.2 JC to forward information on WLCG meeting to SP. Done. Item closed.
245.4 SP to send out an All Hands reminder to UKHEPGRID. Done. Item closed.
246.1 DB to produce draft of review documents. Done. Item closed.
ACTIONS AS AT 19.02.07
======================
236.6 GP to summarise and circulate the LHCb model as a basis for discussion.
247.1 SL to provide an updated version of ATLAS tests summary by end of week.
247.2 RJ to get further information from ATLAS regarding use of Grid for
testing of PANDA, and report-back.
The next PMB would be on Monday 26th February at 1.00 pm. The meeting
closed at 3.00pm.