Dear All,
Please find attached the weekly GridPP Project Management
Board Meeting minutes. The latest minutes can be found each week in:
http://www.gridpp.ac.uk/php/pmb/minutes.php?latest
as well as being listed with other minutes at:
http://www.gridpp.ac.uk/php/pmb/minutes.php
Merry Christmas and a Happy New Year, Tony
________________________________________________________________________
Tony Doyle, GridPP Project Leader Telephone: +44-141-330 5899
Rm 478, Kelvin Building Telefax: +44-141-330 5881
Dept of Physics and Astronomy EMail: [log in to unmask]
University of Glasgow Web: http://ppewww.ph.gla.ac.uk/~doyle
G12 8QQ, UK Video - IP: 194.36.1.33
________________________________________________________________________
GridPP PMB Minutes 240 - 18th December 2006
===========================================
Present: Tony Doyle, Sarah Pearce, Roger Jones, Stephen Burke, David Britton,
David Kelsey, Steve Lloyd, Tony Cass, John Gordon, Jeremy Coles, Peter Clarke,
Glenn Patrick, Andrew Sansum, Suzanne Scott (Minutes)
Apologies: Dave Newbold, Robin Middleton
4. F2F Planning
================
DB noted that if the result for GridPP3 were known, we could start the
process including milestones for 2007 and preparation for the Oversight
Committee. TD reported that until the Science Committee meet in early
January '07 we may not know the GridPP3 outcome. We may also not get an
'in principle' decision before the end of the year. TD asked whether a
10.30 am start, with the meeting commencing at 11.00 am, on the 18th at RAL
would be agreeable. The meeting would probably end around 5.00 pm. This
was agreed.
5. GridPP18 Planning
=====================
TD noted that the next F2F meeting was the one in March before the
GridPP18 Conference. The F2F would take place on 19th March - 10.30 am
for 11.00 am start. The Conference Agenda for Tuesday 20th and Wednesday
21st was as usual. TD noted that on Thursday 22nd a joint UB/DTeam
meeting was organised for the Access Grid Room.
1. Tier-1 Status Review
========================
AS provided the following report:
1) CPU delivery: Our load test is underway (14 days) with no major issues
found. This equipment is on track to be in production 1st week in
January.
2) Supplier One Disk Servers (delivery 1). Supplier updated the RAID
controller firmware last Tuesday and are now carrying out their 7 day
load test. We expect to commence our load test later this week and
provided all goes according to plan this will lead to this equipment
being available before Friday 19th January. However, this does involve
our load tests running over the Christmas period, when there is some
risk of disruption.
3) Supplier One Disk servers (delivery 2). This equipment was expected to
be delivered before Christmas, however Supplier was not happy to
deliver it until they were certain the problems with the previous
delivery were resolved. We are now planning on receiving delivery in
the second week of January after which they must run their 7 day load
test followed by our 28 day load test. This equipment is now scheduled
to reach production at the end of February - this is still in time to
meet our conservative schedule agreed with the UB for resource
availability.
4) Supplier Two servers. The problem with the drives being ejected by the
controllers is resolved (as far as we are able to tell). We have had no
drives ejected since load tests started on the 4th December. Although
we introduced a number of changes, it appears that the cause of our
difficulties was a drive firmware bug which has since been corrected by
the Engineers. It appears that the bug would manifest when certain
drive head activity patterns occurred and it would probably manifest on
any hardware combination. We are planning on deploying these servers
into production W/B 2nd January.
5) The CASTOR service is functioning and it will be declared a full
production service once Bonny Strong returns from leave early in
January. According to provisional UB planning figures we expect to
have spare disk capacity available during the period January-June 2007.
This would be an ideal time to complete the migration from dCache to
CASTOR and therefore I'd like to propose a provisional date for the
termination of dCache of 30th June 2007. This was agreed.
2. Networking across UK
========================
PC talked through the 2nd draft of his proposal for a GridPP Policy on T2
networking connectivity. It was noted that the first three paragraphs
provided background. It was agreed to alter point three to include: " ...
part of an overall plan agreed by UK Experiment Co-ordinators, individual
Tier-2 Management Boards, and the PMB ...". It was noted that point four
should be supervised by JC. TD noted that planning was being done by each
of the Experiment Co-ordinators in the UK and that JC may have to delegate
such monthly supervision. It was noted that point five refers to PC as
well as to Robin Tasker, but the wording should remain unchanged.
DB noted that when writing policy documents, in terms of presentation it
might be preferable to put what 'will' happen first of all then 'what not
to do' afterwards. JG agreed, and noted that this document would be good
for internal discussion but not a formal policy document. PC agreed to
amend the document accordingly - the order of points 1-5 would be changed,
and point 3 amended as above. TD asked where this document should be
placed following amendment. It was agreed that it should appear on the
Tier-2 Board page. PC will re-word and send to SL.
[note: PC sent an updated draft on Wednesday]
3. Planning for 2007
=====================
It was noted that a high-level starting point was required. DN had sent a
response to Jamie Shiers' draft table of events and targets - it was
agreed that more detail was required, but the list of activities provided
was useful. Metrics were also required and our own metrics and milestones
will need to be reviewed in order to reflect any high-level plan. Now
would be a good time to carry out a review into which metrics and
milestones would be most useful for 2007. It was noted that service
challenge plans would need to be integrated into such a review and all
things will be responsive to experiment plans. TD noted the variety of
things that it is possible to measure, but integration of these is
difficult. It was understood that T1 to T0 and T2 communication will be
difficult - there need to be metrics associated with scaling-up to
simultaneous response. DB noted that a review was required but there was
no way to relate what we measure to the plans provided by JS. TD noted
that this has to converge in the New Year - all the existing plans need to
be gathered into one webpage. It would require a volunteer to organise
this. It was noted that there was no co-ordination of the service
challenges' interaction and simultaneous operation of T0 and T2 to T1;
tests would therefore need to be scheduled. DB noted that the experiments
would need to be part of this planning. JG suggested that it was the
experiments that should be driving this forward. TD noted that we need to
build-up in a coherent way what needs to be done between now and June
2007. DB noted that LCG and the experiments need to be informed of where
our concerns lie so that they know what to test. TD noted that it was
unlikely that this would converge by the F2F and Tier-1 Board meetings but
we need to know what will happen between January and June '07. This
required three actions on the Experiment Co-ordinators to look at how
things project on the UK - this needs to be built-in to the F2F Agenda in
order to be able to discuss actual documents.
STANDING ITEMS
==============
SI-1 News Items and Meeting Dates
----------------------------------
SP reported that a new version of the GLUE schema draft was now finished
and was being sent to Stephen for comment prior to circulation. This is
likely to be published later this week and will be flagged for
International Science Grid. The text from Marco LaRosa was waiting for a
draft to be done on half-term monitoring; and an item on Christmas might
be forthcoming. It was noted that International Science Grid are running
an item on the Brokering Meeting by Neasan O'Neill.
SI-2 Production Manager's Report
---------------------------------
1) JC reported on the email request from Hannah Cumming of Total UK.
As GridPP were the main providers to the EGEE Infrastructure within
the UK, she had asked whether there was a possibility for Total to use
the GridPP infrastructure for a limited set of tests. This would
involve temporary access by way of a generic VO. The PMB agreed that
Total could be enabled on the GridPP
infrastructure, but TD noted that the results that they hope to achieve
need to be defined so that achievements are measurable - this is a
useful mechanism for industry. DN noted that a news item and poster
would be good but it was important to have a more accurate idea of how
they intended to use the Grid, the timescale involved, and that the
outcome would provide posters and a news item. It would also be useful
to get feedback from Total on their use and experience of the Grid.
It was asked whether Total could carry out a dedicated test in the New
Year. It was noted that summer '07 would be more difficult in terms of
space. It was agreed that JC would respond positively to Total based
on the provisos discussed.
2) JC noted that most people were going to the January workshop and
would be accommodated in the hostel under the block booking - the
status needs to be confirmed. TC noted that a list had been sent to
Mary Elizabeth but it was likely that duplicate bookings exist. JC
agreed to check the situation and update the list and dates required.
TD asked whether the hostel would contact individuals directly. This
was unlikely, therefore TC would hand over any information to JC. JC
would co-ordinate the hostel and respond to individuals.
3) It was noted that several sites have yet to complete the second round
of site-to-site transfers. No additional requests (over that of
Edinburgh-Glasgow case) have been received for dedicated network
connections.
4) It was noted that no significant new problems had been reported at
recent deployment meetings - there were however some issues with Biomed
launching many jobs leading to proxies expiring before jobs could run.
ATLAS has caused problems for Glasgow by each job trying to install the
ATLAS software in the /home - WN disks fill up and later ops tests fail
(some workarounds such as quotas to be implemented). There was clear
evidence that many users do not select the appropriate queue for their
job, leading to many being killed off by the batch systems.
5) JC reported that there is a reluctance to try to tackle the CPU:Disk
ratio issue (currently weighted too much towards CPU) during new
procurements until more use is made of the Tier-2 disk - otherwise
warranties will start to expire just as the disk really gets used. TD
noted that we can account properly for storage once the storage
accounting pages are there for checking. It was noted that plans for
spring and summer '07 will have more disk usage. TD noted that we need
to encourage use of disk at present. The PMB agreed that JC should
remind sites to build up disk resources in preparation for 2007 requirements.
SI-3 LCG Management Board Report
---------------------------------
It was noted that there was nothing to report.
SI-4 Documentation Officer's Report
------------------------------------
SB noted that the new version of the User Guide should be available in the
New Year.
AOCB
====
TD reported that at the NGS Board Meeting on 13th December, convergence
with GridPP had been discussed. There had been progress on the Glasgow
side, but the NGS contact was ill at present. Technically, Glasgow were
capable of doing what NGS currently does. The NGS Board discussed the
roadmap for the coming year - the resource broker was to be deployed as
part of planning, and VOMS would be integrated into the roadmap planning.
NGS were dealing with OMII and gLite, involving planning with a greater
degree of convergence regarding the future of EGEE. Further meetings were
taking place this Friday (22nd) and 10th January '07.
REVIEW OF ACTIONS
=================
234.1 PC was continuing to inquire on the use of Grid papers in the RAE.
Ongoing.
235.1 It was noted that further text was required from CMS, but
essentially the information was ready to be forwarded to DB for the
Quarterly Reports. Action closed.
236.3 It was noted that funding for a CMS Grid workshop at Bristol in
January '07 had been approved in principle by the PMB - further
information was awaited from DN. Ongoing.
236.4 DN to summarise and circulate the CMS model as a basis for
discussion. Ongoing.
236.5 RJ to summarise and circulate the ATLAS model as a basis for
discussion. Item closed.
236.6 GP to summarise and circulate the LHCb model as a basis for
discussion. Ongoing.
It was noted that these model summaries should be short and succinct: a
one-page mapping of the model onto UK infrastructure, showing how the model
can be applied and what that would involve. It was noted
that RJ circulated information regarding ATLAS during the meeting. GP
would be able to provide info early in the New Year.
237.1 TD, in conjunction with AS and JG, would define the Tier-1 manager
role and add to the PMB page in the New Year. Ongoing.
237.4 TD and SL (with input from DB) to write a 1 page proposal for a site
readiness review in March/April 2007. This was ongoing but a document
would be ready for the F2F meeting and the item would be added to the
Agenda.
237.5 TC had now block booked rooms in the hostel for the WLCG
collaboration workshop. Item closed.
238.2 JC to organise a 'forward-look' at the connectivity of the Tier-2
sites - this was to be seen in the context of PC's document but existing
network information from Tier-2s was required: a succinct summary of
connections should be provided. The 'forward look' should include
information for '07 and into mid '08. TD noted that networking needed to
be sorted out in 2007 and it would therefore be good to know what the
plans are: for example, is an individual site upgrading from a 1 to a 10 gig
link?
239.1 PC's document on networking had now been generated. Action closed.
ACTIONS AS AT 18.12.06
======================
234.1 PC to inquire on the use of Grid papers in the RAE.
236.3 DN to circulate further information to the PMB regarding funding for
a CMS Grid workshop at Bristol in January '07.
236.4 DN to summarise and circulate the CMS model as a basis for discussion.
236.6 GP to summarise and circulate the LHCb model as a basis for discussion.
237.1 TD, in conjunction with AS and JG, to define the Tier-1 manager role
and add to the PMB page.
237.4 TD and SL (with input from DB) to write a 1 page proposal for a site
readiness review in March/April 2007 - item to be added to the F2F Agenda.
238.2 JC to organise a 'forward-look' at the connectivity of the Tier-2
sites.
240.1 PC to amend draft GridPP Policy Document on T2 networking
connectivity and forward to SL.
240.2 JC to check and co-ordinate the block booking at the hostel, and
contact individuals direct regarding their reservations.
240.3 JC to remind sites to build-up disk resources in order to cope with
spring/summer '07 requirements.
The next PMB meeting would take place on 8th January 2007.
The meeting closed at 2.40 pm.