Dear All,
Please find attached the GridPP Project Management Board Meeting minutes
for the 396th meeting.
The latest minutes can be found each week in:
http://www.gridpp.ac.uk/php/pmb/minutes.php?latest
as well as being listed with other minutes at:
http://www.gridpp.ac.uk/php/pmb/minutes.php
Cheers, Dave.
--
________________________________________________________________________
Prof. David Britton GridPP Project Leader
Rm 480, Kelvin Building Telephone: +44 141 330 5454
Dept of Physics and Astronomy Telefax: +44-141-330 5881
University of Glasgow
G12 8QQ, UK
________________________________________________________________________
GridPP PMB Minutes 396 (09.08.10)
=================================
Present: Dave Britton (Chair), Tony Doyle, Dave Colling, Sarah Pearce, Roger Jones, Dave Kelsey,
Jeremy Coles, Glenn Patrick, Andrew Sansum
(Suzanne Scott - Minutes)
Apologies: Steve Lloyd, Tony Cass, Robin Middleton, John Gordon, Pete Clarke, Neil Geddes
1. ALICE slots at RAL
======================
AS had issued a summary. The team had proposed to implement a limit on ALICE jobs which
could be effected in various ways - they needed to increase occupancy and leave headroom. DB
advised that the algorithm might depend on the characteristics of the job. DC asked if there was
scope for giving them a shorter time slot that was just above their quota. AS advised that they
had to ensure that the farm was responsive to ongoing demands, therefore allocating a certain
number of slots for different job lengths meant they could balance the workload. DB suggested
allowing ALICE to fill up to 25% then incrementally submitting jobs, but to stagger them up to
90-95% of the farm. This was agreed in principle - a balance had to be maintained that meant a VO
didn't block the farm, but also that it didn't sit idle. The Tier-1 Production Team would progress
the matter based on the guidance that we needed to find a mechanism to run the farm more fully
with low priority work (when high priority was not present) and yet remain responsive to rising
high priority workload.
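
One possible reading of DB's suggestion is sketched below, purely as an illustration. The function name, its parameters and the 5% step size are assumptions and do not come from the minutes; only the 25% base share and the 90% ceiling (the lower end of the 90-95% range) are taken from the discussion. It is not the Tier-1 batch configuration.

```python
def next_low_priority_cap(total_slots, high_priority_demand, current_cap,
                          base_percent=25, ceiling_percent=90, step_percent=5):
    """Return the low-priority slot cap to apply in the next scheduling cycle.

    Hypothetical helper: all names and the step size are illustrative.
    """
    base = total_slots * base_percent // 100          # share the VO may always fill
    ceiling = total_slots * ceiling_percent // 100    # occupancy that leaves headroom
    step = max(1, total_slots * step_percent // 100)  # assumed size of each increment

    # Slots left under the ceiling once high-priority demand is served,
    # but never less than the base share.
    allowed = max(base, ceiling - high_priority_demand)

    if current_cap < allowed:
        # Spare capacity: raise the cap one step at a time (staggered growth).
        return min(current_cap + step, allowed)
    # High-priority demand has risen: drop straight to the allowed level so
    # that no new low-priority jobs start while running ones drain.
    return allowed


if __name__ == "__main__":
    cap = 250  # start at the 25% base share of a 1000-slot farm
    for demand in [0, 0, 0, 700, 700, 100]:
        cap = next_low_priority_cap(1000, demand, cap)
        print(f"high-priority demand {demand:4d} -> low-priority cap {cap}")
```

The cap rises gradually so the farm does not sit idle, and falls at once when high priority work appears, so the farm stays responsive; running low priority jobs are assumed to finish naturally rather than being killed.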
2. Minor issues
================
- digital curation study: an email had been received and TD had also circulated a link. SP advised
that this issue was something that particle physics was looking at. DK thought they were asking
interesting questions and it would be good to be part of it, but he was unsure what it would
actually mean in practice. It was agreed that SP would respond positively but would ask what
exactly was expected of us - more detail was required. TD commented that the curation people
would probably have to work with RAL as an activity. SP would advise on the response, once
received.
ACTION
396.1 SP to contact the digital curation people, respond positively to their enquiry, but ask what
exactly was expected of us - more detail was required. SP to report back to the PMB.
- AHM: SP asked whether anyone from GridPP would be attending. Three places were available,
and there would be a joint stand with NGS. It was noted that the AHM clashed with the EGI
Technical Forum. SP would email ukhepgrid and check if anyone was going.
- GridPP26 Sheffield: DB asked if there was any problem with the proposed dates: 18-21 April
2011. There was no change. DB reported that work had commenced in relation to booking and
arrangements.
3. GridPP4 Announcement
========================
DB reported that a letter had been received from Tony Medland of STFC. The outcome was
relatively good in the context of the larger picture of cuts which were currently being imposed
across the UK. SP had done an analysis of the outcome and had circulated this. DB advised that he
had contacted Tony Medland for clarification of some of the details. DB had asked whether we
could move Capital to Resource but the reply was 'no'. DB asked for comments from the PMB on
the GridPP4 outcome. DK commented that a late announcement from STFC would cause issues for
staff contracts. Neither DK nor AS had heard anything official from within STFC. DB noted that we
needed to publicise the result of this very quickly now, and he would ask Tony Medland if any
other information would be forthcoming before January 2011. A few points were raised by the
PMB which DB would discuss with TM. Staff posts were discussed. TD pointed out that the
Collaboration Board should have sight of the outcome and have the timeline which defined
decisions, preferably before GridPP25 at Ambleside. DB noted that he would have preferred to
have had a F2F meeting about this prior to informing the CB, but there was no time, and he did
indeed need to make an announcement at Ambleside. DB would speak to SL and would draft a
covering email along with the announcement, which would be circulated to the CB. He would
hopefully do that this week. DB thought that in the context of the general funding backdrop, the
outcome was relatively good, and he congratulated everyone on their hard work and inputs over
the past months. TD noted SLA uncertainty and that this still had to be dealt with. SP would do a
new spreadsheet with the new costings, and this would be provided to TM in order to tidy up
some of the detail. TM had advised that the costings might change again, if the overhead rate
changed. TD also noted separate pressure on all universities to reduce fEC. DB noted that if fEC
were to be reduced, this might have an effect on some of the posts in terms of hiring dates etc - he
would speak to TM about this as well. DK advised that in addition, there was also a pay freeze in
place over the next two years, and inflation had reduced. TD noted inflation down from 3% to
1.5% annually within universities, which might enable a re-costing of posts.
ACTION
396.2 DB to speak to SL and draft a covering email along with the announcement, which
would be circulated to the CB. He would hopefully do that this week.
396.3 SP to do a new spreadsheet with the new costings, and this would be provided to TM in
order to tidy up some of the detail.
396.4 ALL: to think through the announcement and come back to DB with any issues/proposals
as soon as possible.
STANDING ITEMS
==============
SI-1 Tier-1 Manager's Report
-----------------------------
AS reported as follows:
Fabric:
1) FY09 procurements:
- Second lot of FY09 disk servers previously failed our acceptance test. Now moving through our acceptance test
with no problems. Expected to complete successfully next week.
2) FY10 procurements
- Responses to disk tender being evaluated. Evaluation due to be completed today but
overrunning. Delay caused by some suppliers being unable to complete (new) disk server stress
test on schedule.
- CPU PQQ stage is closed. ITT has been issued.
3) Working with ATLAS on testing ATLAS software server solution.
4) Disk server crashes are becoming a concern. Recent incident on an ATLAS disk server (1st
August) has caused ATLAS considerable inconvenience and has led to us being blacklisted while
the issue is addressed. Possible firmware problem?
Will provide a summary of the general situation at the GridPP25 meeting.
Service:
1) Summary of operational issues is at:
https://www.gridpp.ac.uk/wiki/Tier1_Operations_Report_2010-08-04
2) VO testing of CASTOR 2.1.9 underway. One problem encountered: checksum support does not
work with "external" version of GRIDFTP we run. Only supported on newer "internal" GRIDFTP
which we have not certified in our own testing. We are considering our response to this but are
likely to recommence our own certification using the internal version. Overall - progress towards
CASTOR 2.1.9 is still good.
3) The SL4 batch service has ended.
SI-2 ATLAS weekly review & plans
---------------------------------
RJ reported that things were quiet at present, but there had been issues at the Tier-1 including
disk reliability, and issues at the Tier-2 re database distribution. AS advised that RAL could
possibly help with the latter, and he would give feedback to Alastair.
SI-3 CMS weekly review & plans
-------------------------------
DC was currently on leave but he reported that things were fairly quiet - the new CASTOR testing
and xrootd were going well. At the Tier-2, Brunel had storage problems; Bristol were trying to get
support from Southgrid to run CMS services - it was likely they would become a Tier-3 eventually.
SI-4 LHCb weekly review & plans
--------------------------------
GP reported that things were quiet; 2 disk servers had recently failed, and an email had been
circulated.
SI-5 Production Manager's Report
---------------------------------
JC reported as follows:
1) The GridPP Nagios service at Oxford was unavailable over much of
the weekend due to network problems. The outage may affect UK
availability figures and was visible in GridMap. For the period of the
outage there were no alarms in the regional dashboard. An incident
report has been requested and we will look again at failover options
(a request was already with the developers).
2) Sites that fail to meet the EGI
availability/reliability targets for any given month are to be ticketed. The site is
expected to reply to the ticket with an explanation of what problems
were faced and the current situation. This approach replaces the query
coming to the regional managers and being followed up via a wiki page.
3) At the end of July the site UKI-LT2-UCL-CENTRAL was set to
“closed” in the GOCDB. The WNs of the site are now available under
UKI-LT2-UCL-HEP.
SI-6 LCG Management Board Report
---------------------------------
It was reported that there had been no MB - the next one was due to take place on 24th August,
then on 7th September.
SI-7 Dissemination Report
--------------------------
SP reported that Neasan O'Neil had posted a news item on theory work.
REVIEW OF ACTIONS
=================
366.8 AS to confirm that the Tier-1 proposes to use Tape-based
storage in the period 2011 - 2015. Ongoing.
384.1 AS to provide a plan for how to deal with the ADS
Service, and bring back to the PMB. AS would circulate a summary. Done, item closed.
384.6 TD/JC to take the lead on the 'GridPP to NGI' document that addresses the forward-moving
technical and other issues from a GridPP perspective - a skeleton outline would be circulated by JC
based on the NGS paper.
PG noted this was on the Agenda of the d-Team meeting for comment. PG to send any inputs to JC
& TD. Ongoing.
394.1 ALL: concrete suggestions for topics were required re virtualisation, multicores etc (to
DB). This session needs to dovetail with LHC Data Management, which is a separate issue. Done,
item closed.
394.2 SP to contact STFC and invite a speaker to come and give a talk for the KE/EI session. Done,
item closed.
395.1 SP to put 'CERN@School' on the PMB Agenda for Ambleside. Done, item closed.
ACTIONS AS AT 09.08.10
======================
366.8 AS to confirm that the Tier-1 proposes to use Tape-based
storage in the period 2011 - 2015. AS had costings and would provide a summary.
384.6 TD/JC to take the lead on the 'GridPP to NGI' document that addresses the forward-moving
technical and other issues from a GridPP perspective - a skeleton outline to be circulated by JC
based on the NGS paper.
396.1 SP to contact the digital curation people, respond positively to their enquiry, but ask what
exactly was expected of us - more detail was required. SP to report back to the PMB.
396.2 DB to speak to SL and draft a covering email along with the announcement, which
would be circulated to the CB. He would hopefully do that this week.
396.3 SP to do a new spreadsheet with the new costings, and this would be provided to TM in
order to tidy up some of the detail.
396.4 ALL: to think through the announcement and come back to DB with any issues/proposals
as soon as possible.
INACTIVE CATEGORY
=================
359.4 JC to follow up dTeam actions from the DB, as follows:
----------------------------
05.02 JC/dTeam to try and sort out CPU shares and priority
resources, at Glasgow first (perhaps by raising the job priority in Panda).
----------------------------
The next PMB would be a F2F meeting, taking place at GridPP25, Ambleside, on Monday 23rd
August. Further meetings were as follows:
30 Aug - cancelled (post F2F)
23 Aug - PMB F2F @ Ambleside
16 Aug - cancelled (pre F2F)