Dear All,
Please find attached the F2F and weekly GridPP Project Management
Board Meeting minutes. The latest minutes can be found each week in:
http://www.gridpp.ac.uk/php/pmb/minutes.php?latest
as well as being listed with other minutes at:
http://www.gridpp.ac.uk/php/pmb/minutes.php
F2F minutes can be found directly at:
http://www.gridpp.ac.uk/pmb/minutes/060110.txt
Cheers, Tony
________________________________________________________________________
Tony Doyle, GridPP Project Leader Telephone: +44-141-330 5899
Rm 478, Kelvin Building Telefax: +44-141-330 5881
Dept of Physics and Astronomy EMail: [log in to unmask]
University of Glasgow Web: http://ppewww.ph.gla.ac.uk/~doyle
G12 8QQ, UK Video - IP: 194.36.1.33
________________________________________________________________________
GridPP PMB Minutes 200 - 23rd January 2006
===========================================
Present: John Gordon, Sarah Pearce, Tony Doyle, Dave Britton, Robin
Middleton, Steve Burke, Tony Cass, Steve Lloyd, Roger Jones, Peter Clarke,
Dave Kelsey, Jeremy Coles
Apologies: Dave Newbold
1. Allocation of PMB numbers to documents [TD]
===============================================
It was agreed to retitle the documentation document for the Oversight
Committee from 'User Web Pages' to 'Documentation Report', numbered 67.
Others were allocated as below:
* Executive Summary [68] PMB
* Project Map [69] DB
* Resource Report [70] DB
* LCG Report [71] TC
* EGEE Report [72] RM
* Deployment Report [73] DK
* Middleware/Security/Network Report [74] RM
* Applications Report [75] RJ
* User Board Report [76] DN
* Tier-1/A Report inc. Tier-1/A procurement methods [77] JG
* Tier-2 Report inc. Year 1 outturn [78] SL
* Dissemination Report [79] SP
* Documentation Report [67] SB
In addition
* CERN and Tier-2 Operations [83] TC
* Performance Monitoring [82] JC
* (Upper) Middleware Planning [81] RM,RJ
* Experiment engagement questionnaire (v2) [80] DN
* Grid for LHC exploitation [for reference]
It was agreed that the ~final documents would need to be circulated in two
weeks' time (Monday 6th February).
Face 2 Face minutes were circulated shortly after the meeting expanding
upon this. The overview would need to be completed by 13th February and
everything would need to be sent before Wednesday 15th February, a week
before the Oversight Committee.
2. Quarterly Reports [DB]
=========================
All the quarterly reports now needed to be completed; the deadline had
passed. The status so far was sought. RM had all those for M/S/N. RJ had
half of those for the applications area. The Tier-1 report was not
available, and it was agreed that a reminder would be sent advising that
the deadline was past and that the reports were urgently required. The
Tier-2 hardware report was also not all in, and again a reminder would
need to be sent.
3. Decisions from Tier-1/A Board [DK]
=====================================
DK reported on the recent meeting.
Agenda item 1 - Tape service plans
----------------------------------
The board agreed that:
* the proposed move to Castor is the right approach, while agreeing
that the timescales are tight and that this does give rise to some
risk;
* we will buy just one T10K drive now for testing (it is more
cost-effective to delay purchase of drives until they are really
needed);
* we will not buy T10K tapes now, apart from a small amount for testing
and service challenge needs (as they are likely to be much cheaper next
year);
* we will not purchase any more 9940 drives or tapes;
* RAS (with reference to the UB) is authorised to purchase up to 100 TB
of T10K tape. T10K drives needed only for service challenges will be
borrowed. Up to 2 additional T10K drives in the second half of FY 06/07
(if these are needed by the experiments) would be approved following
review by the Tier-1 Board at the October meeting;
* the ongoing maintenance and operations costs of the DataStore after
the end of GridPP2 will be covered as part of the bid to PPARC for the
Tier-1/A service;
* the UB will in future consider tape bandwidth requirements as well as
capacity.
Agenda item 2 - 2005 Outturn
----------------------------
The board was happy to see that the job slots had been very nearly full
since August 2005, and agreed that there was little room for improvement
here.
Agenda item 3 - Tier 1/A Requirements, Planning, Allocations and MoU's
----------------------------------------------------------------------
The board discussed the various issues at length.
There is very little flexibility because disks are more expensive this
year than last year's planning figures (foreseen price drops from the
availability of higher-capacity drives have not arrived soon enough) and
CPU is more expensive than planned. There was uncertainty about the
performance of dual-core processors. (Note: new information obtained
after the meeting has resulted in the use of a factor of 2 for AMD
Opteron processor performance, dual vs single core, for planning
purposes.) The FY 05/06 purchase was delayed by PPARC following the July
05 OC meeting, meaning that the increased disk capacity will not be
deployed until Q3 2006.
Planning in April 2005 concluded that there would be a severe lack of
resources definitely in 2008 and probably in 2007. The problems noted
above have brought this crisis forward to 2006. There are just not
enough funds to meet all of the requirements. Difficult decisions will
have to be made by both the running experiments and the LHC
experiments.
Given this very difficult situation, the board agreed that:
* BaBar should manage within its current disk allocation of 95 TB for
the first two quarters of 2006. Others will therefore be squeezed by
~20 TB compared to the current UB allocations. The UB will have to
reconsider the LHC and other experiments' disk allocations for these
two quarters;
* no decisions are made at this point regarding allocations for Q3 and
Q4 of 2006. These will need to be looked at again following agreement
of the purchasing plan for the next purchase (early in FY 06/07).
The board also noted the large amount of disk available and foreseen at
the Tier-2 centres, and encourages all experiments to make use of it.
AOCB
----
The 2006 figures are urgently needed for the signing of the LCG MoU.
PPARC have delayed signing until after this board meeting.
DB will update (and circulate) the planning figures for 2006 on the
basis of the latest information within the next few days.
4. Year 1 Outturn Spreadsheet Development [SL]
=============================================
SL had prepared spreadsheets on the GridPP Resources used by LCG in 2005.
The data included Tier-2 and Tier-1 CPU Use by all Experiments adopting LCG.
It was noted that:
The available KSI2K are taken from the Tier-2 Quarterly Reports
(LCG Resources);
The used KSI2K Hours are taken from the GOC (GridPP Accounting);
Use at Cambridge is not recorded because they use Condor not PBS;
Disk Data are taken from the GridPP Disk status webpage;
The allocated KSI2K are taken from the MoU numbers.
SL expressed doubts about the efficiency of the Accounting systems
currently used. They wouldn't give absolute numbers but would indicate
the trend over the year.
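The outturn spreadsheets combine a capacity figure (available KSI2K) with
an accounted usage figure (used KSI2K-hours). As an illustrative sketch
only (the function name and all figures are invented for demonstration,
not taken from SL's actual spreadsheets), the fractional utilisation for
a site over a period can be estimated as:

```python
# Illustrative sketch: estimate fractional CPU utilisation of a site
# from its available capacity (KSI2K) and its accounted usage
# (KSI2K-hours), as combined in the outturn spreadsheets.
# All figures below are made up for demonstration purposes.

def utilisation(available_ksi2k: float, used_ksi2k_hours: float,
                period_hours: float) -> float:
    """Fraction of the theoretical maximum KSI2K-hours actually used."""
    max_ksi2k_hours = available_ksi2k * period_hours
    return used_ksi2k_hours / max_ksi2k_hours

# Example: a site offering 100 KSI2K over a 90-day quarter (2160 hours)
# that delivered 108,000 KSI2K-hours ran at 50% utilisation.
print(utilisation(100.0, 108_000.0, 90 * 24))  # 0.5
```

As noted above, such figures are best read as trends over the year rather
than absolute numbers, given the doubts expressed about the accounting
systems.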
NEW ACTION 200.1: JC to summarise the accounting problems.
It was suggested that he might start with discussions with Dave Kant and
Greig Cowan to enquire whether they knew the sources of the problems.
5. Additional GridPP meeting in November 2006 [PC]
=================================================
After discussions it was agreed that a meeting would be held in late
October, the dates being looked into were 30th and 31st October and 1st
November 2006. It was to be held at NeSC. PC would start looking into
room availability.
Note: dates now fixed at
http://www.gridpp.ac.uk/meetings/
17th GridPP Collaboration Meeting, NeSC, 1-2 November 2006
(with PMB meeting prior to this)
6. EGO Questionnaire [RM]
=========================
NG is collating a UK/I response to the questions posed by EGEE concerning
what follows EGEE phase 2. At present this goes under the heading of a
European Grid Organisation - EGO. There is an EGEE PMB workshop on EGO
next week near CERN and the inputs from the various federations will be
summarised there.
GridPP inputs to this should be sent to NG and RM before noon tomorrow.
7. New Travel Guidelines and Forms [RM]
=======================================
There is a new travel procedure document which is to be adhered to.
See
http://www.particlephysics.ac.uk/research/travel-and-claim-forms.html
for the new forms.
STANDING ITEMS
==============
S1-1 ALL to provide SP with any news items, and any confirmed or
provisional dates of conferences and/or meetings.
DK noted that the next Tier 1/A meeting was scheduled for 11th May 2006.
A PMB Face 2 Face was proposed for 12th May (or, possibly, 10th May).
There should be a press release shortly on Service Challenge 3.
It was hoped that the signing of LCG MoU could also be a news item.
S1-2 Production Manager's Weekly Report of Issues
JC reported that:
1) The T0-T1 throughput tests have been ongoing for the last week. RAL
has successfully participated. Our rate has been fair, typically
averaging around 100 MB/s. The target of 150 MB/s has been difficult to
achieve with bandwidth capacity being limited at CERN. As a result this
week there will be a further test to gain an indication of individual
institute peak sustained rates. Later in the week the use of srm-copy
will be possible and may result in an improvement of throughput rates.
2) The weekly EGEE report, which can now take input from sysadmins, now
has a more reasonable editing period, and for this week's report the
response was good. Sites which responded were: Grid Ireland (for all
sites!), LeSC, Bristol, Glasgow, Edinburgh, Brunel, Durham, Liverpool,
Oxford, RHUL, RAL, UCL-CCC, Lancaster, IC and QMUL. No information was
(or has in the past been) entered for the following sites which had
problems during the period: UCL-HEP, Sheffield and Cambridge. Birmingham
usually responds but not this week and Manchester was down for
maintenance.
3) The main focus of the deployment team last week was on the testing of
a pre-release of 2.7.0. Some minor problems were found and fed back, but
there were no major concerns. The hope is for a 2.7.0 release next week.
The requested upgrade period will be 3 weeks as before.
4) Birmingham has been leading our efforts to be a stable part of the
Pre-production Service thanks to the efforts of Yves Coppens. Imperial
is also now joining. gLite 1.5 was released at the weekend.
5) The Helpdesk upgrade mentioned at GridPP15 has gone ahead today. This
will enable more automated ticket exchanges with GGUS and other ROCs,
and also the closing of ticket copies in other helpdesks.
6) The deployment team will send one person to the forthcoming Ticket
Process Management course at CERN at the beginning of February. As a ROC
it is expected that we take part in this EGEE wide activity.
7) Internal GridPP transfer tests have been ongoing. More sites are now
hitting the target rate of 300 Mb/s but we have had to limit tests with
the Tier-1 due to the SC3 throughput test reruns. Generally sites have
been good at getting involved. The latest results and scheduled tests
can be seen in the Wiki:
http://wiki.gridpp.ac.uk/wiki/Service_Challenge_Transfer_Tests
S1-3 Management Board Report of Issues
JG to report next week
S1-4 Documentation Officer's Weekly Report of Issues
SB noted no issues this week.
It was agreed that if Sarah and Dave could log on to VRVS.org as a test
and it worked, then TD would set it up for next week's meeting.
Next meeting 30th January at 1.00pm
REVIEW OF ACTIONS
=================
184.1: TC to write document "How CERN helps the small sites to install
and manage the LCG software".
- ongoing
184.3: RM and RJ to document Gap Analysis
- ongoing
184.4: DN to document UB questionnaire issues
- ongoing
187.1: JG to prepare combined actions list from GridPP14 meeting
- done
197.1: SL to determine realistic estimate of July 2006 hardware for T2s.
- ongoing
197.2: SL to review TC's document on "How CERN helps the small sites to
install and manage the LCG software"
- ongoing
ACTIONS AS AT 23RD JANUARY 2006
===============================
184.1: TC to write document "How CERN helps the small sites to install
and manage the LCG software".
184.3: RM and RJ to document Gap Analysis
184.4: DN to document UB questionnaire issues
197.1: SL to determine realistic estimate of July 2006 hardware for T2s.
197.2: SL to review TC's document on "How CERN helps the small sites to
install and manage the LCG software"
200.1: JC to summarise the accounting problems.
GridPP PMB Minutes 199 - 10th January 2006
==========================================
Face to Face Meeting at RAL
----------------------------
Present: Tony Cass, Pete Clarke, Dave Newbold, John Gordon, Roger Jones,
Robin Middleton, Tony Doyle, Dave Kelsey, Stephen Burke, Neil Geddes,
Dave Britton, Jeremy Coles, Steve Lloyd. By Gizmo/Phone: Sarah Pearce.
Apologies: Deborah Miller.
Experiments' Hardware Requirements
==================================
[OC Action 1 - GridPP to go back to the experiments to confirm their
requirements before the next tender exercise. (minute 5.7)]
DN reported that the issue was the lack of Tier-1 disk resources and the
usage of Tier-2 disk resources. DB noted that we had been back to the
experiments and the requirement is now 50% less. DN explained that the
new UB numbers are a pragmatic response to the shortfall and not a change
in the requirements. However, we cannot meet the MoU commitments to the
LHC experiments or BaBar due to lack of resources. Should we bring disk
spending forward? DN said his personal opinion was no, because we will
always have this problem, so bang per buck is more important. We have now
gone from under-use to over-use. We should concentrate on 2007. We still
need to go back to the experiments for their strategic requirements.
On the issue of Tier-2 disks, the feeling from the experiments is that
they are not yet reliable enough to use. We need to demonstrate that they
are reliable enough; they do not need tape backup. It needs the Tier-2s
to agree to provide more robust disk storage, and the experiments to
provide more detailed requirements. There was agreement that 'Durable'
means long term but not for ever; however, this can't be implemented at
the moment. Is there a practical way of using Tier-2 disk in the UK? This
question is put to the Deployment Board: how do we use Tier-2 disk? We
will probably have to start with particular sites per experiment (H1 have
successfully done this and are now expanding). It was agreed to start
with ATLAS, CMS and PhenoGrid at Lancaster, Edinburgh and Imperial, gain
experience and confidence, and then expand.
Action 199.1: JC to raise Tier-2 disk usage at Deployment Board.
Oversubscription of Resources
=============================
[OC Action 5 - GridPP to consider introducing a suitable level of over
subscription of GridPP resources (minute 6.1)]
DN explained that CPU is over-allocated in the sense of what the
experiments are told they can have, but this is rather meaningless on a
Grid. Over-allocation can't be done for disk, and the experiments don't
want it. Historically, experiments asked for a lot of disk and didn't use
it. At the Tier-1 there are sufficient disk servers to allocate each
server to a single experiment, and it is a lot easier not to share across
experiments, which means there is only one set of people to negotiate
with if a server is lost. It is not so obvious how to do this at the
Tier-2s.
Experiment Engagement Questionnaire Plans
=========================================
[OC Action 4 - GridPP to provide an update of the Experiment Engagement
Questionnaire for the next meeting (minute 5.19)]
[OC Action 10 - GridPP to review the information gathered by the
Experiment Engagement Questionnaire and consider the actions required to
make the outcome of future questionnaires more positive (minute 6.6)]
DN explained that there have been two such questionnaires, at the start
of 2005 and the middle of 2005, with similar results. They show a clear
lack of engagement. DN has gone round after the UB talking to individuals
to see how things have gone. For the small experiments (not H1/ZEUS)
there is no change; these should probably be portal users. For H1/ZEUS
there is a much better uptake. They are major users of the Tier-1 but are
not using data management tools, which needs to be investigated.
Mainstream BaBar are not engaged with the Grid: resources have been
allocated but there is not much progress. For LHC, things have
incrementally increased and are improving. Documentation is being
addressed. Data management is the issue; workload management is OK. Other
issues, documentation and Tier-1 contacts, are being addressed and
improving, but we are not seeing the full fruits yet.
There was a discussion of whether to continue with the questionnaires,
which are not viewed as being as useful as talking to people. Some of the
issues have been addressed, some not, e.g. portals for the small
experiments, who have no manpower to develop them. UKQCD are using their
own tools, which work well but are not transferable. We can make a list
of actions for the small experiments, but it needs manpower that no-one
has. The real problem is BaBar. What can we do about BaBar not using the
Grid? BaBar have been asked what they will do with the Grid in the coming
year. For LHC one can define a list of actions, but it is not clear how
to support them, e.g. a request for more experiment support at the
Tier-1. The contacts list didn't happen, although the people are defined
and being effective, but overloaded. We cannot operate the Tier-1 in 2007
with this level of manpower. There was a discussion of middleware versus
experiment support. There will be a list of issues in a couple of weeks.
Gap Analysis
============
[OC Action 9 - GridPP to undertake a gap analysis of baseline services
needed by the experiments (minute 6.5)]
RJ/RM reported. A document exists. An issue is whether there is going to
be any fallout from this analysis; the answer is probably yes. We are
putting our faith in the LCG and the experiments' VO boxes. This may not
satisfy the committee. More effort is needed in upper middleware and
operations for the experiments. If there are gaps, what can GridPP do
about them? We need to put pressure on someone to fill each gap. VO boxes
currently provide ad hoc solutions; these are pro tem solutions and
should become part of the middleware stack. There was a discussion of
data management scenarios. What's the gap and how are we going to fill
it? Much is being done at CERN. What does GridPP do if there is no
solution? It is clearly not just a UK problem. The POB recognises the
problems. Service challenges are addressing some of the issues, although
engagement of the experiments in the UK is maybe not high enough (it was
good to start with). Sites have been well engaged. There are problems
when milestones slip. Service challenges don't really address the gaps.
The RJ/RM document answers some of the questions about high-level data
management services. We should recognise that some things aren't going to
be provided and experiments will have to do them. This is no problem at
the end of the day, as experiments will do it, but it duplicates effort.
There are not thought to be any large gaps here (showstoppers) for the
major clients, but some things need more effort and have to be developed
by each experiment. Once again it was asked why BaBar don't use them, and
the conclusion was that they don't really need them.
Upper Middleware Planning
=========================
This agenda item referred to the "additional" OC document in the proposed
list of documents but it was agreed that this is the Gap analysis
document discussed earlier. There was a discussion of the forthcoming
Rolling Grant and Tier-1/2 call. We need to flag to the OC that PPARC
needs to define this soon.
Value Added
===========
[OC Action 11 - GridPP to identify the top 5/6 added value items that
GridPP had delivered (minute 6.7)]
DB reported that we had 20-plus items at Birmingham and it was agreed
that we need to pick 5/6 highlights. GridPP has added value as opposed to
giving money piecemeal to experiments. TC summarised it as follows: "We
created a strong GridPP identity (picking up the PPARC e-Science lead)
which, together with a founding contribution to the LCG Project at CERN,
produced clear UK leadership in critical middleware areas, notably Grid
security (and Information & Monitoring systems). Within the UK, the
strong GridPP identity led to a well organised and coordinated
Tier-1/Tier-2 structure and thus the UK's largest Grid, which is
emphasising the UK's contribution to computing for the LHC experiments
(e.g. a plot with the UK contribution to LHCb & CMS MC production) and is
open to other sciences in the UK."
Tier-1 Issues
=============
TD reported that we hope to know in the next day or so the current
procurement costs of disk and CPU, to meet the minimum ~200 KSI2K with
the rest spent on disk. BaBar have been requested to make a case to the
Tier-1/A Board as to how to meet their requirement of 90 TB by a
realistic date. Can they use tape at the Tier-1 or disk at the Tier-2s?
If they cannot get more Tier-1 disk, their plan is to migrate the data
currently stored here to Italy, and this might have a physics impact as
the data is predominantly used. The estimated cost of 90 TB is 150k. It
was asked whether the PMB would sign off 150k for BaBar's use if it were
part of a plan to use the Grid; otherwise the money would be spent in
2007. The BaBar MoU doesn't specify when in the year it is delivered, but
BaBar UK want to specify this. TD said there is 1.5m pounds in the 2007
budget; this represents a movement forward of 10%. The 90 TB now would be
250 TB in 2007, which is about the same as CMS's allocation. There was a
long discussion. The conclusion was that the PMB did not wish to buy
90 TB of disk for BaBar now, but if they had to then it should come out
of BaBar's 2007 allocation. TD would report this input to the Tier-1/A
Board.
Dissemination Items
===================
SP reported. She has been commenting on the draft strategy from Mike
Green's LHC Promotion Strategy Group and offering GridPP help. We were
turned down again by the Royal Society, this time jointly with AstroGrid.
The LHC Group has also submitted a bid and we may be able to piggyback on
that if they are successful. SP has been preparing for Mumbai, i.e.
screens and a stand. RM asked which flags we are flying; SP said all. She
is trying to find out if there is space for posters, flyers etc. SP said
it was time to kick-start the GridPP brochure project. TD said this is
tied to the added value discussed earlier. SP reported that the new
Events Officer, Neasan O'Neill, started on Monday. He will first look at
introductory material on the website and speak to Fergus and the QM
designers about magic cubes. He is also taking pictures. SP suggested it
was time to change the posters template, maybe for the IoP meeting in
April. This was agreed by the PMB. Neasan needs to talk to the designers
at QM. RM was volunteered, while out of the room, to provide a GridPP15
news item. SL asked about the Birmingham schools project. SP said this is
not yet set up but Pete Watkins will keep us informed.
OC Document Preparation
=======================
TD will allocate PMB numbers. The OC need the documents a week before the
meeting (22 February). Hence we need input by 6 February to sign off by
13 February. SL reported that he updated the web pages but not yet with
all the additional documents.
The proposed documents (as of 8 December 2005) are:
* Executive Summary [PMB]
* Project Map [DB] added value items
* Resource Report [DB]
* LCG Report [TC]
* EGEE Report [RM]
* Deployment Report [DK]
* Middleware/Security/Network Report [RM]
* Applications Report [RJ]
* User Board Report [DN]
* Tier-1/A Report inc. Tier-1/A procurement methods [JG]
* Tier-2 Report [SL] inc. Year 1 outturn
* Dissemination Report [SP]
* Documentation Report [SB]
In addition
* CERN and Tier-2 Operations [TC]
* Performance Monitoring [JC]
* (Upper) Middleware Planning [RM, RJ]
* Experiment engagement questionnaire (v2) [DN]
* Grid for LHC exploitation [for reference]
ACTIONS AS AT 10TH JANUARY 2006
===============================
184.1: TC to write document "How CERN helps the small sites to install
and manage the LCG software".
- ongoing
184.3: RM and RJ to document Gap Analysis
- ongoing
184.4: DN to document UB questionnaire issues
- ongoing
187.1: JG to prepare combined actions list from GridPP14 meeting
- ongoing
197.1: SL to determine realistic estimate of July 2006 hardware for T2s.
- ongoing
197.2: SL to review TC's document on "How CERN helps the small sites to
install and manage the LCG software"
- ongoing
Action 199.1: JC to raise Tier-2 disk usage at Deployment Board.