
UKHEPGRID Archives



UKHEPGRID@JISCMAIL.AC.UK



UKHEPGRID February 2011


Subject:

Minutes of the 415th GridPP PMB meeting

From:

David Britton <[log in to unmask]>

Reply-To:

David Britton <[log in to unmask]>

Date:

Mon, 21 Feb 2011 11:44:33 +0000

Content-Type:

multipart/mixed

Parts/Attachments:


text/plain (78 lines) , 110214.txt (376 lines)

Dear All,

Please remember to register for GridPP26 using the following link:

http://www.gridpp.ac.uk/gridpp26/index.html

Please find attached the GridPP Project Management Board Meeting minutes
for the 415th meeting.

The latest minutes can be found each week in:

http://www.gridpp.ac.uk/php/pmb/minutes.php?latest

as well as being listed with other minutes at:

http://www.gridpp.ac.uk/php/pmb/minutes.php

Cheers, Dave.

-- 
________________________________________________________________________
Prof. David Britton                          GridPP Project Leader
Rm 480, Kelvin Building                      Telephone: +44 141 330 5454
School of Physics and Astronomy              Telefax: +44-141-330 5881
University of Glasgow                 EMail: [log in to unmask]
G12 8QQ, UK
________________________________________________________________________

GridPP PMB Minutes 415 (14.02.11)
=================================

Present: John Gordon (Chair), Andrew Sansum, Steve Lloyd, Robin Middleton, Jeremy Coles, Pete Gronbech, Pete Clarke, Glenn Patrick, Dave Kelsey, Tony Cass (Suzanne Scott - Minutes)

Apologies: Dave Britton, Tony Doyle, Roger Jones, Dave Colling, Neil Geddes

1. Spend Plan
==============

DB had circulated an email regarding his discussion with Tony Medland. AS reported that he had done the Tier-1 Outturn Report and had increased the recurrent figure. Tony Medland had noted we should increase capital spend to our agreed limit. AS had a query re the recurrent: given the uncertainty in the information from SSC, Tony Medland had suggested spending half of the remaining figure, therefore AS needed to have the increase on recurrent approved. The other issue was that we needed to be clear that the capital and recurrent outturn being produced are accurate - this was difficult due to the new STFC SSC system at RAL, and AS needed to check all of the figures with the Finance section.

ACTION
415.1 DK to check on the correct total allocation figure for both capital and recurrent with Tony Medland.
415.2 AS to clarify the outturn forecast with RAL finance section and organise the spend.
415.3 PG to follow-up with sites re their Tier-2 hardware spend from GridPP3. It was noted that the Tier-2 hardware spend in GridPP4 was still unknown.
415.4 DB to summarise the GridPP4 Tier-2 hardware spend in preparation for an email to Tony Medland.
415.5 Re the JeS forms for the second half of GridPP4, DB to chase this up during the next month or so.

2. Security Policy and glexec
==============================

DK reported that this issue had arisen at the MB and the GDB - pilot jobs were still being run with no identity switching and this was in violation of Policy. The question was whether we extended the suspension of the policy. A detailed report had been given regarding sites' ability to identity-switch.
DK reported that the conclusion had been that we weren't there yet, so the Policy exclusion had been extended for a short while. The Tier-0 and Tier-1 should be ready to do the identity switching around March 2011, however the Tier-2 had not really been looked at, but it was asked that they be ready by 30th June 2011. DK asked if EGI/NGI could assist with this, glexec and identity-switching? Could JC work with EGI operations to take this issue forward? JC noted that we had started on this a while ago, and he could discuss it at dTeam, it might be appropriate for sites starting this now to use Argus.

ACTION
415.6 JC to bring up the issue of glexec and identity-switching at dTeam, Tier-2 sites to be ready by 30th June, it might be appropriate for sites starting to switch now, to use Argus.

3. Top-Level BDII plans
========================

AS reported that a couple of weeks ago he had been ticketed re volunteering to do top-level BDII for a global service, and had been given a detailed spec. He had responded yes, however discovered at the GDB recently that the plan was not as well developed as he had thought. There was no architectural plan as yet and the issue had not been well thought through. AS noted that clarification was required, as we couldn't sign-up for it at the moment until further information was provided.

4. Phenogrid Issues
========================

It was reported that Peter Richardson had asked for action in relation to problems pheno users had experienced with the grid recently. The main problem concerned proxy renewals; GridPP had not seen the issue with other VOs as they either do not use the same service(s) (combinations) or run shorter jobs that do not require job proxy renewal. JC noted that the problem was introduced with a software update but closing in on the main problem took a while due to two sites being involved (myproxy at RAL and WMS at Glasgow).
JC reported that in a pheno testing phase, sites were initially responding quickly to tickets but then had mixed carry through as sites (correctly) did not consider problems with the WMS to be their responsibility to fix. It should be clear that there was no evidence of sites ignoring the concerns and the underlying problem was with middleware provided to GridPP. JC noted a separate problem in relation to the length of time some tickets remained outstanding, and he had listed his conclusions and recommendations in his circulated report; he suggested that open tickets should be reviewed in more detail after two weeks.

In conclusion, following his in-depth investigation, JC concluded that Pheno have had a bad experience primarily with the WMS due to the proxy renewals issue that was only impacting them. They also were frustrated in their attempts to make progress by a VOMS certificate update that was not picked up by all site services. AS pointed out that Pheno may also be quite far down on sites' priority lists. JC reported that with Glasgow, for instance, Glasgow had thought that Pheno had problems with their site alone, not that Pheno were having problems throughout the UK, and so treated the reported incident as a user specific problem (shorter running jobs from the VO were running successfully). The escalation routes open to VOs (such as the weekly deployment meetings) need to be better publicised and ongoing issues within a Tier-2 made known earlier. JC would be having a meeting with Peter this week.

ACTION
415.7 JC to follow-up the outcomes of his recent report on Phenogrid and begin to address changes to the way tickets are handled.
415.8 JC to review the Helpdesk and ascertain if tickets can be reviewed more accurately by personnel, who could look at ticket detail rather than length of time the ticket had been open.

5. F2F at Lancaster
====================

JG reminded that the next PMB was the F2F at Lancaster on 24th February.
Could everyone who had not already done so, contact RJ regarding attendance and accommodation requirements. Apologies for non-attendance at the Lancaster F2F were noted in advance by: GP, RM, JG, DK, TC.

ACTION
415.9 ALL: to contact RJ and advise attendance and accommodation requirements for the F2F at Lancaster.

STANDING ITEMS
==============

SI-1 Tier-1 Manager's Report
-----------------------------

AS reported as follows:

Fabric:

1) FY10 procurements
   - CPU tender - all delivered. Acceptance testing has started on V10 completed acceptance testing. CL10 now in acceptance testing.
   - Tape drive order placed (although follow up second order may be required). Media availability is now early March and we expect to place an order shortly.
2) Load test on SL08 has now started in order to reproduce the problems seen or to re-certify the hardware.
3) A tape fault was discovered that has resulted in the loss of 78 LHCb files. The fault was caused by a faulty tape drive that overwrote part of the data on the tape. A Post Mortem for this incident is in preparation at:
   http://www.gridpp.ac.uk/wiki/RAL_Tier1_Incident_20110202_Tape_Data_Loss_LHCb
4) All except CASTOR Gen instance disk servers have now been upgraded to SL5 64bit.

Service:

1) Summary of operational issues is at:
   http://www.gridpp.ac.uk/wiki/Tier1_Operations_Report_2011-02-09
2) Bad checksum problem on CASTOR now resolved by upgrading gridftp server code. Now no requirement for upgrade of whole of CASTOR to 2.1.9-10. Certification of 2.1.10-0 has commenced. We are aiming to be able to deploy this into production late in March.
3) We are rolling out updates to disk server tcp tuning parameters, increasing default and max window sizes.
4) Updates to Oracle (PSU) were rolled out.

SI-2 Production Manager's Report
---------------------------------

JC reported as follows:

1) The VO share information published by sites is progressing.
   The current status on Friday is shown here:
   http://indico.cern.ch/getFile.py/access?contribId=4&resId=0&materialId=0&confId=113884
2) The issues affecting the pheno VO have been reviewed (see report) and improvements that we can make in our support processes identified. The underlying problems are still related to problematic middleware (particularly the WMS).
3) A change to EGI repositories for CA trust anchors has prompted some discussion about WLCG/GridPP policy in this area. This will be discussed at the deployment team & sites meeting tomorrow. There ensued a discussion of communication from WLCG and notices from GridPP to sites to confirm changes that should be made, as sites were often not aware that action on their part was required.
4) The LHCOPN Tier-2 network working group have produced a new version of the LHC Open Network Environment (LHCONE) Architecture.v2.1 document. The goal of LHCONE is to provide a collection of access locations that are effectively entry points into a network that is private to the LHC T1/2/3 sites. LHCONE is not intended to supplant LHCOPN but rather to complement it. At this stage the document is helping to shape discussions on future networking - the GridPP position needs to be discussed.
5) The NEISS VO (http://www.geog.leeds.ac.uk/projects/neiss/about.php) would like to make use of the RAL based LFC. The technical requirements will be discussed tomorrow, but are there any in-principle objections to supporting this VO? You may also recall that we recently set up the NA62 VO and it was almost implicit that they would require LFC enablement too. Can this VO be added to the “GridPP approved VOs” list if there are no technical objections tomorrow? [Aside: the approved list is used by the Tier-1 team to decide if a support request can be actioned]. The PMB approved this addition.
6) A management request following last week's MB was for sites to install ARGUS/glexec to a timeline of 31 March 2011 for T0/T1 and (the end of) June for T2s. ATLAS and ALICE still encounter problems using glexec.
7) At last week’s GDB (http://indico.cern.ch/conferenceDisplay.py?confId=106641) the transition from gLite to EMI-1 was discussed. There still seems some uncertainty on where the integration testing takes place and who “loads” the middleware repository used by WLCG sites. EMI-1 is due at the end of April.
8) The January WLCG Tier-2 availability/reliability report is now available:
   http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/201101/wlcg/WLCG_Tier2_Jan2011.pdf
   Sites where we wanted to check on problems encountered were:
   QMUL (98%:83%) - availability down due to work on electrical supply to the machine room (prolonged by 1 day due to contractor availability).
   UCL (100%:79%) - problems on one CE that developed during the Christmas vacation were not fixed until returning to work in January.
   MAN-HEP (84%:84%) - the site had downtime to upgrade the site DPM. Reliability was down due to site-BDII problems (fixed with scripted restarts).
   BHAM-HEP (85%:82%) - suffered due to the site-BDII crashing. The component needed to be upgraded and an automatic restart implemented.

SI-3 ATLAS weekly review & plans
---------------------------------

RJ was absent.

SI-4 CMS weekly review & plans
-------------------------------

DC was absent.

SI-5 LHCb weekly review & plans
--------------------------------

GP reported on problems with pilot jobs at Glasgow due to a CE problem - this was now resolved. The ORACLE upgrade had gone ok. There had been an upgrade to the FTS at the Tier-1, but there were no new instances of the corrupted file issue, which was good. A new resource profile had been requested from each experiment, post-Chamonix, and LHCb were dealing with this.

SI-6 User Co-ordination issues
-------------------------------

GP noted nothing else to report; Phenogrid had already been discussed.

SI-7 LCG Management Board report
---------------------------------

JG noted that some of the issues had already been discussed; installed capacity was an ongoing issue.

SI-8 Dissemination Report
--------------------------

SL reported that Neasan O'Neill had provided a report as follows:

Events:

* EGI User Forum, there will be a UK NGI stand at the event, it is ~E450 shared between GridPP and NGS.
* Royal Society Summer Exhibition, working with Karl Harrison and Cristina Lazzeroni on GridPP involvement in their stand "Discovering particles: from Rutherford scattering to the Large Hadron Collider"
* IOP Nuclear and Particle Physics Divisional Conference, after some haggling we will have a stand at this too, it will be £454 (including my registration), waiting on invoice/cost of screen rental
* Masterclass half day meeting, on Wednesday discussing the masterclasses and I'll be trying to get more grid into them
* Big Bang Fair, in London next month with an IoP physics stand, grid demo on that each day
* National Science and Engineering Week event, at QMUL also March, grid demo at that as well

Materials:

* Brochure - Invoice has been sent to Robin, Next version expected soon
* Magic Cubes - Do we want to do new cubes and refresh the design? Origination is £300, £2.55 a cube for 1,000 and shipping was £200 in 2007. So 1,000 cubes would be £3050, 500 would be £2360 (£3.72 a cube).

RM reported that he had discussed this internally at RAL and any marketing costs overall for GridPP were fine provided they were under £25,000. This particular expenditure was approved for the Royal Society meeting but we may have to log the dissemination budget for GridPP4. Funding was agreed for the magic cubes at ~£3k. JG noted it would be useful to have new ones.
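
As a cross-check of the Magic Cubes quote above, the short sketch below recomputes the two totals from the figures in the report. The function and constant names are illustrative only, and it assumes (consistently with the quoted totals) that the origination and 2007 shipping charges apply unchanged to both order sizes. Amounts are held in pence to avoid rounding issues.

```python
# Figures quoted in the dissemination report (2007 prices), in pence.
ORIGINATION_PENCE = 300_00  # one-off origination charge
SHIPPING_PENCE = 200_00     # shipping charge, assumed the same for both order sizes


def order_total_pence(quantity: int, unit_price_pence: int) -> int:
    """Return the fixed charges plus quantity times the per-cube price."""
    return ORIGINATION_PENCE + SHIPPING_PENCE + quantity * unit_price_pence


def format_pounds(pence: int) -> str:
    """Format an amount held in pence as pounds and pence."""
    return f"£{pence // 100}.{pence % 100:02d}"


if __name__ == "__main__":
    # (quantity, per-cube price in pence) as quoted in the report
    for quantity, unit_price in [(1000, 255), (500, 372)]:
        total = order_total_pence(quantity, unit_price)
        print(f"{quantity} cubes: {format_pounds(total)}")
```

Both results match the totals quoted in the report (£3050 for 1,000 cubes and £2360 for 500).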
SL suggested that if anything obviously required to be changed (eg: EGEE being mentioned) then it should be changed. We should go ahead and buy 1000 of them. This was agreed.

News Items:

* Sussex: could have something soon?
* Have 4 items on Licensing in draft, would like comments if anyone is interested
* Suggestions?

Website:

* RTM site has been cleaned up to reflect EGI/e-ScienceTalk involvement, also moved the design over to the GridLoad graphs pages (http://gridportal-ws01.hep.ph.ic.ac.uk/gridload/)
* Still working on the website review, will have that by next PMB meeting.

AOB
===

- PG reported that the Quarterly Reports had now been submitted and were uploaded for review.
- SL reported that Frank Krauss would be taking over from Prof Nigel Glover as the Durham representative on the GridPP collaboration board.

The next PMB meeting would be a F2F meeting and would take place at Lancaster on Thursday 24th February. Advance apologies had been recorded for GP, RM, JG, DK, TC. There would be NO meeting next Monday 21st February.

REVIEW OF ACTIONS
=================

398.7 Re the GridPP Security Policies - DK advised that EGI formal signoff had now been given, he would update the GridPP website pages. Ongoing.
400.4 SL to co-ordinate changing the current GridPP MoU towards an MoU for GridPP4. Ongoing.
409.1 JC to revisit document with a GridPP-NGI-NGS structure, not Dave Wallom’s. JG will provide input. Visions for today and for the future. Ongoing.
409.2 GP to produce new role description for the Chair of the UB. Ongoing.
411.1 DB to organise an Agenda around the theme of 'Efficiency' for GridPP26 at Sussex. Done, item closed.
411.3 SL to co-ordinate with RJ, DC, and GP, regarding monitoring site performance and distribution of GridPP4 funds, and provide a draft document to which the PMB could respond. This should be finalised at the F2F meeting in March, in relation to how much money was to be allocated.
We would need a starting point by the F2F in February. SL was awaiting input from RJ and DC - they need to respond ASAP. SL reported that a meeting would be taking place next week. Done, item closed.
412.3 JG to check with AS and RJ re the issue of the Tier-1 continuing to provide LFC services (the issue here was extra effort, a proposal was required). Done, item closed.
413.1 RM to check the travel budget in relation to contributing to the costs of being involved with the Royal Society Summer Science Exhibition, in conjunction with Birmingham/Cambridge. Done, item closed.
413.2 DB to contact Karl Harrison and confirm GridPP's involvement in the Royal Society Exhibition, noting a contribution in terms of a possible demo, manpower, and promotional materials. Done, item closed.
413.3 JG to find out at the EGI meeting today if there was a GOCDB4 failover still in existence (the last one ended with EGEEIII). JG reported that this was on the Work Plan, but one was not in existence. This was being worked on at present. Done, item closed.
413.4 Regarding GSTAT2 publishing and sites filling-in the numbers as per SL's spreadsheet table showing the fraction (ie: publish the theoretical model in GSTAT) - PG to send the relevant spreadsheet to JC so that dTeam could progress this. Done, item closed.

ACTIONS AS AT 14.02.11
======================

398.7 Re the GridPP Security Policies - DK advised that EGI formal signoff had now been given, he would update the GridPP website pages.
400.4 SL to co-ordinate changing the current GridPP MoU towards an MoU for GridPP4.
409.1 JC to revisit document with a GridPP-NGI-NGS structure, not Dave Wallom’s. JG will provide input. Visions for today and for the future.
409.2 GP to produce new role description for the Chair of the UB.
415.1 DK to check on the correct total allocation figure for both capital and recurrent with Tony Medland.
415.2 AS to clarify the outturn forecast with RAL finance section and organise the spend.
415.3 PG to follow-up with sites re their Tier-2 hardware spend from GridPP3. It was noted that the Tier-2 hardware spend in GridPP4 was still unknown.
415.4 DB to summarise the GridPP4 Tier-2 hardware spend in preparation for an email to Tony Medland.
415.5 Re the JeS forms for the second half of GridPP4, DB to chase this up during the next month or so.
415.6 JC to bring up the issue of glexec and identity-switching at dTeam, Tier-2 sites to be ready by 30th June, it might be appropriate for sites starting to switch now, to use Argus.
415.7 JC to follow-up the outcomes of his recent report on Phenogrid and begin to address changes to the way tickets are handled.
415.8 JC to review the Helpdesk and ascertain if tickets can be reviewed more accurately by personnel, who could look at ticket detail rather than length of time the ticket had been open.
415.9 ALL: to contact RJ and advise attendance and accommodation requirements for the F2F at Lancaster.
