Hi,
I've just been rereading Jeremy's email. The Imperial CE appears green
in both links he includes. Can someone point me to the error, please?
Daniela
2009/7/21 Coles, J (Jeremy) <[log in to unmask]>:
> Dear All
>
> I've been made aware of an issue that affects many UKI sites running older versions of GFAL and lcg_utils. Specific sites in UKI that should read the following carefully are:
>
> RAL-LCG2 lcgce02.gridpp.rl.ac.uk; lcgce03.gridpp.rl.ac.uk; lcgce04.gridpp.rl.ac.uk; lcgce05.gridpp.rl.ac.uk
> UKI-LT2-IC-HEP ce00.hep.ph.ic.ac.uk
> UKI-LT2-RHUL ce1.pp.rhul.ac.uk
> UKI-LT2-UCL-HEP lcg-ce01.hep.ucl.ac.uk
> UKI-NORTHGRID-MAN-HEP ce01.tier2.hep.manchester.ac.uk
> UKI-NORTHGRID-MAN-HEP ce02.tier2.hep.manchester.ac.uk
> UKI-SCOTGRID-ECDF ce.glite.ecdf.ed.ac.uk
> UKI-SCOTGRID-ECDF mw05.ecdf.ed.ac.uk
>
> There is likely to be an urgent request to upgrade in the coming days. We can review the situation at the UKI meeting on Thursday (http://indico.cern.ch/conferenceDisplay.py?confId=64531).
>
> If you are aware of reasons an upgrade is not possible for GFAL and lcg_utils then please reply to the list.
>
> Many thanks,
> Jeremy
>
>
> ---------- Forwarded message ----------
> From: Maarten Litmaath <[log in to unmask]>
> Date: Mon, Jul 20, 2009 at 9:24 PM
> Subject: new DPM for SAM causes failures at 10% of the sites !!
>
>
> Hi all,
> the SAM CE tests include the replication of a file from the site's
> default SE for "ops" to a central SE, for which lxdpm104.cern.ch
> is the default choice and thereby almost always used.
>
> The problem with lxdpm104 is its OS: it still runs SLC3, which is
> no longer supported. Tony Cass is not happy with this situation.
>
> We already upgraded the spare nodes lxdpm101 and lxdpm103 to SLC4
> and the latest DPM version for gLite 3.1.
>
> lxdpm101 is used as the central SE for the SAM validation instance,
> where we see about 10% more of the sites failing, compared to the
> production instance. There are some big sites included.
>
> Production:
>
> https://lcg-sam.cern.ch:8443/sam/sam.py?sensors=CE&regions=AsiaPacific&regions=CERN&regions=CentralEurope&regions=France&regions=GermanySwitzerland&regions=Italy&regions=NorthernEurope&regions=Russia&regions=SouthEasternEurope&regions=SouthWesternEurope&regions=UKI&vo=ops&order=RegionName&funct=ShowSensorTests
>
> Validation:
>
> https://sam-val.cern.ch:8443/sam/sam.py?sensors=CE&regions=AsiaPacific&regions=CERN&regions=CentralEurope&regions=France&regions=GermanySwitzerland&regions=Italy&regions=NorthernEurope&regions=Russia&regions=SouthEasternEurope&regions=SouthWesternEurope&regions=UKI&vo=ops&order=RegionName&funct=ShowSensorTests
>
> As I am writing this, production has 342 green CEs, validation 307.
>
> There are many failures, in particular in Italy and UKI.
> The error usually is as follows:
>
> -------------------------------------------------------------------------
> Both SAPath and SARoot are not set about ops VO and SE : lxdpm101.cern.ch
> lcg_rep: Invalid argument
> -------------------------------------------------------------------------
>
> The cause of that error becomes clear when we look at the versions of
> GFAL and lcg_utils that are present on the WN. For example, at RAL:
>
> -------------------------------------------------------------------------
> Using lcg-utils version:
>
> + lcg-cp --version
> lcg_util-1.6.11
> GFAL-client-1.10.11
> -------------------------------------------------------------------------
>
> That version is more than a year old and cannot handle the way
> lxdpm101.cern.ch is now published in the info system (I verified that).
>
> Conclusion: I think all sites that fail in the SAM validation instance
> need to be told to upgrade their WNs to the latest version _URGENTLY_.
>
> We should give them a deadline by which time we switch the production
> instance to lxdpm101.cern.ch.
> Thanks,
> Maarten
>
>
>
> --
> Steve Traylen
> --
> Scanned by iCritical.
>
--
-----------------------------------------------------------
HEP Group
Physics Dep
Imperial College
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/