Tier1 Fabric and Tier1 Services are aware that the 5 top-level BDII servers at RAL are sporadically failing to respond with the required alacrity. The data shows that this is due in part to a local imbalance in requests to the servers, but mostly to the machines concerned being under-powered now that data taking has begun. Steps will be taken to remedy this issue during the week.
Martin.
--
Martin Bly
RAL Tier1 Fabric Manager
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of John Bland
> Sent: Monday, April 05, 2010 1:46 AM
> To: [log in to unmask]
> Subject: lcg-bdii.gridpp.ac.uk problems
>
> Hi,
>
> Liverpool have failed a number of SAM tests over the last day or so,
> which seem to have all resulted from missing entries in BDII or being
> unable to connect to lcg-bdii.gridpp.ac.uk.
>
> Other sites have had a number of failed tests over the same period.
> We've also seen a number of temporary blips in our SAM tests over the
> past few weeks, resulting from a lack of entries for our SE in BDII.
> We've been unsure if this is anything to do with our glite 3.2 BDII or
> a flaky VM system.
>
> As some of the recent tests have included lack of entries for external
> systems as well I'm wondering if the GridPP BDII server is experiencing
> problems for other sites as well and if other Tier2 sites without
> problems are using a different top BDII.
>
> Should we all be using the same BDII or is there something to be gained
> from a *coordinated* distribution of external services to mitigate the
> otherwise single point of failure that BDIIs can represent within the
> UK cloud?
>
> John
>
> --
> Dr John Bland, M.Phys.(Hons), Ph.D. (Liverpool)
> Email: [log in to unmask]
> Phone: 0151 256 7055, Mobile: 07794 935 213
> Web : http://www.third-bird.co.uk/photography/
> "Happy Happy Joy Joy Joy!" - Stimpy