On Mon, Apr 5, 2010 at 9:55 AM, Martin Bly <[log in to unmask]> wrote:
> Tier1 Fabric and Tier1 Services are aware that the 5 top-level BDII servers at RAL are sporadically failing to respond with the required alacrity.  The data shows that this is due in part to a local imbalance in requests to the servers but mostly the machines concerned are under-powered now that data taking has begun.  Steps will be taken to remedy this issue during the week.
>
Just to check that you are aware of a possible cause of the imbalance
in requests:

https://twiki.cern.ch/twiki/bin/view/LinuxSupport/GlibcDnsLoadBalancing

Working around that is operationally tricky. Putting the hosts in
different subnets and/or sites is the best fix but hardly convenient.
Some of the hosts behind lcg-bdii.cern.ch will be moving to different
subnets shortly to try to help with this.
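
A rough sketch of the effect, assuming the cause is the one described on
that page: getaddrinfo() in glibc sorting the returned addresses by longest
matching prefix with the client address (RFC 3484 rule 9) instead of keeping
the DNS round-robin order. The addresses below are hypothetical
documentation-range values, not the real BDII hosts, and the function names
are made up for illustration.

```python
import ipaddress


def common_prefix_len(a, b):
    """Return the number of leading bits shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def rule9_order(client, servers):
    """Order servers by longest prefix shared with the client.

    This mimics only the longest-matching-prefix rule; the real
    getaddrinfo() sort applies several other rules first.
    """
    return sorted(servers, key=lambda s: -common_prefix_len(client, s))


# Hypothetical alias with three A records (assumed values).
servers = ["192.0.2.10", "192.0.2.130", "198.51.100.7"]

# Two hypothetical clients in the same /24 as two of the servers.
for client in ("192.0.2.200", "192.0.2.20"):
    print(client, "->", rule9_order(client, servers))
```

Each client always gets the same server first, however the DNS rotates the
records, so the load follows the clients' addresses rather than being spread
evenly. Moving the servers to different subnets removes the prefix advantage
that one host has for a large group of clients.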

> Martin.
>
> --
> Martin Bly
> RAL Tier1 Fabric Manager
>
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes [mailto:TB-
>> [log in to unmask]] On Behalf Of John Bland
>> Sent: Monday, April 05, 2010 1:46 AM
>> To: [log in to unmask]
>> Subject: lcg-bdii.gridpp.ac.uk problems
>>
>> Hi,
>>
>> Liverpool have failed a number of SAM tests over the last day or so,
>> which seem to have all resulted from missing entries in BDII or being
>> unable to connect to lcg-bdii.gridpp.ac.uk.
>>
>> Other sites have had a number of failed tests over the same period.
>> We've also seen a number of temporary blips in our SAM tests over the
>> past few weeks, resulting from a lack of entries for our SE in BDII.
>> We've been unsure if this is anything to do with our glite 3.2 BDII or
>> a flaky VM system.
>>
>> As some of the recent tests have included lack of entries for external
>> systems as well I'm wondering if the GridPP BDII server is experiencing
>> problems for other sites as well and if other Tier2 sites without
>> problems are using a different top BDII.
>>
>> Should we all be using the same BDII or is there something to be gained
>> from a *coordinated* distribution of external services to mitigate the
>> otherwise single point of failure that BDIIs can represent within the
>> UK cloud?
>>
>> John
>>
>> --
>> Dr John Bland, M.Phys.(Hons), Ph.D. (Liverpool)
>> Email: [log in to unmask]
>> Phone: 0151 256 7055, Mobile: 07794 935 213
>> Web  : http://www.third-bird.co.uk/photography/
>> "Happy Happy Joy Joy Joy!" - Stimpy
>



-- 
Steve Traylen