Inline....
Martin.
--
Martin Bly
RAL Tier1 Fabric Manager
> -----Original Message-----
> From: Testbed Support for GridPP member institutes [mailto:TB-
> [log in to unmask]] On Behalf Of Christopher J.Walker
> Sent: Tuesday, April 06, 2010 2:13 PM
> To: [log in to unmask]
> Subject: Re: lcg-bdii.gridpp.ac.uk problems
>
> Steve Traylen wrote:
> > On Mon, Apr 5, 2010 at 9:55 AM, Martin Bly <[log in to unmask]> wrote:
> >> Tier1 Fabric and Tier1 Services are aware that the 5 top-level BDII
> >> servers at RAL are sporadically failing to respond with the required
> >> alacrity. The data shows that this is due in part to a local imbalance
> >> in requests to the servers but mostly the machines concerned are
> >> under-powered now that data taking has begun. Steps will be taken to
> >> remedy this issue during the week.
> >>
> > Just to check you are aware of a possible cause for the imbalance
> > in requests.
> >
> >
> > https://twiki.cern.ch/twiki/bin/view/LinuxSupport/GlibcDnsLoadBalancing
> >
>
>
> Just for reference, in Debian, the matter was raised to the technical
> committee in 2007 and they decided as follows:
>
>
> :>
>
> :> http:[log in to unmask].html
> :>
> :> Bug#438179: RFC3484 s6 rule 9 should not apply
> :>
> :[snip]>
> :> The Technical Committee has decided as follows:
> :>
> :> 1. RFC3484 s6 rule 9 should not be applied to IPv4 addresses
> :> by Debian systems, and we DO overrule the maintainer.
> :> 2. RFC3484 s6 rule 9 should not be applied to IPv6 addresses
> :> by Debian systems. We do NOT overrule the maintainer.
> :> 3. We recommend to the IETF that RFC3484 s6 rule 9 should be
> :> abolished for IPv4, and that it should be reconsidered for IPv6.
> :>
> :> The supermajority requirement for overruling the maintainer was met.
> :>
>
> > working around that is operationally tricky. Putting the hosts in
> > different subnets and/or sites
> > is the best but hardly convenient. Some of lcg-bdii.cern.ch will be
> > moving to different subnets
> > shortly to try and help with this.
>
> I don't know how much of a problem this is - and it sounds like it isn't
> the issue in the bdii case, but would it be worth considering putting a
> similar patch in SL?
The network stats we have of access to the five systems seem to show
this is not an issue for us - the load is fairly even. I guess most of the
queries are from external sites that don't match our subnet at all so
the preferential prefix sorting doesn't have as big an effect as it
might.
Anyway, we hope to fix this on Thursday with some better systems.
Martin.
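
For anyone wanting to see why clients outside our subnet are barely
affected, here is a minimal sketch of the prefix comparison behind
RFC3484 s6 rule 9. It is not the glibc implementation, which applies
several other rules first. The addresses are hypothetical (taken from the
documentation ranges) and the function names are made up for
illustration; a single-homed client is assumed, so the same source
address is used for every destination.

```python
import ipaddress


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses share (0 to 32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def rule9_order(source: str, destinations: list[str]) -> list[str]:
    """Order destinations by longest common prefix with the source address.

    sorted() is stable, so destinations that tie keep the order the DNS
    reply gave them, which is what lets round-robin still work.
    """
    return sorted(destinations, key=lambda d: -common_prefix_len(source, d))


# Hypothetical server addresses, all in one subnet (documentation range).
servers = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13", "192.0.2.14"]
rotated = servers[2:] + servers[:2]  # a second DNS reply, rotated by round-robin

local_client = "192.0.2.15"       # same subnet as the servers
external_client = "198.51.100.7"  # shares the same short prefix with all of them

# The local client picks the same server whatever order DNS returns.
print(rule9_order(local_client, servers)[0])
print(rule9_order(local_client, rotated)[0])

# The external client ties on every server, so the DNS rotation survives.
print(rule9_order(external_client, servers)[0])
print(rule9_order(external_client, rotated)[0])
```

In this sketch the local client lands on 192.0.2.14 both times, while the
external client simply takes whichever address the DNS reply listed
first. That matches the guess above: queries from sites that share no
meaningful prefix with the servers are still spread by round-robin.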
> Chris
> >
> >> Martin.
> >>
> >> --
> >> Martin Bly
> >> RAL Tier1 Fabric Manager
> >>
> >>> -----Original Message-----
> >>> From: Testbed Support for GridPP member institutes [mailto:TB-
> >>> [log in to unmask]] On Behalf Of John Bland
> >>> Sent: Monday, April 05, 2010 1:46 AM
> >>> To: [log in to unmask]
> >>> Subject: lcg-bdii.gridpp.ac.uk problems
> >>>
> >>> Hi,
> >>>
> >>> Liverpool have failed a number of SAM tests over the last day or so,
> >>> which seem to have all resulted from missing entries in BDII or being
> >>> unable to connect to lcg-bdii.gridpp.ac.uk.
> >>>
> >>> Other sites have had a number of failed tests over the same period.
> >>> We've also seen a number of temporary blips in our SAM tests over the
> >>> past few weeks, resulting from a lack of entries for our SE in BDII.
> >>> We've been unsure if this is anything to do with our glite 3.2 BDII or
> >>> a flaky VM system.
> >>>
> >>> As some of the recent tests have included lack of entries for external
> >>> systems as well I'm wondering if the GridPP BDII server is experiencing
> >>> problems for other sites as well and if other Tier2 sites without
> >>> problems are using a different top BDII.
> >>>
> >>> Should we all be using the same BDII or is there something to be gained
> >>> from a *coordinated* distribution of external services to mitigate the
> >>> otherwise single point of failure that BDIIs can represent within the
> >>> UK cloud?
> >>>
> >>> John
> >>>
> >>> --
> >>> Dr John Bland, M.Phys.(Hons), Ph.D. (Liverpool)
> >>> Email: [log in to unmask]
> >>> Phone: 0151 256 7055, Mobile: 07794 935 213
> >>> Web : http://www.third-bird.co.uk/photography/
> >>> "Happy Happy Joy Joy Joy!" - Stimpy
> >
> >
> >