Graeme, Simon,
Is it now understood why Durham and RHUL were experiencing problems, and
how they were fixed?
Dave
------------------------------------------------
Dr David Kelsey
Particle Physics Department
Rutherford Appleton Laboratory
Chilton, DIDCOT, OX11 0QX, UK
e-mail: [log in to unmask]
Tel: [+44](0)1235 445746 (direct)
Fax: [+44](0)1235 446733
------------------------------------------------
> -----Original Message-----
> From: Testbed Support for GridPP member institutes
> [mailto:[log in to unmask]] On Behalf Of Graeme Stewart
> Sent: 19 May 2008 23:29
> To: [log in to unmask]
> Subject: Re: New LCG CA release 1.21: breaks site
>
> Hi John
>
> That was a RAID failure on their SE - not related.
>
> Having forced a CRL update across the Durham cluster they are
> still failing SAM tests, so we don't seem to be out of the
> woods yet...
>
> g
>
> On Mon, May 19, 2008 at 11:11 PM, Gordon, JC (John)
> <[log in to unmask]> wrote:
> > Graeme, do we know that this was CA related? Durham were failing
> > overnight Sunday too.
> >
> > John
> >
> >> -----Original Message-----
> >> From: Testbed Support for GridPP member institutes
> >> [mailto:[log in to unmask]] On Behalf Of Graeme Stewart
> >> Sent: 19 May 2008 22:00
> >> To: [log in to unmask]
> >> Subject: Re: New LCG CA release 1.21: breaks site
> >>
> >> On Mon, May 19, 2008 at 8:54 PM, Jensen, J (Jens) <[log in to unmask]> wrote:
> >> > Hi Graeme,
> >> >
> >> > I know for the Moz NSS bug, it is because as part of the SSL
> >> > negotiation, the server (or client, doesn't matter) sends its
> >> > trusted certificates to the peer saying "look this is my cert"
> >> > and the peer says "wot? I thought it looked like this?"
> >> >
> >> > But OpenSSL and stuff derived from OpenSSL do not work like this;
> >> > they may or may not send intermediate certificates in the
> >> > negotiation but all that matters is that the trust chain can be
> >> > built, which of course it can be either way.
> >> >
> >> > Maybe it's something more obvious. Like CRLs that haven't been
> >> > refreshed when you install the 1.21 release. You folk in Glasgow
> >> > have probably been Good Eggs(tm) as usual and refreshed your CRLs.
> >>
> >> I upgraded one UI first (not our main one) and checked that fetch-crl
> >> worked - so that there was nothing basically wrong with the CA
> >> release. Then, after I had upgraded the CE I refreshed the CRLs by
> >> hand. Because of the way our site infrastructure works all the other
> >> machines then copy their CRLs from the CE (via a simple mirror - no
> >> complicated SSL thingamybobs...).
> >>
> >> I can actually tell when Durham broke from the ATLAS pilot submission
> >> logs:
> >>
> >> http://svr017.gla.scotgrid.ac.uk/factory/logs/2008-05-19/ce01.dur.scotgrid.ac.uk_2119_jobmanager-lcgpbs-q3d/SubmissionLog
> >>
> >> I should say they broke for my submission before I had touched
> >> anything at Glasgow re. the update.
> >>
> >> I now see a very weird effect. I can globus job run from one Glasgow
> >> UI to Durham ok, but not from the other...
> >>
> >> g
> >>
> >
>
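Since the thread suspects CRLs that were not refreshed after the 1.21 install, here is a rough sketch of how one might spot stale CRL files on a node. It assumes the CRLs sit as `*.r0` files in a certificates directory (the default path below, `/etc/grid-security/certificates`, is an assumption about the site layout), and it uses file modification time only as a proxy. The function name `stale_crls` and the 24-hour threshold are illustrative, not part of fetch-crl. On machines that mirror their CRLs from the CE, the mirror tool may preserve or reset timestamps, so a proper check would read each CRL's nextUpdate field instead.

```python
import sys
import time
from pathlib import Path


def stale_crls(cert_dir, max_age_hours=24.0, now=None):
    """Return sorted names of *.r0 files not modified within max_age_hours.

    Modification time is only a proxy for CRL freshness; it does not
    inspect the nextUpdate field inside the CRL itself.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    return sorted(
        p.name
        for p in Path(cert_dir).glob("*.r0")
        if p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Assumed default location; pass another directory as the first argument.
    cert_dir = sys.argv[1] if len(sys.argv) > 1 else "/etc/grid-security/certificates"
    for name in stale_crls(cert_dir):
        print("stale CRL:", name)
```

Running this on both Glasgow UIs and comparing the output would be one quick way to see whether the two machines hold differently aged CRLs.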