> -----Original Message-----
> From: Testbed Support for GridPP member institutes
> [mailto:[log in to unmask]] On Behalf Of Graeme Stewart
> Sent: Tuesday, May 20, 2008 10:52 PM
> To: [log in to unmask]
> Subject: Re: New LCG CA release 1.21: breaks site
>
> The only thing I could see that was different between Glasgow
> and Durham was that Glasgow's servers were using certificates
> signed by the older CA; Durham's by the new. These will have
> different CRLs, which might explain a different behaviour?
Um, that explains a lot. The older CA (not the one that has just been
replaced) and the certificates it issued are *NOT* affected by the CA upgrade.
At the moment the UK effectively has two PKIs: the older one no longer issues
new certificates but still exists and is valid; the other is the one that has
just been replaced. Only certificates issued by the replaced CA are affected
by the CRL/CA mismatch. Now the picture is getting clearer.
> But as Phil said, we were changing things on the CE and then
> suddenly it started to work - what exactly caused the
> problems to disappear we can't say (not if it was anything
> which we did...)
Please bear in mind that, apart from your own site, all other
sites/servers/services need to upgrade both the CA and the CRL too. Otherwise,
as with the VOMS server at CERN, some UK users will fail to authenticate
against those services. In that case there is nothing wrong on your side; the
server side simply has not installed the CA and CRL correctly.
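A minimal sketch of how a site could spot such a CA/CRL mismatch, assuming
the usual layout of the trusted certificates directory, in which each CA
certificate is stored as <hash>.0 and its CRL as <hash>.r0. The function name
and the rule "a CRL older than its CA file is suspect" are illustrative
assumptions, not part of the CA release or of fetch-crl:

```python
import os


def find_ca_crl_mismatches(cert_dir):
    """Map each CA hash in cert_dir to a problem description, if any.

    Assumes CA certificates are named <hash>.0 and CRLs <hash>.r0.
    The staleness check (CRL modified before the CA file) is only a
    heuristic chosen for this sketch.
    """
    problems = {}
    for name in sorted(os.listdir(cert_dir)):
        stem, ext = os.path.splitext(name)
        if ext != ".0":
            continue  # skip CRLs, signing policies and other files
        ca_path = os.path.join(cert_dir, name)
        crl_path = os.path.join(cert_dir, stem + ".r0")
        if not os.path.exists(crl_path):
            problems[stem] = "missing CRL"
        elif os.path.getmtime(crl_path) < os.path.getmtime(ca_path):
            # CA file was replaced after the CRL was last fetched
            problems[stem] = "CRL older than CA file"
    return problems
```

Called on the trust directory (commonly /etc/grid-security/certificates, which
is an assumption about the local setup), an empty result means every CA file
has a CRL at least as new as itself. It does not verify signatures or expiry.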
Cheers,
Mingchao
>
> Cheers
>
> Graeme
>
> On Tue, May 20, 2008 at 6:51 PM, Kelsey, DP (David)
> <[log in to unmask]> wrote:
> > Graeme, Simon,
> >
> > Is it now understood why Durham and RHUL were experiencing problems
> > and how it was fixed?
> >
> > Dave
> >
> >
> > ------------------------------------------------
> > Dr David Kelsey
> > Particle Physics Department
> > Rutherford Appleton Laboratory
> > Chilton, DIDCOT, OX11 0QX, UK
> >
> > e-mail: [log in to unmask]
> > Tel: [+44](0)1235 445746 (direct)
> > Fax: [+44](0)1235 446733
> > ------------------------------------------------
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: Testbed Support for GridPP member institutes
> >> [mailto:[log in to unmask]] On Behalf Of Graeme Stewart
> >> Sent: 19 May 2008 23:29
> >> To: [log in to unmask]
> >> Subject: Re: New LCG CA release 1.21: breaks site
> >>
> >> Hi John
> >>
> >> That was a RAID failure on their SE - not related.
> >>
> >> Having forced a CRL update across the Durham cluster they are still
> >> failing SAM tests, so we don't seem to be out of the woods yet...
> >>
> >> g
> >>
> >> On Mon, May 19, 2008 at 11:11 PM, Gordon, JC (John)
> >> <[log in to unmask]> wrote:
> >> > Graeme, do we know that this was CA related? Durham were failing
> >> > overnight Sunday too.
> >> >
> >> > John
> >> >
> >> >> -----Original Message-----
> >> >> From: Testbed Support for GridPP member institutes
> >> >> [mailto:[log in to unmask]] On Behalf Of Graeme Stewart
> >> >> Sent: 19 May 2008 22:00
> >> >> To: [log in to unmask]
> >> >> Subject: Re: New LCG CA release 1.21: breaks site
> >> >>
> >> >> On Mon, May 19, 2008 at 8:54 PM, Jensen, J (Jens)
> >> >> <[log in to unmask]> wrote:
> >> >> > Hi Graeme,
> >> >> >
> >> >> > I know for the Moz NSS bug, it is because as part of the SSL
> >> >> > negotiation, the server (or client, doesn't matter) sends its
> >> >> > trusted certificates to the peer saying "look this is my cert"
> >> >> > and the peer says "wot? I thought it looked like this?"
> >> >> >
> >> >> > But OpenSSL and stuff derived from OpenSSL does not work like
> >> >> > this; they may or may not send intermediate certificates in the
> >> >> > negotiation but all that matters is that the trust chain can be
> >> >> > built, which of course they can be either way.
> >> >> >
> >> >> > Maybe it's something more obvious. Like CRLs that haven't been
> >> >> > refreshed when you install the 1.21 release. You folk in Glasgow
> >> >> > have probably been Good Eggs(tm) as usual and refreshed your CRLs.
> >> >>
> >> >> I upgraded one UI first (not our main one) and checked that
> >> >> fetch-crl worked - so that there was nothing basically wrong with
> >> >> the CA release. Then, after I had upgraded the CE I refreshed the
> >> >> CRLs by hand. Because of the way our site infrastructure works all
> >> >> the other machines then copy their CRLs from the CE (via a simple
> >> >> mirror - no complicated SSL thingamybobs...).
> >> >>
> >> >> I can actually tell when Durham broke from the ATLAS pilot
> >> >> submission logs:
> >> >>
> >> >> http://svr017.gla.scotgrid.ac.uk/factory/logs/2008-05-19/ce01.dur.scotgrid.ac.uk_2119_jobmanager-lcgpbs-q3d/SubmissionLog
> >> >>
> >> >> I should say they broke for my submission before I had touched
> >> >> anything at Glasgow re. the update.
> >> >>
> >> >> I now see a very weird effect. I can globus job run from one
> >> >> Glasgow UI to Durham ok, but not from the other...
> >> >>
> >> >> g
> >> >>
> >> >
> >>
> >
>