On 16/06/10 11:04, Christopher J. Walker wrote:
> I strongly agree that there is a tendency in the grid to add extra
> layers of complexity in order to bodge around problems - rather than
> actually doing anything to fix the problems. This is a bad thing. Even
> worse, this also tends to hide the original problem - and nobody notices
> until the failover mechanism starts failing as well - and that makes the
> whole thing more difficult to debug.
>
> I also think that we should move towards the situation where RAL isn't a
> single point of failure for UK Tier-2s.
>
> However, in this case, I'm (weakly) inclined towards RAL being the
> failover. That means I don't have to worry about squid configs allowing
> traffic into QMUL. It also means that the failover site is likely to get
> exercised on a reasonably regular basis - 20 times as often as an
> individual Tier-2 being failed over to.
I think this makes sense. Compared to the existing system of failovers
between pairs of T2 sites, this reduces rather than increases the
complexity. Acting as a failover doesn't make RAL a single point of
failure, in this case at least.
Ben
>
> Chris
--
Dr Ben Waugh Tel. +44 (0)20 7679 7223
Dept of Physics and Astronomy Internal: 37223
University College London
London WC1E 6BT