Hi
The problem is that the whole system was set up in a rush. ATLAS did
not define how things were meant to be set up. A lot of trial and
error had to be done initially by sites. The setup we currently have
works but is a mess. Now that the system has been functioning for a
while, ATLAS have had time to sit down and think about the best way to
do things, and the strong recommendation from the experts was to use
the Tier 1 as a primary backup. Rather than just making this change
behind sites' backs, I was hoping that the majority of sites would
agree with the change. I have, however, managed to achieve the exact
opposite: convincing people that the change is not really necessary and
is only for the "bad" sites. Something for me to do better next time!
I do intend to add the new monitoring and switch over sites that want
this. I am hoping that this will demonstrate that it is actually a
good idea for everybody. It's not difficult to switch sites over
later. The one thing this has at least achieved is to make sites
aware of the issue, so hopefully the SAM test results will get fixed
whatever the method.
Alastair
On 18 Jun 2010, at 01:40, Ewan MacMahon wrote:
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes [mailto:TB-
>>
>> After a discussion in today's (Thursday) phone meeting we have decided
>> the following:
>>
>> 1) If you have been passing the SAM tests and are happy with your
>> current setup then no changes will be made to affect your site.
>> 2) If you have been failing (getting a warning) on the SAM tests I
>> will switch you over to having the RAL Tier 1 as your primary backup.
>>
> This seems to me to be completely the wrong approach. If you want to
> switch everyone to RAL because you think that's going to be a
> technically better solution then that's fair enough, but it doesn't
> make sense to base any part of the decision on the test results when
> they're essentially random.
>
> The problem with the tests is that you're testing something without
> ever having asked the sites to set it up the way you wanted. It's
> not a surprise at that point that some things are not set up the
> way you wanted.
>
> I don't have any strong views on whether it's better to use Tier 2
> to Tier 2 failover, or have everyone drop back to RAL, but it does
> seem silly to have a mixture with an individual site's behavior
> determined this way.
>
> Ewan