Hi Elena
That supports the suspicion about lcg-bdii.gridpp.ac.uk. I suspect IC and
RALPP (and probably Bristol) were already set to look at alternate
BDIIs. The Glasgow BDII has recovered from whatever problem it
suffered earlier and now the Glasgow site is also passing again. Since
the LHC VO results are still fine, I only created a GGUS ticket with
top priority (https://gus.fzk.de/ws/ticket_info.php?ticket=58990). It
raises the question of whether our ops people can submit alarm
tickets to the T1 like the experiment ops people can. I thought the T1
triggered a call-out after 2 successive ops VO failures anyway, and
since they are affected too... Something to discuss next week.
Cheers,
Jeremy
On 12 Jun 2010, at 16:27, Elena Korolkova wrote:
> Hi Jeremy
>
> I just changed LCG_GFAL_INFOSYS to "bdii.ce-egee.org" and we passed
> the last SAM test.
>
> Elena
>
> ____________________________________________________________________________
> Dr Elena Korolkova
> Tel.: +44 (0)114 2223553
> Fax: +44 (0)114 2223555
> Department of Physics and Astronomy
> University of Sheffield
> Sheffield, S3 7RH, United Kingdom
>
> On Sat, 12 Jun 2010, J Coles wrote:
>
>> Hi Wahid
>>
>> The history here shows problems for the Glasgow BDII but not for
>> lcg-bdii.gridpp.ac.uk:
>> http://pprc.qmul.ac.uk/~lloyd/gridpp/bdiitest.html
>>
>> This view (from gstat2, which everyone at HEPSYSMAN yesterday will
>> know about): http://gstat-prod.cern.ch/gstat/service/bdii_top/treeview/lcg-bdii.gridpp.ac.uk/
>> also shows things to be okay (for now at least).
>>
>> There are some sites passing: http://pprc.qmul.ac.uk/~lloyd/gridpp/samtest.html
>> (i.e. IC ... RALPP). All others fail with ERROR: CE-sft-lcg-rm-rep with
>>
>> CRITICAL: METRIC FAILED [org.sam.WN-RepRep-/ops/Role=lcgadmin]:
>> CRITICAL: File was NOT replicated to SE samdpm002.cern.ch. [ErrDB:
>> [('lcg_util_wn', 'server', 'CRITICAL')]]
>> org.sam.WN-RepCr-/ops/Role=lcgadmin
>>
>> Since other countries do not see the problem I tend to agree that
>> it suggests a core UK problem, but the monitoring results are not
>> clear (for me at least). How come RALPP and IC continue to pass the
>> org.sam.WN-Rep-/ops/Role=lcgadmin service test? Perhaps one of the
>> on-duty people can comment as I must be missing something.
>>
>> Jeremy
>>
>> On 12 Jun 2010, at 09:42, Wahid Bhimji wrote:
>>
>>> Hi
>>> Looks like a number of sites are failing SAM tests due to a
>>> problem with lcg-bdii.gridpp.ac.uk.
>>> Could someone take a look?
>>> Ta
>>> Wahid
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.