Hi,
I have only just read this. Manchester had the same failures, but we
have a local top BDII, so it couldn't be a RAL-only problem.
Source SRM Request Token: 8ab284a1-7edc-41ed-b8b9-f5f62518cca7
[BDII][][] top-bdii.tier2.hep.manchester.ac.uk: No entries for host:
samdpm001.cern.ch
lcg_rep: Invalid argument
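
For anyone wanting to check their own top BDII for the same symptom, here is a rough Python sketch rather than a definitive tool. It builds an ldapsearch query asking a BDII whether it publishes a given SE, and composes an LCG_GFAL_INFOSYS value with a fallback BDII. Assumptions not taken from this thread: the BDII LDAP port 2170, the search base o=grid, the Glue 1.x attribute GlueSEUniqueID, and that the client accepts a comma-separated host:port list in LCG_GFAL_INFOSYS. The helper names build_bdii_query and infosys_value are made up for illustration.

```python
import os
import shlex

BDII_PORT = 2170  # assumption: usual BDII LDAP port


def build_bdii_query(bdii_host, se_host, port=BDII_PORT):
    """Return the ldapsearch argv asking one BDII for one SE entry.

    Hypothetical helper: base 'o=grid' and the GlueSEUniqueID filter
    are assumptions about a Glue 1.x information system.
    """
    return [
        "ldapsearch", "-x", "-LLL",
        "-H", f"ldap://{bdii_host}:{port}",
        "-b", "o=grid",
        f"(GlueSEUniqueID={se_host})",
        "GlueSEUniqueID",
    ]


def infosys_value(hosts, port=BDII_PORT):
    """Join BDII hosts into a comma-separated host:port string.

    Assumption: the client tries the listed BDIIs in order.
    """
    return ",".join(f"{host}:{port}" for host in hosts)


if __name__ == "__main__":
    # Hosts below are the ones named in this thread.
    cmd = build_bdii_query("top-bdii.tier2.hep.manchester.ac.uk",
                           "samdpm001.cern.ch")
    print(shlex.join(cmd))  # paste into a shell to run the query

    os.environ["LCG_GFAL_INFOSYS"] = infosys_value(
        ["lcg-bdii.gridpp.ac.uk", "bdii.ce-egee.org"])
    print("LCG_GFAL_INFOSYS=" + os.environ["LCG_GFAL_INFOSYS"])
```

An empty result from the printed ldapsearch command would be consistent with the "No entries for host" message above, though it does not by itself say which upstream BDII dropped the entry.
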
cheers
alessandra
J Coles wrote:
> Hi Elena
>
> That supports the suspicion of lcg-bdii.gridpp.ac.uk. I suspect IC and
> RALPP (and probably Bristol) were already set to look at alternate
> BDIIs. The Glasgow BDII has recovered from whatever problem it
> suffered earlier and now the Glasgow site is also passing again. Since
> the LHC VO results are still fine, I only created a GGUS ticket with
> top-priority (https://gus.fzk.de/ws/ticket_info.php?ticket=58990) - it
> raises a question as to whether our ops people can anyway submit alarm
> tickets to the T1 like the experiment ops people. I thought the T1
> triggered a call out after 2 successive ops VO failures anyway and
> since they are affected too... Something to discuss next week.
>
> Cheers,
> Jeremy
>
> On 12 Jun 2010, at 16:27, Elena Korolkova wrote:
>
>> Hi Jeremy
>>
>> I just changed LCG_GFAL_INFOSYS to "bdii.ce-egee.org" and we passed
>> the last SAM test.
>>
>> Elena
>>
>> ____________________________________________________________________________
>>
>> Dr Elena Korolkova
>> Tel.: +44 (0)114 2223553
>> Fax: +44 (0)114 2223555
>> Department of Physics and Astronomy
>> University of Sheffield
>> Sheffield, S3 7RH, United Kingdom
>>
>> On Sat, 12 Jun 2010, J Coles wrote:
>>
>>> Hi Wahid
>>>
>>> The history here shows problems for the Glasgow BDII but not
>>> lcg-bdii.gridpp.ac.uk:
>>> http://pprc.qmul.ac.uk/~lloyd/gridpp/bdiitest.html.
>>>
>>> This view (from gstat2 that everyone at HEPSYSMAN yesterday will
>>> know about):
>>> http://gstat-prod.cern.ch/gstat/service/bdii_top/treeview/lcg-bdii.gridpp.ac.uk/
>>> also shows things to be okay (for now at least).
>>>
>>> There are some sites passing:
>>> http://pprc.qmul.ac.uk/~lloyd/gridpp/samtest.html (i.e. IC ...
>>> RALPP). All others fail with ERROR: CE-sft-lcg-rm-rep with
>>>
>>> CRITICAL: METRIC FAILED [org.sam.WN-RepRep-/ops/Role=lcgadmin]:
>>> CRITICAL: File was NOT replicated to SE samdpm002.cern.ch.
>>> [ErrDB:[('lcg_util_wn', 'server', 'CRITICAL')]]
>>> org.sam.WN-RepCr-/ops/Role=lcgadmin
>>>
>>> Since other countries do not see the problem I tend to agree that it
>>> suggests a core UK problem, but the monitoring results are not clear
>>> (for me at least). How come RALPP and IC continue to pass the
>>> org.sam.WN-Rep-/ops/Role=lcgadmin service test? Perhaps one of the
>>> on-duty people can comment as I must be missing something.
>>>
>>> Jeremy
>>>
>>> On 12 Jun 2010, at 09:42, Wahid Bhimji wrote:
>>>
>>>> Hi
>>>> Looks like a number of sites are failing SAM tests due to a problem
>>>> with lcg-bdii.gridpp.ac.uk.
>>>> Could someone take a look?
>>>> Ta
>>>> Wahid
>>>> --
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
--
The most effective way to do it, is to do it. (Amelia Earhart)
Northgrid Tier2 Technical Coordinator
http://www.hep.manchester.ac.uk/computing/tier2