On Nov 15, 2007, at 11:53 AM, brian davies wrote:
> Excuse my ignorance, but is there a list published somewhere from VOs
> which sites and why are "banned" by them?
> Brian
>
Hi Brian,
You can get this information from:
https://lcg-fcr.cern.ch:8443/fcr/fcr.cgi
though, before you say it, I agree it is not intuitive.
Imperial also used to supply a more readable list here:
https://gfe03.hep.ph.ic.ac.uk:4175/fcr.html
though it has now vanished (ask them).
Finally, watch this space: this information is definitely on the wish
list of information that sites have requested to be pumped into their
local fabric monitoring, as part of the WLCG monitoring group's work.
Steve
> On 15/11/2007, brian davies <[log in to unmask]> wrote:
>> Excuse my ignorance, but is there a list published somewhere from VOs
>> which sites and why are "banned" by them?
>> Brian
>>
>> On 15/11/2007, Simone Campana <[log in to unmask]> wrote:
>>>
>>>
>>>
>>> Hello.
>>>
>>> Sort of. The script considers only the CEs which accept the ATLAS VO,
>>> and since yesterday I consider this issue critical and have therefore
>>> banned the sites still affected from ATLAS production. When you do this
>>> in FCR, the banning is done by removing the ACBR VO:ATLAS from the CE,
>>> so the script does not check those CEs anymore. We should therefore run
>>> the script against some BDII which is not filtered by FCR, like the one
>>> used for SAM tests.
>>>
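The effect Simone describes can be illustrated with a minimal Python sketch. It is not taken from VOViewsConsist.py: the record layout, the function names and the example.org CE names are all assumptions made for illustration, and the rule string is written as it appears in the message above.

```python
# Illustrative sketch only: the dict layout, function names and CE names
# are hypothetical, not taken from the real script or from a real BDII.

def accepts_vo(ce, vo):
    """True if the CE publishes an access rule (ACBR) for the given VO."""
    return ("VO:" + vo) in ce["acbr"]

def ban(ce, vo):
    """Mimic the FCR ban described above: drop the VO's ACBR from the CE."""
    rule = "VO:" + vo
    return {"id": ce["id"], "acbr": [r for r in ce["acbr"] if r != rule]}

def ces_checked(ces, vo):
    """IDs of the CEs that a VO-restricted check would still look at."""
    return [ce["id"] for ce in ces if accepts_vo(ce, vo)]

# Unfiltered view: both CEs accept the VO.
unfiltered = [
    {"id": "ce1.example.org", "acbr": ["VO:ATLAS", "VO:cms"]},
    {"id": "ce2.example.org", "acbr": ["VO:ATLAS"]},
]

# FCR-filtered view: ce2 has been banned for the VO.
filtered = [unfiltered[0], ban(unfiltered[1], "ATLAS")]

print(ces_checked(unfiltered, "ATLAS"))
print(ces_checked(filtered, "ATLAS"))
```

In the filtered view the banned CE no longer matches the VO at all, so a check restricted to CEs accepting the VO never sees it. That is why the message suggests querying a BDII that FCR does not filter.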
>>> Having said this, it is true that the situation has improved a lot in
>>> the last week, thanks to the ticketing action. There are currently about
>>> 10 banned sites, while one week ago there were 60 problematic sites.
>>>
>>>
>>>
>>>
>>> _____________________________________________
>>> From: Antonio Retico
>>> Sent: Thursday, November 15, 2007 11:26 AM
>>> To: Nicholas Thackray; grid-operations-meeting (Attendees of the Grid
>>> Operations Meetings); LHC Computer Grid - Rollout
>>> Subject: RE: Minutes and actions from the WLCG-OSG-EGEE Grid Operations
>>> Meeting of 12 Nov 2007
>>>
>>>
>>>
>>>
>>> hi Nick,
>>>
>>> About action 67 and the comment from Alessandro and Simone reported in
>>> the minutes:
>>>
>>> "There is a .txt file with a query you should run on the BDII to gather
>>> the relevant info. In addition there is a python script which fetches
>>> info from the output of the query and generates an output, where sites
>>> marked with ==> are the problematic ones."
>>>
>>> I suspect that the attached log was still the one from the previous
>>> week. The situation, as reported by the same script, now looks
>>> considerably better. Most of the GGUS tickets we opened were closed, and
>>> today I see:
>>>
>>> [aretico@lxplus204 VOVIEW] python /afs/cern.ch/user/c/campanas/public/VOVIEW/VOViewsConsist.py /tmp/a.a | grep "===>"
>>>
>>> ===> CE:bigmac-lcg-ce.physics.utoronto.ca:2119/jobmanager-lcgcondor-atlas
>>> TOTrun:9 TOTVOrun:0 TOTwait:0 TOTVOwait:4444
>>>
>>> ===> CE:ce04-lcg.cr.cnaf.infn.it:2119/blah-lsf-atlas
>>> TOTrun:170 TOTVOrun:0 TOTwait:0 TOTVOwait:0
>>>
>>> Traceback (most recent call last):
>>>   File "/afs/cern.ch/user/c/campanas/public/VOVIEW/VOViewsConsist.py", line 12, in ?
>>>     TOTVOrun += int(a)
>>> ValueError: invalid literal for int(): _UNDEF_
>>>
>>> ===> CE:epgce2.ph.bham.ac.uk:2119/jobmanager-lcgpbs-short
>>> TOTrun:0 TOTVOrun:0 TOTwait:4444 TOTVOwait:57772
>>>
>>> Three sites are still failing, one of which with a generic error.
>>>
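The "generic error" is the ValueError in the traceback: int() is handed the placeholder string _UNDEF_ where a number is expected. A minimal sketch of a tolerant sum follows. It is not the code of VOViewsConsist.py; the helper names are hypothetical, and skipping the value while remembering it for a report is an assumed policy, not necessarily what ATLAS would want.

```python
# Hypothetical helpers, not part of VOViewsConsist.py. Assumed policy:
# non-numeric values such as "_UNDEF_" are skipped and reported.

def parse_count(value):
    """Return value as an int, or None if it is not a valid integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

def sum_counts(values):
    """Sum the numeric entries; also return the entries that were skipped."""
    total = 0
    skipped = []
    for raw in values:
        count = parse_count(raw)
        if count is None:
            skipped.append(raw)
        else:
            total += count
    return total, skipped

total, skipped = sum_counts(["3", "_UNDEF_", "4"])
print(total, skipped)
```

Returning the skipped values lets the caller flag the site as publishing undefined numbers instead of aborting the whole run.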
>>> I would be inclined to close the action, if ATLAS agrees, of course.
>>>
>>> Thanks.
>>>
>>> Antonio
>>>
>>>
>>>
>>>
>>>
>>>
>>> _____________________________________________
>>> From: Nicholas Thackray
>>> Sent: Thursday, November 15, 2007 10:19 AM
>>> To: grid-operations-meeting (Attendees of the Grid Operations Meetings);
>>> LHC Computer Grid - Rollout
>>> Subject: Minutes and actions from the WLCG-OSG-EGEE Grid Operations
>>> Meeting of 12 Nov 2007
>>>
>>> Dear all
>>>
>>>
>>> The minutes and action list from the Grid Operations meeting of Monday
>>> 12th November can be found attached to the agenda page, here:
>>>
>>> http://indico.cern.ch/conferenceDisplay.py?confId=23786
>>>
>>> In particular, please check the list of attendees, as there are some
>>> names missing. Send any comments or corrections directly to me.
>>>
>>> The next meeting will be on Monday, 19th November 2007 at 15:00 UTC
>>> (16:00 Swiss local time).
>>>
>>>
>>>
>>>
>>> Best regards,
>>>
>>>
>>>
>>> Nick Thackray
>>>
>>>
>>
--
Steve Traylen
[log in to unmask]