yes, it works:
[Switcher2 AutoExclusion] Summary for UK cloud at 2014-06-18 09:20 UTC
Dear UK Cloud Support,
please note that following PanDA Site IDs have been excluded/recovered by the AutoExclusion tool:
Site UKI-SOUTHGRID-RALPP:
SiteID: ANALY_RALPP_SL6
Status changed to 'settest'
Reason:
Downtime over
SiteID: UKI-SOUTHGRID-RALPP_SL6
Status changed to 'settest'
Reason:
Downtime over
PanDA Site ID recovery information:
Recovery will be performed by the HammerCloud PFT/AFT. Once recovery notification is sent you may check progress of test jobs at [1] (PFT) or [2] (AFT). Summary of HC exclusion/recovery actions is at [3].
On 18 Jun 2014, at 10:13, Elena Korolkova <[log in to unmask]> wrote:
> I've added heplnv147.pp.rl.ac.uk.
>
> Let's see if this helps to set RALPP online.
> Alternatively, I can do this manually.
>
>
> I'm keeping an eye on it.
>
> Elena
>
> On 18 Jun 2014, at 09:49, Chris Brew <[log in to unmask]> wrote:
>
>> Ah, I'd noticed that you were only sending SAM tests to
>> heplnv146.pp.rl.ac.uk, which is playing up a little at the moment (hence
>> the downtime). Could you add heplnv147.pp.rl.ac.uk to the mix as well?
>>
>> Thanks,
>> Chris.
>>
>> On 18/06/2014 09:46, "Elena Korolkova" <[log in to unmask]> wrote:
>>
>>> Hi Chris,
>>>
>>> no. This is a different issue. It's because of the DT:
>>>
>>>
>>> ATLAS Distributed Computing Site Status Board 2 <[log in to unmask]>,
>>> [log in to unmask]
>>> [Switcher2 AutoExclusion] Summary for UK cloud at 2014-06-16 09:00 UTC
>>>
>>> Dear UK Cloud Support,
>>> please note that following PanDA Site IDs have been excluded/recovered
>>> by the AutoExclusion tool:
>>>
>>>
>>>
>>> Site UKI-SOUTHGRID-RALPP:
>>>
>>> SiteID: ANALY_RALPP_SL6
>>> Status changed to 'setoffline'
>>> Reason:
>>> UNSCHEDULED downtime
>>> Endpoint:heplnv146.pp.rl.ac.uk:2811
>>> Description:Investigating problems with htcondor on the node
>>> From:2014-06-16 09:00:00
>>> To:2014-06-20 07:00:00
>>> https://goc.egi.eu/portal/index.php?Page_Type=Downtime&id=14628
>>>
>>>
>>> SiteID: UKI-SOUTHGRID-RALPP_SL6
>>> Status changed to 'setoffline'
>>> Reason:
>>> UNSCHEDULED downtime
>>> Endpoint:heplnv146.pp.rl.ac.uk:2811
>>> Description:Investigating problems with htcondor on the node
>>> From:2014-06-16 09:00:00
>>> To:2014-06-20 07:00:00
>>> https://goc.egi.eu/portal/index.php?Page_Type=Downtime&id=14628
>>>
>>>
>>>
>>> PanDA Site ID recovery information:
>>> Recovery will be performed by the HammerCloud PFT/AFT. Once recovery
>>> notification is sent you may check progress of test jobs at [1] (PFT) or
>>> [2] (AFT). Summary of HC exclusion/recovery actions is at [3].
>>>
>>> [1] http://panda.cern.ch/server/pandamon/query?jobsummary=cloud&select=ptest&hours=4&processingType=gangarobot-pft
>>> [2] http://panda.cern.ch/server/pandamon/query?jobsummary=cloud&select=ptest&hours=3&processingType=gangarobot&dash=analysis
>>> [3] http://hammercloud.cern.ch/hc/app/atlas/robot/incidents/
>>>
>>>
>>> Somehow it's the only CE in the RALPP queue configuration.
>>>
>>> I can add additional CEs. Let me know which.
>>>
>>> Thanks
>>> Elena
>>>
>>> On 18 Jun 2014, at 09:25, Chris Brew <[log in to unmask]> wrote:
>>>
>>>> Hi Elena,
>>>>
>>>> The RALPP queues appear to be offline ('set.offline.by.Switcher'). Is that
>>>> part of the same issue, or is there another problem I'm not aware of?
>>>>
>>>> Thanks,
>>>> Chris.
>>>>
>>>> On 18/06/2014 09:01, "Elena Korolkova" <[log in to unmask]>
>>>> wrote:
>>>>
>>>>> Morning,
>>>>>
>>>>> the problem which I reported at yesterday's meeting is still there.
>>>>>
>>>>> There is a low number of ATLAS jobs running in the UK (and other) clouds.
>>>>> This is related to a DDM problem (input files for jobs can't be
>>>>> transferred). ATLAS experts are working to solve it.
>>>>>
>>>>>
>>>>> Elena
>>>>
>>
>