Hi Chris,
No, this is a different issue. It’s because of the downtime (DT):
ATLAS Distributed Computing Site Status Board 2 <[log in to unmask]>, [log in to unmask]
[Switcher2 AutoExclusion] Summary for UK cloud at 2014-06-16 09:00 UTC
Dear UK Cloud Support,
please note that the following PanDA Site IDs have been excluded/recovered by the AutoExclusion tool:
Site UKI-SOUTHGRID-RALPP:
SiteID: ANALY_RALPP_SL6
Status changed to 'setoffline'
Reason:
UNSCHEDULED downtime
Endpoint:heplnv146.pp.rl.ac.uk:2811
Description:Investigating problems with htcondor on the node
From:2014-06-16 09:00:00
To:2014-06-20 07:00:00
https://goc.egi.eu/portal/index.php?Page_Type=Downtime&id=14628
SiteID: UKI-SOUTHGRID-RALPP_SL6
Status changed to 'setoffline'
Reason:
UNSCHEDULED downtime
Endpoint:heplnv146.pp.rl.ac.uk:2811
Description:Investigating problems with htcondor on the node
From:2014-06-16 09:00:00
To:2014-06-20 07:00:00
https://goc.egi.eu/portal/index.php?Page_Type=Downtime&id=14628
PanDA Site ID recovery information:
Recovery will be performed by the HammerCloud PFT/AFT. Once recovery notification is sent you may check progress of test jobs at [1] (PFT) or [2] (AFT). Summary of HC exclusion/recovery actions is at [3].
[1] http://panda.cern.ch/server/pandamon/query?jobsummary=cloud&select=ptest&hours=4&processingType=gangarobot-pft
[2] http://panda.cern.ch/server/pandamon/query?jobsummary=cloud&select=ptest&hours=3&processingType=gangarobot&dash=analysis
[3] http://hammercloud.cern.ch/hc/app/atlas/robot/incidents/
Somehow it’s the only CE in the RALPP queue configuration.
I can add additional CEs. Let me know which.
Thanks
Elena
On 18 Jun 2014, at 09:25, Chris Brew <[log in to unmask]> wrote:
> Hi Elena,
>
> The RALPP queues appear to be offline (“set.offline.by.Switcher”). Is that
> part of the same issue or is there another problem I’m not aware of?
>
> Thanks,
> Chris.
>
> On 18/06/2014 09:01, "Elena Korolkova" <[log in to unmask]> wrote:
>
>> Morning,
>>
>> the problem that I reported at yesterday’s meeting is still there.
>>
>> There is a low number of ATLAS jobs running in the UK (and other) clouds.
>> This is related to a DDM problem (input files for jobs can’t be
>> transferred). ATLAS experts are working to solve it.
>>
>>
>> Elena
>