Hi Jeremy,
here is the ATLAS report. I tried to upload it, but either I can't
remember how to do it or you changed the settings.
cheers
alessandra
On 14/06/2011 09:39, Jeremy Coles wrote:
> Dear All
>
> The agenda for today's ops team and sites meeting is here: http://indico.cern.ch/conferenceDisplay.py?confId=126707.
>
> In addition to the standing items there will be a look at the material from the GDB last Wednesday, a discussion on job loads & flows at sites and an opportunity to get feedback on any specific issues being faced at your site.
>
> For minutes the order is: Chris=0 Ewan=0 Rob=0 Duncan=1 Alessandra=1 Stuart=1 David=1 Stephen=1 Catalin=1 Matt=1.
>
> regards,
> Jeremy
Here is the ATLAS report for 7/6 - 13/6.
Sites:
======
UKI-SCOTGRID-ECDF:
- Since about last Wednesday they have been At Risk for GPFS work; they are
also in brokeroff in ATLAS.
- From later this afternoon they will be in downtime until Thursday morning
(for a power outage) and will be put offline in ATLAS.
- There were some failed HC jobs because of a missing libglobus library (which
was on GPFS), so the ANALY queue was auto-blacklisted. It is not clear why it
is still blacklisted, but they are at risk anyway and it will go back after
the power outage.
ATLAS:
======
* An ATLAS user is causing some problems, particularly at Sheffield, because
her workdir is too big. Jobs die after 40h, which is a waste of resources. The
user has been contacted and the problem has been investigated.
Generic
=======
* Sonar tests display an evident asymmetry with inter-cloud transfers
from T1s which is not present in intra-cloud transfers with other T2s.
This is under investigation.
* UKI-SCOTGRID-GLASGOW and UKI-NORTHGRID-MAN-HEP have problems with FZK
transfers. iperf servers have been set up again to try to understand what is
going on. Manchester will start tests today; Glasgow is waiting for the
go-ahead from their networking team.
* RAL is investigating how to increase the number of analysis jobs, which
would attract more data. A recipe that increases the number of payloads in
pilot jobs that last less than 60 minutes has been implemented and is being
tested. If it works, it might be tested at a T2 and then rolled out to the
cloud.
Site status (online/brokeroff/offline)
======================================
UKI-SCOTGRID-ECDF: currently in brokeroff; will be put offline later.
Open Tickets
============
none.