Hello all,
I'm afraid it's a long one!
Cheers,
Matt
37 Open UK tickets this week. It's the start of the month so it's time
for a deep review.
Anyone else not been receiving their ticket reminders? I haven't for
several Lancaster tickets.
UK
https://ggus.eu/ws/ticket_info.php?ticket=84408 (20/7)
Setting up of neurogrid.incf.org WMS & LFC. Both have been put in place,
Catalin wonders if the LFC can be tested? Waiting for reply (29/8)
https://ggus.eu/ws/ticket_info.php?ticket=80259 (14/3)
neurogrid.incf.org creation ticket. Nearly finished now. In Progress (29/8)
https://ggus.eu/ws/ticket_info.php?ticket=68853 (22/3/11)
Brian's ticket to track older DPMs in the UK. Still have Durham, Bristol
and Brunel to go at last update (but Brunel are retiring their old SE).
On Hold (30/7)
https://ggus.eu/ws/ticket_info.php?ticket=84381 (19/7)
Setting up the COMET VO. Registering in EU Ops Portal (ticket 85736), On
hold till this is done (3/9).
https://ggus.eu/ws/ticket_info.php?ticket=82492 (24/5)
Chris' ticket to change the reminder periods for the GridPP VOMS server.
Assigned to Robert Frank, On Hold during VOMS transition (28/8)
TIER 1
https://ggus.eu/ws/ticket_info.php?ticket=85438 (23/8)
atlas were seeing FTS transfer failures from RAL. Some files have been
corrupted, may have to get replacements from tape. Waiting for Reply (31/8)
https://ggus.eu/ws/ticket_info.php?ticket=85077 (13/8)
Biomed were seeing their nagios tests fail to register files at RAL, but
looks to be a (peculiar) problem with their SAM jobs. Other units are
involved. In Progress (3/9).
https://ggus.eu/ws/ticket_info.php?ticket=85023 (9/8)
SNO+ having troubles with one of the RAL WMSi. No reply after request to
attempt job submission to lcgwms02. Waiting for Reply (10/8)
https://ggus.eu/ws/ticket_info.php?ticket=84492 (24/7)
SNO+ having job-matching problems at RAL. Some odd behaviour, but In
Progress (31/8)
GLITE 3.1 Upgrade tickets (14/8):
https://ggus.eu/ws/ticket_info.php?ticket=85189 (UCL) In Progress (29/8)
https://ggus.eu/ws/ticket_info.php?ticket=85185 (CAMBRIDGE) In Progress
(29/8)
https://ggus.eu/ws/ticket_info.php?ticket=85183 (GLASGOW) On hold (14/8)
https://ggus.eu/ws/ticket_info.php?ticket=85181 (DURHAM) In Progress (On
hold?) (14/8)
https://ggus.eu/ws/ticket_info.php?ticket=85179 (Brunel) In Progress (22/8)
UK/SAM/GOCDB
https://ggus.eu/ws/ticket_info.php?ticket=85449 (23/8)
Bristol cancelled an ongoing downtime but weren't brought out of it by
the system, thus penalising them. Winnie is out to find the cause of the
problem, and get back the lost uptime. Reset to "In Progress" after some
ticket tennis (3/9)
PHENO/BRUNEL
https://ggus.eu/ws/ticket_info.php?ticket=85011 (28/8)
Pheno seem to be surprised that they have data on the retiring Brunel
SE. In Progress (28/8)
SUSSEX
https://ggus.eu/ws/ticket_info.php?ticket=81784 (1/5)
The Sussex Certification Chronicle. Jeremy wants to push getting Sussex
out of downtime this week to avoid having to re-certify. In Progress (3/9)
UCL
https://ggus.eu/ws/ticket_info.php?ticket=85467 (24/8)
Atlas transfer errors to UCL. Clock skew on the head node took some of
the blame, but seeing more failures with "Error reading token data
header" messages. In Progress (30/8)
https://ggus.eu/ws/ticket_info.php?ticket=85549 (28/8)
Last of the User DN accounting tickets (the last child of 85547). In
Progress (28/8)
DURHAM
https://ggus.eu/ws/ticket_info.php?ticket=85679 (31/8)
se01 failing Ops tests.
https://ggus.eu/ws/ticket_info.php?ticket=85731 (3/9)
ce01 failing APEL Pub tests.
https://ggus.eu/ws/ticket_info.php?ticket=84123 (11/7)
atlas production failures. On hold as Mike expects slow progress (3/9).
https://ggus.eu/ws/ticket_info.php?ticket=83950 (7/7)
lhcb cvmfs errors. On hold (7/8)
https://ggus.eu/ws/ticket_info.php?ticket=68859 (22/3/11)
SE Upgrade ticket. Probably should be On Hold. (28/8).
https://ggus.eu/ws/ticket_info.php?ticket=75488 (19/10/2011)
CompChem job failures at Durham. On hold due to the other problems, but
once out of the woods worth checking that the problem persists. (8/8).
GLASGOW
https://ggus.eu/ws/ticket_info.php?ticket=85025 (9/8)
SNO+ were having problems with one of the Glasgow WMSs (twinned ticket
to 85023). Stuart asked for the FQAN used for the jobs as the problems
seemed voms related, but no news since. Waiting for Reply (10/8)
https://ggus.eu/ws/ticket_info.php?ticket=83283 (14/6)
LHCB seeing high rate of job failures, likely to be caused by cvmfs.
Glasgow upgraded all their nodes to the latest cvmfs but failures are
still seen on the "high-core" nodes, correlated with high numbers of
atlas job start up. Investigation continues. In Progress (30/8)
OXFORD
https://ggus.eu/ws/ticket_info.php?ticket=85496 (25/8)
LHCB has job failures that were not cvmfs related (they reckoned a lack
of 32-bit gcc rpms or some OS difference). The problem seems to have
evaporated though; did anything change? In progress, probably can be
closed (31/8)
IC
https://ggus.eu/ws/ticket_info.php?ticket=85524 (27/8)
Hone had problems submitting jobs through the Imperial WMS' due to
"System load is too high" errors. Some magic was worked, and Hone see a
massive improvement and propose to close the ticket. Can be closed (31/8).
LANCASTER (to my shame)
https://ggus.eu/ws/ticket_info.php?ticket=85412 (22/8)
JobSubmit tests failing to one of Lancaster's CEs. With help from
LCG-SUPPORT tracked to a desync between ICE on the WMS & the CREAM. Best
solution is cream reinstall, which is undergoing planning. On hold (3/9)
https://ggus.eu/ws/ticket_info.php?ticket=85367 (20/8)
Lancaster's other CE isn't working well for ILC. Would like to
reinstall, but will wait until ticket 85412 is solved. On hold (3/9)
https://ggus.eu/ws/ticket_info.php?ticket=84583 (26/7)
Similarly LHCB are having problems on the same node. Lancaster is
suffering a ticket pileup. On hold (3/9)
https://ggus.eu/ws/ticket_info.php?ticket=84461 (23/7)
T2K transfers fail from RAL to Lancaster. Looks to be a networking
problem. With new routing to be put in place soon hopefully this problem
will disappear, as it has eluded understanding. On hold (3/9)
BRISTOL
https://ggus.eu/ws/ticket_info.php?ticket=85286 (17/8)
CMS transfers to Bristol failing. Winnie tracked to a maxed out
datalink. In Progress (20/8)
https://ggus.eu/ws/ticket_info.php?ticket=80155 (12/3/11)
SE upgrade ticket. Bristol are prepping for the upgrade, with a test
server. On hold (17/8)
RALPP
https://ggus.eu/ws/ticket_info.php?ticket=85019 (9/8)
ILC were having problems running jobs at RALPP. Needed a lot of
configuration work, but progress made. In Progress (23/8)
RHUL
https://ggus.eu/ws/ticket_info.php?ticket=83627 (27/6)
Biomed seeing negative published space. Repeat of ticket 81439. Despite
great efforts this remains so far unsolved. On hold (31/8)
No exciting tickets from the UK or solved UK tickets that I can see this
week (which seems to be very often the case which makes me suspect I'm
missing something!).