Hello all,
A week later than planned due to interesting times last week, here's
October's look at all the UK tickets, site-by-site - but first let's
spin off the v6 tickets and look at those.
42 Open UK Tickets this month.
IPv6 Tickets
BRISTOL: https://ggus.eu/?mode=ticket_info&ticket_id=131613
Last update 4/10. There has been some recent conversation on this
ticket: the servers are configured for v6, but things are not quite
right - perhaps an issue with the v6 routing?
UCL: https://ggus.eu/?mode=ticket_info&ticket_id=131604
Last update 4/10. Some positive news, with hopefully some v6 addresses
rolling out soon - but that old bugbear of v6 DNS being a problem is
showing up again. Ben finishes his update by asking whether the firewall
rules remain the same between v6 and v4.
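Since missing v6 DNS (i.e. no AAAA records) is the recurring blocker in several of these tickets, here's a minimal illustrative check - not anything from the ticket itself, just a Python sketch, with the example hostname being a placeholder:

```python
import socket

def has_aaaa(hostname):
    """Return True if hostname resolves to at least one IPv6 address."""
    try:
        # AF_INET6 restricts the lookup to IPv6 (AAAA) results.
        results = socket.getaddrinfo(hostname, None, socket.AF_INET6)
        return len(results) > 0
    except socket.gaierror:
        # No AAAA record, or resolution failed entirely.
        return False

# Example with a hypothetical perfsonar host:
# print(has_aaaa("ps-bandwidth.example.ac.uk"))
```

Running this against a site's service hostnames is a quick way to see whether the v6 DNS side is in place yet.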
DURHAM: https://ggus.eu/?mode=ticket_info&ticket_id=131609
Last update 10/7. Any news on this front? There seemed to be a lot of
exasperation in the last, short post back in July.
CAMBRIDGE: https://ggus.eu/?mode=ticket_info&ticket_id=131614
Last update 25/9. Some good news here, with the Cambridge perfsonars
dualstacked and added to the mesh. Duncan noticed some low throughputs
for the v6 traffic, but they are otherwise working.
BIRMINGHAM: https://ggus.eu/?mode=ticket_info&ticket_id=131612
Last update 27/8. At the last update in August Mark mentioned that their
Central I.T. was waiting on some shiny new infrastructure before they
could provide v6 DNS. Has a timescale on getting this rolled out
appeared in the last 6 weeks? Pressure definitely needs to be applied I
think.
LIVERPOOL: https://ggus.eu/?mode=ticket_info&ticket_id=131606
Last update 4/6. Steve and Co were waiting on new switches so that their
v6 performance wouldn't be terrible, plus there were internal
negotiations going on. Any news on any of this?
RHUL: https://ggus.eu/?mode=ticket_info&ticket_id=131603
Last update 10/9 (from Duncan). Any news at all on this? The perfsonar
was dual-stacked but, as Duncan pointed out, there's no v6 DNS.
OXFORD: https://ggus.eu/?mode=ticket_info&ticket_id=131615
Last update 13/7. Kashif once again mentioned v6 DNS as a blocker. Any
progress pressuring them?
GLASGOW: https://ggus.eu/?mode=ticket_info&ticket_id=131611
Last update 4/9. It's not been that long since your last update where
your move to Plan B (or is it Plan C, D, or Z?) was mentioned. Any news
in that short space of time?
RALPP: https://ggus.eu/?mode=ticket_info&ticket_id=131616
Last (proper) update 16/1. Any news here? Things seemed really positive
for a while, but that was 2 seasons ago. Really needs an update.
MANCHESTER: https://ggus.eu/?mode=ticket_info&ticket_id=131607
Last (proper) update 25/4. The perfsonars are dualstacked, but no news
on the storage (again due to v6 DNS problems, IIRC). Another ticket that
really needs an update.
ECDF: https://ggus.eu/?mode=ticket_info&ticket_id=131610
Last update: 6/9. There have been some IPv6 misadventures at ECDF, but a
lot of effort has been put into getting things working. Any luck on
getting your pool nodes dualstacked (or finding out when you'll be able
to do this)?
SHEFFIELD: https://ggus.eu/?mode=ticket_info&ticket_id=131608
Last update 10/7. Things were looking up for a while in the last update,
but I take it from the silence since then that things haven't made much
progress?
Back to the regular tickets, site-by-site as is the tradition.
RALPP
https://ggus.eu/?mode=ticket_info&ticket_id=137633 (8/10)
A very fresh CMS ticket for transfer failures to RALPP. Assigned (8/10)
https://ggus.eu/?mode=ticket_info&ticket_id=137361 (24/9)
A t2k ticket, where a user notices that you can upload a zero-sized file
but cannot then download it. I'm not sure why this is relevant, but
Chris can replicate it with the Imperial SE and reckons it's a dCache
"feature". It might be that this will go unresolved. In progress (26/9)
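For anyone wanting to reproduce the check, the failure mode is a round trip that should be a no-op: upload a zero-byte file, download it back, and compare sizes. A toy Python version of that check, with local file copies standing in for the SE transfers (in the ticket itself this would be done with gfal/lcg tools against the SE):

```python
import os
import shutil
import tempfile

def round_trip_ok(src, staging_dir):
    """Copy src into staging_dir and back again, then check the sizes
    still match. Mimics the upload-then-download test from the ticket,
    using plain file copies in place of real SE transfers."""
    uploaded = os.path.join(staging_dir, "uploaded.dat")
    shutil.copyfile(src, uploaded)           # the "upload"
    downloaded = src + ".back"
    shutil.copyfile(uploaded, downloaded)    # the "download"
    return os.path.getsize(src) == os.path.getsize(downloaded)

with tempfile.TemporaryDirectory() as tmp:
    zero = os.path.join(tmp, "empty.dat")
    open(zero, "wb").close()                 # create a zero-sized file
    print(round_trip_ok(zero, tmp))          # local copies succeed: True
```

Against a dCache endpoint the download leg is where the t2k user saw it fall over.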
GLASGOW
https://ggus.eu/?mode=ticket_info&ticket_id=134689 (23/4)
Request to upgrade the Glasgow perfsonars. With the release of 4.1
Gareth is working on it, and would like to build the new perfsonars
using the docker images. Duncan has suggested giving it a go with the
perfsonar-testpoint image to see how things go. In progress (21/9)
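For reference, trying the testpoint image Duncan suggested is roughly a two-liner - this is just a sketch based on the public perfsonar/testpoint Docker Hub image, so check the perfSONAR docs for the currently recommended flags:

```shell
# Fetch and run the perfSONAR testpoint container on the host network
# (host networking so the measurement tools see the real interfaces).
docker pull perfsonar/testpoint
docker run -d --name perfsonar-testpoint --net=host perfsonar/testpoint
```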
ECDF
https://ggus.eu/?mode=ticket_info&ticket_id=137627 (8/8)
A ROD ticket for failed SRM tests; Rob notes these were likely caused by
some storage on a disk server falling over. Being fixed (and indeed the
tests are working now). In progress (8/8)
DURHAM
https://ggus.eu/?mode=ticket_info&ticket_id=134687 (23/4)
The Durham request to upgrade perfsonar. Adam has put upgrading onto
their todo list in the last update. In progress (26/9)
SHEFFIELD
https://ggus.eu/?mode=ticket_info&ticket_id=137491 (1/10)
Atlas transfer failures to Sheffield. Acknowledged by the site, but have
you had any luck tackling the issue? From today's update by the DDM
shifters it looks like it's ongoing. In progress (8/10)
MANCHESTER
https://ggus.eu/?mode=ticket_info&ticket_id=137112 (11/9)
Atlas spotted that SRM space reporting at Manchester was broken. Robert
set them straight - it was due to a bug in a draining script moving data
outside of the tokens. Fixing this is a slow process; Robert estimated
it would take of the order of weeks. On hold (20/9)
LIVERPOOL
https://ggus.eu/?mode=ticket_info&ticket_id=137458 (28/9)
Liverpool's SE not working for biomed, due to there being no space left
in the communal area. Liverpool have a spacetoken for biomed, but it was
going unused. John and Stephan helped the VO work out how to query
these. I suspect this ticket can be closed soon. In progress (4/10)
LANCASTER
https://ggus.eu/?mode=ticket_info&ticket_id=136635 (9/8)
Low availability ticket for Lancaster. It's been tough to get a clear 30
days due to a collection of downtimes and the Lancaster SE playing up a
bit. On hold (8/10)
QMUL
https://ggus.eu/?mode=ticket_info&ticket_id=132929 (18/1)
APEL accounting ticket for QM's slurm batch system. Lots of discussion
and a related APEL ticket
(https://ggus.eu/index.php?mode=ticket_info&ticket_id=118969). I'm not
sure if there's much further input to be had from the site for now? In
progress (12/9)
https://ggus.eu/?mode=ticket_info&ticket_id=137180 (13/9)
A t2k ticket complaining about QM data access being slow. One of several
tickets tackling a known issue with the QMUL StoRM (particularly SRM).
Dan helpfully provided a bunch of alternatives and suggestions to help
the user - so useful it should be documented! In progress (14/9)
https://ggus.eu/?mode=ticket_info&ticket_id=137631 (8/10)
A fresh ROD ticket - all SE-based tests... Assigned (8/10)
https://ggus.eu/?mode=ticket_info&ticket_id=136719 (15/8)
LHCB having file access problems at QM (although I think file metadata
access problems would be more exact). The ticket mentions a database
move - did you get round to this? In progress (18/9)
https://ggus.eu/?mode=ticket_info&ticket_id=137622 (8/10)
LHCB FTS transfer problems - Dan notes that a rack had power problems
which required physical intervention. The rack is back on so hopefully
transfers will work again. In progress (8/10)
https://ggus.eu/?mode=ticket_info&ticket_id=137617 (7/10)
An atlas ticket for the same issues. In progress (8/10)
https://ggus.eu/?mode=ticket_info&ticket_id=134573 (17/4)
A request from CMS to install Singularity. Dan mentioned right at the
start that this would be part of their CentOS7 move. Is this on the
horizon? On hold (17/4)
IMPERIAL
https://ggus.eu/?mode=ticket_info&ticket_id=137468 (28/9)
CMS production job stage-outs from Brunel to I.C. failing. Daniela
cannot reproduce this by hand even though the problems persist for CMS.
An environment to test this out by hand has been provided so Raul could
try it out on a WN directly. In progress (5/10)
https://ggus.eu/?mode=ticket_info&ticket_id=137352 (24/9)
CMS noticed a few transfers failing - a pool node had fallen over and
then had filesystem troubles. All fixed now, so we're in the "wait and
see if things go green" stage. Waiting for reply (8/10)
https://ggus.eu/?mode=ticket_info&ticket_id=136687 (28/9)
Loosely related to 137468 (this ticket uncovered that issue): CMS
stage-out failures at Brunel. I didn't quite follow the thread, but
diagnostics were being run over the weekend. Did they reveal anything?
In progress (5/10)
https://ggus.eu/?mode=ticket_info&ticket_id=137451 (28/9)
LHCB data transfer problems at Brunel. A lack of information had made
Raul's job of debugging this difficult. Vladimir responded today with
something that could help a bit. In progress (8/10)
https://ggus.eu/?mode=ticket_info&ticket_id=133956 (9/3)
CMS xroot config change ticket. In July a multi-point plan was laid out,
how goes it? In progress (3/7)
100IT have a ticket: https://ggus.eu/?mode=ticket_info&ticket_id=137306
Orphaned ticket: https://ggus.eu/?mode=ticket_info&ticket_id=136687
I think this ticket regarding third-party HTTP transfers and the FTS can
be closed; I'm not sure anyone's looking at it.
THE TIER 1
https://ggus.eu/?mode=ticket_info&ticket_id=137195 (14/9)
A ROD ticket due to BDII problems causing SRM test failures. The issues
are known about and being worked through, but at the last check on
Friday the problems persisted. In progress (5/10)
https://ggus.eu/?mode=ticket_info&ticket_id=137391 (25/9)
Atlas seeing poor transfer efficiency to tape and disk at RAL. Tim
narrowed the errors down to a pair of sources. One of the sources
(TRIUMF) has spotted the cause at their side (a v6 networking issue I
failed to fully understand); it may or may not be a similar problem for
EELA-UTFSM. In progress (5/10)
https://ggus.eu/?mode=ticket_info&ticket_id=136701 (14/8)
LHCB noticing a high (5%) background failure rate for jobs at RAL. The
theory is a network issue or a problem with Castor. Waiting on the
submitter to get back from his hols. Waiting for reply (24/9)
https://ggus.eu/?mode=ticket_info&ticket_id=136199 (18/7)
LHCB stuck FTS transfers. There's been a long break in looking at this,
waiting on Catalin to get back from a well-earned break. On hold (1/10)
https://ggus.eu/?mode=ticket_info&ticket_id=137153 (12/9)
A t2k ticket about 0-sized files, asking how to deal with them in the
LFC (where they seem to have a bunch of these). It appears to be
unrelated to the recent LFC issues. There's some discussion at the Tier
1 about what to do. In progress (25/9)
https://ggus.eu/?mode=ticket_info&ticket_id=137634 (8/10)
A fresh CMS ticket regarding some transfer failures to METU (although
the error could be at either end). In progress (8/10)
https://ggus.eu/?mode=ticket_info&ticket_id=124876 (7/11/16)
The old ECHO gridftp test ticket - it now looks like a simple
authorisation problem with the test's robot certificate, so maybe we're
nearly there fixing this. In progress (8/10)
And that's all the tickets! I'm looking forward to a bright future where
we don't have quite so many open.
Cheers all!
Matt