Hello,
Here's the ticket update for tomorrow's UK sites meeting. It's a little
bit lighter than last week's.
Cheers,
Matt
24 open UK tickets this week. A couple of sites forgot to "In Progress"
their tickets after starting work on them last week, so I stepped in and
interfered with them. No sign of any sites not being notified about
their tickets this week.
UK
https://ggus.eu/ws/ticket_info.php?ticket=80259 (14/4)
Creation of the neurogrid.incf.org VO. WMS & LFC request ticket
submitted (84408 below) (20/7). GGUS registration ticket submitted
(84848) (6/8)
TIER 1
https://ggus.eu/ws/ticket_info.php?ticket=84655 (30/7)
SNO+ are having trouble using the RAL WMSs (WMSIi? WMSices?). The RAL
team have been trying some config tweaks to hit the sweet spot for SNO+.
One WMS is working (1/8); waiting for a reply from Matt M to see if the
other is working now (3/8).
https://ggus.eu/ws/ticket_info.php?ticket=84408 (20/7)
Request to enable neurogrid on WMS & LFC; some delays due to holidays.
Catalin asked if the LFC is to be "local or central", Jeremy replied
that it should probably be central (needs double checking). No news on
WMS, which is the priority. (31/7)
https://ggus.eu/ws/ticket_info.php?ticket=84492 (24/7)
SNO+ jobs were not being matched to their queue at RAL. This seems to be a
problem with jobs (submitted via Ganga to the WMS) matching against
GlueHostMainMemoryVirtualSize (which was not set) rather than
GlueHostMainMemoryRAMSize. GlueHostMainMemoryVirtualSize has now been set
for the queue in question. Waiting for reply from SNO+ (27/7).
https://ggus.eu/ws/ticket_info.php?ticket=83927 (6/7)
SNO+ attempting to get FTS to work for them. This ticket has been around
the houses before settling on the RAL FTS (24/7). After a few
tweaks much progress seems to have been made; hopefully things will work
now. Waiting for reply (26/7). Some additional advice on how to check
transfers was given to SNO+ (1/8).
QMUL
https://ggus.eu/ws/ticket_info.php?ticket=84793 (3/8)
Hone were seeing job problems (originally thought to be one of their
usual "in scheduled status for too long" tickets). Chris revealed that a
problem with the batch system was actually the culprit. Jobs are flowing
for hone again; it looks like the ticket can be closed. (6/8)
IC
https://ggus.eu/ws/ticket_info.php?ticket=84760 (2/8)
Hone were seeing jobs being cancelled on IC queues. IC suffered from
power problems followed by the CE playing up. *Should* be fixed as of
Friday, waiting for confirmation from hone (should be "Waiting for Reply"
really). (3/8)
BRUNEL
https://ggus.eu/ws/ticket_info.php?ticket=84639 (30/7)
Brunel's DPM is below the WLCG recommended level. Request from Brian for
upgrade plans (related to 68853). Raul has replied stating that the SE
is small, with only a few TB of non-LHC data on it. He confirms that it
will be upgraded before the deadline (I'm not sure what that deadline is
though). (31/7)
DURHAM
https://ggus.eu/ws/ticket_info.php?ticket=84123 (11/7)
High atlas production failure rate. Mike offlined suspect nodes (19/7)
but the UK cloud has set Durham to test mode (20/7). Power work went
"badly" (smoking PDU badly) (1/8). PDU stabilisation work was to take
place on 2nd Aug, but no update from the site since (1/8).
The other Durham tickets can and should be put On Hold if infrastructure
problems continue (progress requires power):
https://ggus.eu/ws/ticket_info.php?ticket=83950 (7/7)
lhcb cvmfs problems. First attempts at triage failed, and recent
attempts by lhcb to confirm the problem is fixed have been blocked by job
submission problems (26/7).
https://ggus.eu/ws/ticket_info.php?ticket=68859 (22/3/11)
Brian's request for DPM upgrade plans. As of 19/1 the site still had disk
servers to update. On 30/7 Brian requested some more information.
https://ggus.eu/ws/ticket_info.php?ticket=75488 (19/10/11)
Compchem were seeing authentication problems at a number of sites,
including Durham. On Hold (19/1). Mark M will poke Mike to see
if the ticket can be closed (18/7).
GLASGOW
https://ggus.eu/ws/ticket_info.php?ticket=83283 (14/6)
lhcb cvmfs problems. Dave cited
https://savannah.cern.ch/bugs/index.php?95420 &
https://savannah.cern.ch/support/?129468 (18/6). Have there been any
plans to try out the newer versions of cvmfs (or does the problem even
still exist)? Jeremy set to Waiting for Reply (31/7).
SUSSEX
https://ggus.eu/ws/ticket_info.php?ticket=81784 (1/5)
It looks like this journey is nearing its end. Emyr has experienced the
joy that only "all green" site nagios pages can bring. Congratulations!
Next steps will be discussed in this week's meeting. (6/8)
SOLVED CASES
https://ggus.eu/ws/ticket_info.php?ticket=84487
SNO+ were having curl problems at Oxford (although the same command
worked at QMUL). It seems that SNO+ require very up-to-date SL/RHEL "CA
bundles" to get the latest DigiCert CA. A mirror
problem meant that this update was missed at Oxford. If everyone has the
latest security updates installed (which we should do) then, as Ewan
pointed out, this problem should never be seen again. But still worth noting.
https://ggus.eu/ws/ticket_info.php?ticket=83213
Chris W ticketed ngs.ac.uk concerning the decommissioning of
ce03.esc.qmul.ac.uk. After a long while they got back to Chris saying
that there's "Nothing to do here". It appears that the NGS don't need
to be notified in this way when they're removed from a CE, as long as the
corresponding entries are removed from the Information System.
TICKETS FROM THE UK
No exciting happenings on this front that I can see.