Hello all, I thought I'd keep you up to date with our current
misadventures after dual-stacking our SE yesterday. (I also apologise for
the cross-post between Storage and cloud-support, but it seemed relevant
to both.)
At midday yesterday I dual-stacked our DPM, and we've had some fun and
games since then. At first I thought everything was okay: my manual
tests ran fine and I couldn't see any failures in DDM. Unbeknownst
to me, though, I had queried DDM incorrectly, and we actually had a 100%
failure rate, all with the error:
CGSI-gSOAP running on fts-test01.gridpp.rl.ac.uk reports could not open
connection to fal-pygrid-30.lancs.ac.uk:8446
I noticed this last night and tried a few tricks, even disabling the
ip6tables firewall on my headnode for a bit and rebooting the machine
with no joy. As we've had 50% of our pool nodes dual-stacked for quite
some time I'm 99% sure that the problem is on our headnode.
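For anyone who wants to repeat the connection check from another host, here is a minimal Python sketch that tries the SRM port once per resolved address, so IPv4 and IPv6 are tested separately. The probe() helper and its 5 second timeout are my own illustrative choices, not part of any DPM or FTS tooling.

```python
import socket


def probe(host, port, timeout=5.0):
    """Try one TCP connect to host:port for every address the resolver returns.

    Hypothetical helper for illustration only. Returns a list of
    (family_name, address, outcome) tuples, where outcome is "ok" or the
    text of the error that was raised.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Name resolution itself failed, so there is nothing to connect to.
        return [("resolve", host, str(exc))]

    results = []
    for family, socktype, proto, _canonname, sockaddr in infos:
        name = "IPv6" if family == socket.AF_INET6 else "IPv4"
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            outcome = "ok"
        except OSError as exc:
            # Covers refused connections, timeouts and unreachable networks.
            outcome = str(exc)
        results.append((name, sockaddr[0], outcome))
    return results
```

Calling probe("fal-pygrid-30.lancs.ac.uk", 8446) from a dual-stacked machine should give one row per address family, which at least shows whether only one of the two is refusing connections.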
Some joy was achieved this morning after applying two changes to our DPM
headnode (which is running SL6):
Adding this line to /etc/hosts:
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
And, prompted by a thread on the dpm-users-forum, following this advice
on adding entries to /etc/gai.conf:
https://its.cern.ch/jira/browse/LCGDM-1331
I then restarted srmv2.2 on our headnode.
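As a quick sanity check on changes like these, the sketch below prints the order in which the system resolver hands back addresses for a name. On glibc systems that ordering is what /etc/gai.conf influences, and Python's socket.getaddrinfo() goes through the same resolver. The resolution_order() helper is a hypothetical name of mine, and I'm assuming the daemons concerned resolve names through getaddrinfo() as well.

```python
import socket


def resolution_order(host, port=8446):
    """Return (family_name, address) pairs in the order the resolver gives them.

    Hypothetical helper for illustration only; the default port is simply the
    SRM port from the error message above.
    """
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [
        (names.get(family, str(family)), sockaddr[0])
        for family, _socktype, _proto, _canonname, sockaddr in infos
    ]
```

Running resolution_order("localhost") on the headnode should show whether ::1 now appears after the /etc/hosts change, and running it against the headnode's own name shows which address family is preferred.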
So now we've gone from a terrible transfer success rate to a rubbish
transfer success rate, and the range of error messages has broadened to
include the ever-popular "Error reading token data header: Connection
reset by peer" and a few "[SRM_INTERNAL_ERROR] Timed out", although
the "could not open connection" messages remain popular.
Another interesting artifact appears to be that there's little rhyme or
reason to the failures - for example we've passed some transfers from QM
(who I know are dual-stacked) but failed others.
Thanks to Brian for his advice and for providing me with FTS links to see
how things are going.
I'm going to have one more pass at the firewall, but I'd appreciate any
other advice if anyone has it. It's always nice to have more straws to
clutch at!
Thanks in advance,
Matt