On 23 June 2010 12:42, Christopher J. Walker <[log in to unmask]> wrote:
> Stuart Purdie wrote:
>
>> To begin, the obligatory pretty picture:
>>
>> http://lhcbweb.pic.es/DIRAC/LHCb-Production/visitor/systems/accountingPlots/dataOperation#ds9:_plotNames7:Qualitys9:_groupings6:Sources13:_timeSelectors2:-1s10:_startTimes10:2010-05-22s8:_endTimes10:2010-06-23s14:_OperationTypes14:putAndRegisters7:_Sources213:LCG.Barcelona.es,LCG.Bristol-HPC.uk,LCG.Bristol.uk,LCG.Brunel.uk,LCG.CNAF.it,LCG.Glasgow.uk,LCG.Liverpool.uk,LCG.PIC.es,LCG.Sheffield.uk,LCG.UKI-LT2-Brunel.uk,LCG.UKI-SCOTGRID-GLASGOW.uk,LCG.UNINA.it,LCG.UNIZAR.ess9:_typeNames13:DataOperatione
>>
>> (Hrm, that's a monster url: same thing at http://tinyurl.com/lhcbtransjune
>> That one is on fixed dates, not a rolling 'last month'.)
>>
>> What you're looking at is the transfer attempts + failures for LHCb
>> traffic across a number of sites, for about the past month. Note that
>> this is transfers, not jobs - a job can succeed after a couple of failed
>> transfer attempts, so this is the strictest criterion to look at.
>>
>> I've included all the sites that I can see were having problems, along
>> with PIC and CNAF to show the 'bad days' for comparison.
>>
>> The key thing to look at is Glasgow, after the 16th, when we switch from
>> yellowish green (about 50%) to dark green (near enough 100%). What
>> changed was that I tuned the TCP stack on the worker nodes (the same
>> thing YAIM does to DPM pool nodes). That resolved the problem.
>>
>> These are the sysctl parameters I set:
>>
>> # TCP buffer sizes
>> net.ipv4.tcp_rmem = 131072 1048576 2097152
>> net.ipv4.tcp_wmem = 131072 1048576 2097152
>> net.ipv4.tcp_mem = 131072 1048576 2097152
>> net.core.rmem_default = 1048576
>> net.core.wmem_default = 1048576
>> net.core.rmem_max = 2097152
>> net.core.wmem_max = 2097152
>> # SACK and timestamps - turn off
>> net.ipv4.tcp_dsack = 0
>> net.ipv4.tcp_sack = 0
>> net.ipv4.tcp_timestamps = 0
>>
> Can you follow up to the list with the previous values? It isn't clear
> from your mail what you increased/decreased.
>

The defaults are:

net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_mem = 196608 262144 393216
net.core.rmem_default = 129024
net.core.wmem_default = 129024
net.core.rmem_max = 131071
net.core.wmem_max = 131071
net.ipv4.tcp_dsack = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_timestamps = 1

(Basically, the min, starting and max values for the TCP window size were
all increased by roughly x10 - to the YAIM-tuned values for disk servers,
which happen to be a good approximation to "tuned for transfers to RAL and
CERN" - and SACK was turned off.)

Sam

> Chris
>
>
>> Ok, so that's the what - the why is not so clear. I was working on the
>> theory that the presence of the NAT boxes represented a network
>> inefficiency, and that if the transfers were given longer then they would
>> complete successfully. Therefore the approach was to try to optimise the
>> transfers from the worker nodes to CERN, so that if they went a bit
>> quicker, they'd complete before the timeouts. Note that (at least for
>> us) RAL is 12 ms away, and CERN is 27 ms away. The closer one is to
>> CERN, the smaller the effect this change should have (we might well be
>> in the worst case here, at least until UKI-SCOTGRID-SHETLAND gets off
>> the ground).
>> By tuning the worker node for a Long Fat Network, which that sort of
>> connection is, we get more data moved faster.
>> (Although the target nodes are tuned, TCP/IP is limited by the congestion
>> window on both sides, hence tweaking the worker nodes as well.) I've been
>> poking at other parameters as well, but the parameters above worked so
>> well that I can't find any differences with any others. (It's also worth
>> noting that these made no difference to transfers to or from our local SE
>> - i.e. they don't seem to cause any problems even where they're not
>> useful.)
>>
>> I'd be interested to see whether applying this sort of tuning to worker
>> nodes has any effect at the other sites that are having transfer problems
>> - Brunel, Liverpool, Sheffield and Bristol. I'd also be interested in the
>> round trip times between the worker nodes and CERN (i.e. through the NAT)
>> - I've been running traceroute to www.cern.ch and reading off the last
>> hop I can see.
>>
>> Raja - I note that Barcelona and UNIZAR both show similar (although less
>> severe) effects to the UK. Your opposite number in Spain might be
>> interested in this - certainly I'm curious about their configuration: I
>> rather suspect they have NATs and untuned worker nodes.
>>
>>
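
In case it saves anyone some typing at the sites being asked to try this,
here is a rough sketch of one way to roll the values quoted above onto a
worker node. It is untested outside the context above and assumes root
access and a stock /etc/sysctl.conf; adjust paths and values to suit your
own fabric management.

# Append the tuned values so they survive a reboot
cat >> /etc/sysctl.conf <<'EOF'
# TCP buffer sizes (YAIM disk-server values)
net.ipv4.tcp_rmem = 131072 1048576 2097152
net.ipv4.tcp_wmem = 131072 1048576 2097152
net.ipv4.tcp_mem = 131072 1048576 2097152
net.core.rmem_default = 1048576
net.core.wmem_default = 1048576
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
# SACK and timestamps - turn off
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
EOF

# Apply immediately, then spot-check a couple of the values
sysctl -p
sysctl net.ipv4.tcp_rmem net.core.rmem_max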
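
And for anyone wanting to sanity-check the Long Fat Network argument,
some back-of-the-envelope numbers. The rule of thumb is that a single TCP
stream tops out at roughly window / RTT (ignoring slow start and losses);
using the 27 ms and 12 ms figures quoted above, and plugging in your own
RTTs where they differ:

128 kB window, 27 ms to CERN:  131072 B / 0.027 s  ~  4.9 MB/s
1 MB window,   27 ms to CERN:  1048576 B / 0.027 s ~  39 MB/s
2 MB window,   27 ms to CERN:  2097152 B / 0.027 s ~  78 MB/s
128 kB window, 12 ms to RAL:   131072 B / 0.012 s  ~  11 MB/s

which suggests why a window capped at around 128 kB (the old
net.core.*_max, for applications that set their own socket buffers) hurts
transfers to CERN much more than transfers to RAL or to a local SE, and
why sites closer to CERN should see a smaller effect from the change.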