On 24 February 2011 07:55, Alessandra Forti <[log in to unmask]> wrote:
>> As this seems to be of general interest, I've taken the liberty of posting
>> it here.
>
> Indeed. Sam said you already compared notes between QMUL and Glasgow. I
> meant to extend to
> atlas-uk-comp-operations this morning (added in case someone is not on both
> lists).
>
>> I've attached the output.
>
> thanks
>
>> I suspect the interesting thing in these traceroutes is the round trip
>> times.
>
> I need to look at each T1, but taking TRIUMF as an example, the traceroute
> RTTs conflict with the sonar results.
>
> Sonar tests (large file transfers only)
> =======
>
> 7  UKI-NORTHGRID-MAN-HEP  UK - T2  TRIUMF-LCG2  CA - T1   1.78+-0.27  2.0+-0.010
> 4  UKI-LT2-QMUL           UK - T2  TRIUMF-LCG2  CA - T1  11.34+-1.65  2.0+-0.010
>
> Traceroute
> =======
> QMUL 16 hops and longer RTTs
> MAN-HEP 12 hops with shorter RTTs
> ====
>
> QMUL
>
> traceroute to srm.triumf.ca (206.12.1.10), 30 hops max, 40 byte packets
> 1 194.36.10.254 (194.36.10.254) 0.224 ms 0.209 ms 0.232 ms
> 2 ic-gsr.lmn.net.uk (194.83.102.37) 0.402 ms 0.389 ms 0.373 ms
> 3 so-1-0-0.lond-sbr1.ja.net (146.97.42.61) 3.596 ms 3.581 ms 3.654 ms
> 4 as1.lond-sbr3.ja.net (146.97.33.158) 1.934 ms 1.989 ms 1.973 ms
> 5 janet.rt1.lon.uk.geant2.net (62.40.124.197) 2.163 ms 2.196 ms 2.182 ms
> 6 as1.rt1.ams.nl.geant2.net (62.40.112.137) 10.317 ms 10.355 ms 10.339 ms
> 7 so-2-0-0.rt1.fra.de.geant2.net (62.40.112.9) 17.192 ms 17.178 ms 17.161 ms
> 8 abilene-wash-gw.rt1.fra.de.geant2.net (62.40.125.18) 200.701 ms 215.566 ms 216.638 ms
> 9 64.57.28.100 (64.57.28.100) 168.078 ms 167.686 ms 167.628 ms
> 10 ge-6-2-0.0.rtr.kans.net.internet2.edu (64.57.28.36) 157.350 ms 157.253 ms 157.344 ms
> 11 xe-0-0-0.0.rtr.salt.net.internet2.edu (64.57.28.24) 181.672 ms 181.403 ms 181.361 ms
> 12 xe-1-0-0.0.rtr.seat.net.internet2.edu (64.57.28.105) 197.073 ms 198.863 ms 198.846 ms
> 13 * * *
> 14 c4-bcnet.canet4.net (205.189.32.193) 224.092 ms 224.451 ms 224.518 ms
> 15 R2-TRIUMF-ORAN.BC.net (142.231.1.50) 225.601 ms 225.861 ms 226.319 ms
> 16 r4-r1.triumf.ca (142.90.92.22) 227.379 ms 228.024 ms 228.646 ms
>
>
> MAN-HEP
>
> traceroute to srm.triumf.ca (206.12.1.10), 30 hops max, 40 byte packets
> 1 195.194.104.250 (195.194.104.250) 0.244 ms 0.317 ms 0.374 ms
> 2 194.66.26.57 (194.66.26.57) 0.299 ms 0.400 ms 0.490 ms
> 3 so-1-2-0.leed-sbr1.ja.net (146.97.42.169) 1.465 ms 1.533 ms 1.570 ms
> 4 so-5-1-0.lond-sbr1.ja.net (146.97.33.98) 5.846 ms 5.840 ms 5.872 ms
> 5 as1.lond-sbr3.ja.net (146.97.33.158) 6.253 ms 6.244 ms 6.234 ms
> 6 janet.rt1.lon.uk.geant2.net (62.40.124.197) 6.343 ms 6.316 ms 6.325 ms
> 7 as1.rt1.ams.nl.geant2.net (62.40.112.137) 14.533 ms 14.571 ms 14.631 ms
> 8 canarie-gw.rt1.ams.nl.geant2.net (62.40.124.222) 125.955 ms 125.832 ms 125.844 ms
> 9 clgr1rtr1.canarie.ca (205.189.32.162) 166.942 ms 166.913 ms 166.958 ms
> 10 c4-bcnet.canet4.net (205.189.32.193) 178.089 ms 178.071 ms 178.061 ms
> 11 R2-TRIUMF-ORAN.BC.net (142.231.1.50) 179.022 ms 179.111 ms 179.314 ms
> 12 r4-r1.triumf.ca (142.90.92.22) 178.688 ms 178.685 ms 178.646 ms
>
GLASGOW
svr018:~# traceroute srm.triumf.ca
traceroute to srm.triumf.ca (206.12.1.10), 30 hops max, 40 byte packets
1 130.209.239.3 (130.209.239.3) 0.373 ms 0.393 ms 0.480 ms
2 130.209.2.233 (130.209.2.233) 0.334 ms 0.362 ms 0.403 ms
3 130.209.2.122 (130.209.2.122) 0.990 ms 1.176 ms 1.171 ms
4 glasgowpop-ge1-2-glasgowuni-ge1-1-v152.clyde.net.uk (194.81.62.153) 1.106 ms 0.942 ms 1.073 ms
5 so-2-0-0.glas-sbr1.ja.net (146.97.40.97) 1.113 ms 1.107 ms 1.086 ms
6 ae14.warr-sbr1.ja.net (146.97.33.121) 5.720 ms 5.508 ms 5.782 ms
7 so-5-1-0.read-sbr1.ja.net (146.97.33.89) 9.656 ms 9.747 ms 9.743 ms
8 as0.lond-sbr3.ja.net (146.97.33.166) 11.130 ms 11.071 ms 11.068 ms
9 janet.rt1.lon.uk.geant2.net (62.40.124.197) 12.901 ms 12.797 ms 12.785 ms
10 as1.rt1.ams.nl.geant2.net (62.40.112.137) 19.370 ms 19.307 ms 19.348 ms
11 canarie-gw.rt1.ams.nl.geant2.net (62.40.124.222) 118.983 ms 118.980 ms 118.972 ms
12 clgr1rtr1.canarie.ca (205.189.32.162) 159.447 ms 159.434 ms 160.136 ms
13 c4-bcnet.canet4.net (205.189.32.193) 170.518 ms 170.809 ms 170.795 ms
14 * * *
15 r4-r1.triumf.ca (142.90.92.22) 171.611 ms 171.255 ms 171.644 ms
16 * * *
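
For comparing the three traces, a small parser can pull out the per-hop minimum RTT and show where the latency jumps. This is only a sketch: the function names are mine, the sample is a few hops from the Glasgow trace, and it assumes each hop sits on a single line (wrapped lines need joining first).

```python
import re

HOP = re.compile(r"^\s*(\d+)\s+(\S+)")   # hop number and host name
RTT = re.compile(r"([\d.]+) ms")          # each probe time in ms

def hop_rtts(text):
    """Return (hop, host, min RTT in ms) for every hop that answered."""
    hops = []
    for line in text.splitlines():
        m = HOP.match(line)
        times = [float(t) for t in RTT.findall(line)]
        if m and times:  # skips the header line and "* * *" hops
            hops.append((int(m.group(1)), m.group(2), min(times)))
    return hops

def largest_jump(hops):
    """Return (hop, host, delta ms) for the biggest RTT increase between answering hops."""
    best = None
    for prev, cur in zip(hops, hops[1:]):
        delta = cur[2] - prev[2]
        if best is None or delta > best[2]:
            best = (cur[0], cur[1], delta)
    return best

SAMPLE = """\
10 as1.rt1.ams.nl.geant2.net (62.40.112.137) 19.370 ms 19.307 ms 19.348 ms
11 canarie-gw.rt1.ams.nl.geant2.net (62.40.124.222) 118.983 ms 118.980 ms 118.972 ms
12 clgr1rtr1.canarie.ca (205.189.32.162) 159.447 ms 159.434 ms 160.136 ms
13 c4-bcnet.canet4.net (205.189.32.193) 170.518 ms 170.809 ms 170.795 ms
14 * * *
15 r4-r1.triumf.ca (142.90.92.22) 171.611 ms 171.255 ms 171.644 ms
"""

hop, host, delta = largest_jump(hop_rtts(SAMPLE))
print(f"largest jump: +{delta:.1f} ms arriving at hop {hop} ({host})")
```

Using the minimum of the three probes per hop is a choice to reduce the effect of a router answering ICMP slowly; it does not remove it, so intermediate hops that look slower than the final hop should not be over-interpreted.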
>
> The guys from fasterdata.es.net gave an excellent talk at HEPiX in San
> Francisco. I'm sure I mentioned it shortly afterward.
>
> The executive summary is that for fatter pipes and longer distances you need
> more packets in flight. The SL defaults were relatively poor for people with
> high bandwidth links (possibly specifically for SL4 kernels) and some
> algorithms were better than others (and SL4 didn't include some of the
> better ones). Packet loss also has a disproportionate effect at longer
> distances (more packets to retransmit).
>
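
To put numbers on "more packets in flight": the amount of data that has to be unacknowledged to keep a link full is the bandwidth-delay product (bandwidth x RTT). A rough sketch, using the approximate final-hop RTTs from the traceroutes above; the 1 Gbit/s link speed and the 4 MiB window are assumptions for illustration, not measured or default values.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bits_per_s / 8 * rtt_ms / 1000

def max_throughput_bps(window_bytes, rtt_ms):
    """Upper bound on single-stream TCP throughput for a given window."""
    return window_bytes * 8 / (rtt_ms / 1000)

# Approximate RTTs to r4-r1.triumf.ca taken from the traceroutes (ms)
RTTS_MS = {"QMUL": 228.0, "MAN-HEP": 178.7, "GLASGOW": 171.5}

LINK_BPS = 1e9           # assumption: 1 Gbit/s path
WINDOW_BYTES = 4 * 2**20  # assumption: hypothetical 4 MiB maximum TCP window

for site, rtt in RTTS_MS.items():
    need = bdp_bytes(LINK_BPS, rtt) / 2**20
    cap = max_throughput_bps(WINDOW_BYTES, rtt) / 1e6
    print(f"{site}: need ~{need:.1f} MiB in flight; "
          f"a 4 MiB window caps one stream at ~{cap:.0f} Mbit/s")
```

The same window gives less throughput as the RTT grows, which is why buffer limits that are fine within the UK can become the bottleneck for a transatlantic transfer.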
We're a mix of SL4 and SL5 nodes at Glasgow (because I need to migrate
the SL4 storage nodes to a new partitioning scheme to minimise the
effects of DPM being incapable of handling differently-sized
filesystems in a pool).
I note that DPM's tuning turns SACK *off* on pool nodes, which is
potentially very bad for long distance transfers of large files. I
asked J-P why it does so, and he couldn't give a reason other than 'it
was what CASTOR did when we borrowed the config from it'. Notably,
CASTOR does not turn off SACK now. So, since our DPM pool nodes
*aren't* passing through our NAT, like our WNs are, I might turn SACK
back on and see if it breaks anything...
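
Before changing anything, it is worth recording what a pool node is actually running with. A minimal sketch that reads a few TCP settings from /proc on Linux; the sysctl names are standard kernel ones, but apart from tcp_sack I am only assuming they are relevant here, and the function names are mine.

```python
from pathlib import Path

# Standard Linux net.ipv4 sysctls. Only tcp_sack is known from the thread
# to be changed by DPM's tuning; the others are listed as an assumption.
SETTINGS = ("tcp_sack", "tcp_congestion_control", "tcp_rmem", "tcp_wmem")

def read_tcp_settings(root="/proc/sys/net/ipv4"):
    """Return {name: list of fields, or None if the file cannot be read}."""
    values = {}
    for name in SETTINGS:
        try:
            values[name] = (Path(root) / name).read_text().split()
        except OSError:
            values[name] = None
    return values

def sack_enabled(values):
    """True/False for tcp_sack, or None if it could not be read."""
    raw = values.get("tcp_sack")
    return None if not raw else raw[0] == "1"

settings = read_tcp_settings()
print("SACK enabled:", sack_enabled(settings))
for name, value in settings.items():
    print(f"net.ipv4.{name} = {value}")
```

Running this before and after the change gives a record to compare against if transfers start misbehaving.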
Sam