Hi Chris,
I followed the recommendations on this page
http://www.cyberciti.biz/faq/linux-tcp-tuning/
and doubled net.core.rmem_max and wmem_max, as I read they might
conflict with tcp_rmem/wmem if they are smaller (I suppose I could have
used the same values suggested on that page).
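
For reference, the change in /etc/sysctl.conf was along these lines
(values approximate, assuming the maxima started at 12MB):

   # net.core.*mem_max should not be smaller than the tcp_rmem/wmem max
   net.core.rmem_max = 25165824
   net.core.wmem_max = 25165824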
That said, I don't think the differences in these values can explain why
transfers in one direction are so much poorer than in the opposite
direction most of the time. Just to put things in perspective, my
3.5-year-old laptop has 3MB max rmem/wmem values.
> I note that QMUL gets poor transfers to that host too - at least
according to the sonar tests.
Which host are you referring to? I tested 3 T1s with different
characteristics to see if I could extrapolate any information from the
differences. The things I noticed are below:
1) the sender window size is always 12 bytes in the direction with the
worst transfer rates. Why? Is there something wrong with window scaling?
Maybe this is a red herring and I didn't capture all the required
packets, but it's odd that it always happens in one direction and never
the other (see the tcpdump sketch after point 2).
2) the window size grows to the MB range in only one case, copying back
from TW. Again, why not in any of the other cases? What is the
configuration on the other side?
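
For what it's worth, window scaling is only negotiated in the
SYN/SYN-ACK handshake, so one way to check it is to capture just the
SYNs and look for the wscale option (interface and port are placeholders
here, 2811 being the gridftp control port; the data connections use
other ports):

   # capture only SYN packets and check for "wscale" in the options
   tcpdump -n -i eth0 "tcp[tcpflags] & tcp-syn != 0 and port 2811"

If wscale is missing from either end's SYN, scaling is off for the whole
connection.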
Unfortunately the machines chosen by SRM are random, and this makes it
difficult to do systematic automatic monitoring with tcpdump. I tried to
use gridftp to make it more predictable, but it has its own mapping and
permission problems when writing directly to a disk server.
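
The sort of thing I tried was along these lines (host and path are
placeholders):

   # -vb prints the transfer rate as it goes; -p 4 uses 4 parallel streams
   globus-url-copy -vb -p 4 file:///tmp/1GB.test \
       gsiftp://<disk-server>:2811/some/path/1GB.test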
cheers
alessandra
On 26/02/11 16:37, Christopher J. Walker wrote:
> On 26/02/11 13:20, Alessandra Forti wrote:
>> Hi Chris,
>>
>> I posted them a couple of days ago but here they are
>>
>> net.ipv4.tcp_rmem = 10240 87380 12582912
>> net.ipv4.tcp_wmem = 10240 87380 12582912
>
>
> http://fasterdata.es.net/fasterdata/host-tuning/linux/
> says:
>
> # increase Linux autotuning TCP buffer limits
> # min, default, and max number of bytes to use
> # (only change the 3rd value, and make it 16 MB or more)
> net.ipv4.tcp_rmem = 4096 87380 16777216
> net.ipv4.tcp_wmem = 4096 65536 16777216
>
> So your minimum is larger than recommended - though I suspect that
> won't make much difference. Your maximum is 12MB, rather than 16MB -
> so again, it isn't clear it should make much of a difference.
>
>
>> net.core.rmem_default = 12582912
>> net.core.wmem_default = 12582912
>>
>> net.ipv4.tcp_congestion_control = bic
>>
>
> Which is what QMUL was using until this morning.
>
> Fasterdata.es.net says "For long fast paths, we highly recommend using
> cubic or htcp." but "NOTE: There seem to be bugs in both bic and
> cubic for a number of versions of the 2.6.18 kernel used by Redhat
> Enterprise Linux 5.3 - 5.5 and its variants (Centos, Scientific Linux,
> etc.) We recommend using htcp with a 2.6.18.x kernel to be safe."
>
> AIUI the difference is how they deal with packet loss.
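>
> Switching is easy enough to experiment with, e.g. something like this
> (assuming the tcp_htcp module is available for the kernel in question):
>
>    # see what the kernel offers, then switch
>    sysctl net.ipv4.tcp_available_congestion_control
>    modprobe tcp_htcp
>    sysctl -w net.ipv4.tcp_congestion_control=htcp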
>
> I note that QMUL gets poor transfers to that host too - at least
> according to the sonar tests. Is it worth trying a host that we get
> good transfers to and you don't?
>
> Chris
>
>
>
>>
>> On 26/02/11 12:13, Christopher J. Walker wrote:
>>> On 26/02/11 11:28, Alessandra Forti wrote:
>>>> Correction: in the inbound test TW does go up to 9-10MB. I have to
>>>> try with a bigger file.
>>>>
>>>> cheers
>>>> alessandra
>>>>
>>>> On 26/02/11 11:15, Alessandra Forti wrote:
>>>>> Hi,
>>>>>
>>>>> based on the observation that Manchester is much better at receiving
>>>>> data than serving it, I've done some simple transfer tests back and
>>>>> forth with 3 T1s using lcg-cp, chosen for their different
>>>>> characteristics:
>>>>>
>>>>> *TW-FTT:* extremely slow, <100KB/s; I had to use a smaller file to
>>>>> complete the tests. Coming back, the rate shoots up to 600KB/s with
>>>>> peaks of 2.2MB/s; it would be interesting to try a bigger file in
>>>>> this direction.
>>>>> *RAL:* average oscillating between 5MB/s and 11MB/s in both
>>>>> directions.
>>>>> *IN2P3-CC:* slow sending data there, 1.1MB/s, but extremely fast
>>>>> getting it back, 100MB/s.
>>>>>
>>>>> I ran tcpdump during the transfers and used tcptrace to analyse the
>>>>> output. I've put the tcptrace output here
>>>>> http://ks.tier2.hep.manchester.ac.uk/T2/sonar and a quick view of the
>>>>> results below:
>>>>>
>>>>> I can probably be more systematic using a gridftp transfer tool,
>>>>> avoiding the SRM negotiation, which seems to vary in length and is
>>>>> also unpredictable as to which data server it returns.
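>>>>>
>>>>> For the record, the capture and analysis were roughly as follows
>>>>> (interface and remote host are placeholders):
>>>>>
>>>>>    tcpdump -i eth0 -s 128 -w transfer.pcap host <remote-se>
>>>>>    tcptrace -l transfer.pcap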
>>>>>
>>>>> My questions are:
>>>>>
>>>>> * Why does the second column always show 12 bytes in the outbound
>>>>> tests? Shouldn't the windows be resized appropriately in both
>>>>> directions, as happens when I copy the data back?
>>>>>
>>>>> * The windows are not going above a few tens of KB. Without window
>>>>> scaling TCP windows cannot go above 64KB; the window scale option was
>>>>> added to keep up with Gb/s links, and TCP windows can now grow up to
>>>>> 1GB. That doesn't seem to be happening here, despite the MB values
>>>>> put in sysctl.conf.
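>>>>>
>>>>> (The raw TCP window field is 16 bits, i.e. 64KB; the window scale
>>>>> option multiplies it by up to 2^14, giving 65535 x 2^14 ~ 1GB as
>>>>> the ceiling.)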
>>>>
>>>
>>> What settings are actually in use at the moment?
>>>
>>> Specifically, can you run:
>>>
>>> sysctl net.core.rmem_max
>>> sysctl net.core.wmem_max
>>> sysctl net.ipv4.tcp_rmem
>>> sysctl net.ipv4.tcp_wmem
>>>
>>> and also
>>> sysctl net.ipv4.tcp_congestion_control
>>>
>>> on your gridftp server.
>>>
>>> Chris
>>>
>>
>