> -----Original Message-----
> From: GRIDPP2: Deployment and support of SRM and local storage management
> [mailto:[log in to unmask]] On Behalf Of Alessandra Forti
>
> Since Manc degraded with the upgrade we should look at the OS/DPM
> configuration.
Oxford is now EMI2 DPM on SL5 across the whole system, but we've
seen the problem in the past when we still had gLite 3.2 pool nodes
with an EMI1 head node. I think the only time we weren't aware of
the problem (which could equally well be because it didn't exist,
or because it did and we just didn't know) was when we had SL4 gLite 3.1
pool nodes.
We also have the 'fasterdata' TCP tweaks (I've copied the relevant bit
of our sysctl.conf below).
If we think this is related to the 'extra' streams, can we tweak the
BNL FTS transfers to just use one stream? I know that ought to be
less efficient under normal circumstances, but it might be interesting
to see what it does here.
Ewan
# From http://fasterdata.es.net/host-tuning/linux/
# increase TCP max buffer size settable using setsockopt()
# 16 MB with a few parallel streams is recommended for most 10G paths
# 32 MB might be needed for some very long end-to-end 10G or 40G paths
# Commented values are OS defaults, uncommented ones are the fasterdata ones.
# Defaults tested between 20121018 and 20121105 to no obvious effect - Ewan
net.core.rmem_max = 16777216
#net.core.rmem_max = 131071
net.core.wmem_max = 16777216
#net.core.wmem_max = 131071
# increase Linux autotuning TCP buffer limits
# min, default, and max number of bytes to use
# (only change the 3rd value, and make it 16 MB or more)
net.ipv4.tcp_rmem = 4096 87380 16777216
#net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 65536 16777216
#net.ipv4.tcp_wmem = 4096 16384 4194304
# recommended to increase this for 10G NICs
net.core.netdev_max_backlog = 30000
#net.core.netdev_max_backlog = 1000
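
For anyone wanting to try the same thing, a quick way to apply and sanity-check
these settings without a reboot (assuming they live in /etc/sysctl.conf as
above) is something like:

# Reload /etc/sysctl.conf, then print the values to confirm they took effect
sysctl -p
sysctl net.core.rmem_max net.core.wmem_max
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
sysctl net.core.netdev_max_backlog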