Hi Olivier,
> I would like to understand what was the outcome of the tests you did
> with Mona concerning the fts/dcache problem. I remember that the last
> test was to set fts to use srm rather than gsiftp third party copy. Was
> the iowait ok at that time ?
Correct, we configured the STAR-IC FTS channel to use srmCopy rather than
3rd party urlcopy. We observed a few things during the tests:
1. With FTS in urlcopy mode, a 50GB transfer from Edinburgh to IC gave a
rate of 138Mb/s. In srmCopy mode the rate was 188Mb/s, an increase of
roughly 36%.
2. The inter-node traffic was significantly reduced when using srmCopy
compared to urlcopy. This is because the data is transferred directly to
the pool that it will be stored on, rather than being routed to a pool via
a gridFTP door.
3. Your disk servers were still showing iowait. Mona took some Ganglia
screenshots during the transfers and I have attached them to this mail.
* ED-dCache-DPM-IC-dCache-urlcopy.jpg
Up to ~1740 an Ed dCache to IC dCache urlcopy transfer was taking place.
After ~1740, the source SRM was the Ed DPM. You can see that the iowait
was lower when DPM was the source, but in both cases the inter-node
traffic was high (the Out/blue line in the network plot).
* ED-dCache-IC-dCache-srmcopy.jpg
After ~1550 we were running the 50GB FTS transfer in srmCopy mode. You can
see that iowait is still present (although slightly lower than in the
previous case), but the inter-node traffic is now negligible.
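As a rough sanity check on the rates quoted in point 1, the short Python
sketch below works out the relative gain and the implied wall-clock time
for the 50GB transfer. It assumes that "Mb/s" means megabits per second
and that 50GB means 50 * 10**9 bytes; the function and constant names are
purely illustrative and are not part of FTS or dCache.

```python
# Back-of-envelope check of the two measured FTS rates.
# Assumptions (not stated in the measurements themselves):
#   - "Mb/s" is megabits per second (10**6 bits/s)
#   - "50GB" is 50 * 10**9 bytes

URLCOPY_MBPS = 138.0   # measured rate in urlcopy mode
SRMCOPY_MBPS = 188.0   # measured rate in srmCopy mode
TRANSFER_BYTES = 50 * 10**9


def transfer_seconds(size_bytes, rate_mbps):
    """Time in seconds to move size_bytes at rate_mbps megabits/s."""
    return (size_bytes * 8) / (rate_mbps * 10**6)


def relative_gain(old_rate, new_rate):
    """Fractional increase of new_rate over old_rate."""
    return (new_rate - old_rate) / old_rate


gain = relative_gain(URLCOPY_MBPS, SRMCOPY_MBPS)
t_urlcopy = transfer_seconds(TRANSFER_BYTES, URLCOPY_MBPS)
t_srmcopy = transfer_seconds(TRANSFER_BYTES, SRMCOPY_MBPS)

print(f"rate gain:      {gain:.1%}")
print(f"urlcopy time:   {t_urlcopy / 60:.1f} min")
print(f"srmCopy time:   {t_srmcopy / 60:.1f} min")
```

Under those assumptions the 50GB transfer takes a little over 48 minutes
in urlcopy mode and about 35 minutes in srmCopy mode.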
> I have understood that cms is starting its sc4 tests soon and will have
> to finish it by 19/june. Imperial will take part in that activity and
> since they are going to use phedex+fts I would like to definitively solve
> the known performance issue.
The FTS server currently deployed at RAL is version 1.4, which has a bug
in one of the channel agents: a channel set to use srmCopy leaks file
descriptors. This bug has prevented us from running all of the FTS
channels with dCache endpoints in srmCopy mode. Matt Hodges is planning to
upgrade the RAL FTS to v1.5 on Monday, which will fix this bug.
Once we know that the new server can operate at the levels we have seen
during the past few months, I will try running some more transfers in
srmCopy mode. We may be able to schedule these tests to coincide with (at
least part of) IC's participation in the CMS SC4 work.
Cheers,
Greig
On Thu, 1 Jun 2006, Olivier van der Aa wrote:
> Hi Greig,
>
>
>
> Cheers, Olivier.
>
--
=======================================================================
Dr Greig A Cowan http://www.ph.ed.ac.uk/~gcowan1
School of Physics, University of Edinburgh, James Clerk Maxwell Building
TIER-2 STORAGE SUPPORT PAGES: http://wiki.gridpp.ac.uk/wiki/Grid_Storage
=======================================================================