On Thu, 27 Apr 2006 11:22:30 +0100
Kostas Georgiou wrote:
> On Wed, Apr 26, 2006 at 09:16:18PM +0100, Greig A Cowan wrote:
>
> > > Performance would have been a lot better if dcache didn't use 10K
> > > writes; something like 256K, for example, is probably enough to
> > > keep the writes non-random.
> > >
> > > http://savannah.cern.ch/bugs/?func=detailitem&item_id=10132
> >
> > I realise that you've had an issue with the 10K writes for quite a
> > while now. No one else (outside of GridPP) appears to have flagged
> > it as a problem.
>
> I hope that by now everyone here agrees that parallel streams are
> slower, has anyone asked the question WHY this is the case? Maybe
> nobody else did so they aren't going to complain about the 10K block
> writes...
>
> Cheers,
> Kostas

I agree that this is a problem and I should raise it as an issue for
D-cache, but mature products like D-cache just comply with the specs
given to them, and since GSIFTP compatibility was what was asked for,
that's what we got. The merits of using FTP over HTTP are still a
mystery to me; there are many, many small issues that, to me, add up
to a compelling case. This one is an example of an implementation
error, and I have gone over it repeatedly. Many admins say GSIFTP
optimised for single file transfers works, job done, so why should we
change? The problem with changing these things is that there is no
single killer reason to abandon what was regarded as the standard line.

The false logic in this case is that parallel transfers of a single
file are faster, and therefore all files should be transferred with
parallel streams per file. This is clearly a leap of faith, with no
modelling or thought put into how things really work.
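
A toy model of the point about small writes and parallel streams, as a
rough sketch only. It assumes the file is split into equal contiguous
regions, one per stream, and that blocks arrive round-robin across the
streams; neither assumption is taken from D-cache or GSIFTP, and the
function name count_seeks is made up for illustration. It counts how
many writes do not start where the previous write ended:

```python
def count_seeks(file_size, block_size, streams):
    """Count writes that do not start where the previous write ended.

    Toy model (assumptions, not D-cache behaviour): each stream owns one
    contiguous region of the file, and blocks arrive round-robin.
    """
    region = file_size // streams
    # Next offset each stream will write to, and where its region ends.
    offsets = [i * region for i in range(streams)]
    ends = [(i + 1) * region for i in range(streams)]
    ends[-1] = file_size  # the last stream also takes any remainder

    seeks = 0
    position = 0  # file offset where the previous write ended
    active = True
    while active:
        active = False
        for i in range(streams):  # round-robin arrival order (assumed)
            if offsets[i] >= ends[i]:
                continue  # this stream has finished its region
            active = True
            if offsets[i] != position:
                seeks += 1  # non-contiguous write
            length = min(block_size, ends[i] - offsets[i])
            offsets[i] += length
            position = offsets[i]
    return seeks


TEN_MIB = 10 * 1024 * 1024
print(count_seeks(TEN_MIB, 10 * 1024, 1))   # single stream, 10K writes
print(count_seeks(TEN_MIB, 10 * 1024, 4))   # four streams, 10K writes
print(count_seeks(TEN_MIB, 256 * 1024, 4))  # four streams, 256K writes
```

In this model a single stream never writes out of order, while with
four streams almost every write is non-contiguous, so the number of
jumps scales with the number of blocks. Larger blocks reduce that
count; they do not remove it. Real disks, filesystem caches and
networks are of course more complicated than this.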

Other poor assumptions existed too, including the idea that using
multiple ports makes things go faster, without anyone mentioning that a
port is just a TCP/IP concept and has no basis in hardware. This is no
longer an established "truth", though that did take a lot of lobbying.
I can't help wondering how such a story ever reached so many of senior
management so consistently.

I hope you can present your argument to people like Peter Clark and
others at the top of GridPP management, and to members of experiment
boards, as they are the people inputting the requirements for the SC4
meeting at Fermi very soon. We could keep this false consensus going
too long unless you make it clear to them that the fastest way to
transfer files, irrespective of protocol, is a single stream per file
when multiple files are to be transferred. They all know my opinion,
but at the moment they don't know that anyone else agrees that file
transfers should be single stream. Consensus needs to be established
not only among tech people; management also needs to know that tech
people have reached a consensus. I can't help here, as they already
know my position.

Regards
Owen