Hi,
We've done our usual test of 1-8 jobs per node on our 8-core nodes.
Efficiency/throughput plot available at the following URL:
http://hep.ph.liv.ac.uk/~jbland/rfio-atlas-throughput.png
This compares our tuned file stager results with this week's 4kB RFIO
results. The two main observations (for our site) are that RFIO is at
best only as good as file stager, at 7 or 8 jobs per node, and that at
lower job densities file stager gives much better efficiency. Job
throughput is always better with file stager, for a reason I haven't
pinned down yet.
Note that these jobs ran on only six 8-core nodes, up to a maximum of
48 jobs. Efficiency is moderate and fairly flat at higher job numbers.
Our storage was lightly loaded, with ~100MB/s bandwidth usage and ~10%
CPU usage (mostly IOWAIT).
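For anyone wanting to reproduce the numbers, here is a minimal sketch of
how the two quantities can be computed per node. It assumes "efficiency"
means CPU time divided by wall-clock time and "throughput" means jobs
completed per hour; the function names and the example figures are
hypothetical, not taken from our accounting scripts.

```python
def cpu_efficiency(cpu_seconds, wall_seconds):
    """Fraction of wall-clock time the job actually spent on the CPU.

    Assumed definition: CPU time / wall-clock time.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


def node_throughput(jobs_completed, elapsed_hours):
    """Jobs completed per hour on one node (assumed definition)."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return jobs_completed / elapsed_hours


# Hypothetical job: 30 minutes of CPU in 1 hour of wall-clock time.
print(cpu_efficiency(1800, 3600))   # 0.5
# Hypothetical node: 16 jobs finished in 4 hours.
print(node_throughput(16, 4))       # 4.0
```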
Once this data had been collected I opened the floodgates and let
another 230 jobs start up immediately on the older single-core systems
in our cluster.
The load on the storage skyrocketed from a total of about 8 (1 per
pool) to 170, with much more IOWAIT on the pools, and user CPU on the
original nodes dropped off rapidly. Overall efficiency was 44%, but this
is a mix of old slow nodes (which are less affected by network
congestion) and fast new nodes, so the effect is masked somewhat. A
rough estimate from the ganglia plots puts the efficiency at more like
25% on the 8-core nodes.
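As a rough cross-check of how much the mix masks the effect, the sketch
below backs out the efficiency implied for the older nodes. It assumes
the 44% figure is a plain job-count-weighted mean over 48 jobs on the
8-core nodes and 230 on the older nodes; that weighting is an assumption
on my part, and the function name is hypothetical.

```python
def implied_group_efficiency(overall_eff, n_total, known_eff, n_known):
    """Efficiency of the remaining jobs, given an overall mean.

    Assumes overall_eff is a job-count-weighted mean of two groups:
    n_known jobs at known_eff and the rest at the returned value.
    """
    n_other = n_total - n_known
    if n_other <= 0:
        raise ValueError("n_total must exceed n_known")
    return (overall_eff * n_total - known_eff * n_known) / n_other


# 44% overall across 48 + 230 jobs, ~25% on the 48 8-core job slots.
old_nodes = implied_group_efficiency(0.44, 48 + 230, 0.25, 48)
print(round(old_nodes, 2))  # 0.48
```

Under those assumptions the older nodes would be sitting at roughly 48%,
about twice the figure for the 8-core nodes.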
This was expected: as noted in the previous discussion on the analysis
methods, the RAID arrays are being hit with very high numbers of small,
random read accesses, far outstripping the capability of the disks. This
gives three possible regimes for small-buffer RFIO as job numbers
increase: low job numbers give the best efficiency, limited only by
network/RFIO latency; then efficiency gets worse when the RAID arrays
hit their IOPS limits; then it gets worse still as particular LAN links
saturate.
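The three regimes can be sketched as a toy model in which efficiency is
capped by whichever limit is tightest. This is only an illustration: the
function, its parameters and all the numbers in the usage lines are
hypothetical, and it assumes efficiency simply scales down by
supply/demand once a resource is oversubscribed.

```python
def small_buffer_rfio_efficiency(jobs_total, jobs_on_link, latency_eff,
                                 iops_per_job, array_iops,
                                 mb_s_per_job, link_mb_s):
    """Return (efficiency, binding_limit) for a toy three-regime model.

    latency_eff : efficiency when only network/RFIO latency matters
    array_iops  : random-read IOPS the RAID arrays can deliver in total
    link_mb_s   : capacity of one shared LAN link (e.g. a rack uplink)
    jobs_on_link: jobs sharing that particular link
    All values are hypothetical inputs, not measurements.
    """
    caps = {
        "latency": latency_eff,
        # Oversubscribed arrays: scale by IOPS supplied / IOPS demanded.
        "iops": latency_eff * array_iops / (jobs_total * iops_per_job),
        # Oversubscribed link: scale by bandwidth supplied / demanded.
        "link": latency_eff * link_mb_s / (jobs_on_link * mb_s_per_job),
    }
    limit = min(caps, key=caps.get)  # tightest cap wins
    return caps[limit], limit


# Few jobs: only latency matters.
print(small_buffer_rfio_efficiency(8, 8, 0.8, 100, 2000, 2, 120))
# Many jobs: the arrays run out of IOPS first.
print(small_buffer_rfio_efficiency(80, 8, 0.8, 100, 2000, 2, 120))
# Many jobs behind one link with a larger buffer: the link saturates.
print(small_buffer_rfio_efficiency(80, 40, 0.8, 100, 2000, 16, 120))
```

The link cap depends on the jobs behind one particular link rather than
the total, which is why it can bind later than, and independently of,
the IOPS cap.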
After a few hours I also tried upping the buffer from 4kB to 64kB and
later 128kB, but apart from an increase in bandwidth consumption there
seemed to be little effect on job efficiency until some rack links
started saturating.
Dropping the buffer size below 4kB may reduce the (already minimal)
bandwidth usage but it's likely to make no positive difference to the
efficiency unless there's something special that happens at 0kB.
Conclusion for us for the time being is that file stager gives better
throughput on our nodes and scales much better than RFIO.
John
--
Dr John Bland, Systems Administrator
Room 220, Oliver Lodge
Particle Physics Group, University of Liverpool
Tel : 0151 794 2911
"I canna change the laws of physics, Captain!"