Hi,
Just running some tests on our new storage after seeing very heavy load
during some transfers.
I think the main problem is that, on our MegaRAID controllers, write
operations can starve out reads.
Running concurrent write operations is fine: individual throughput is
lower, but total throughput isn't much less than for a single thread.
The same holds for concurrent reads.
But when heavy write and read operations run concurrently, the writes
run at nearly normal speed while the reads slow to a crawl (orders of
magnitude slower).
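
For anyone wanting to reproduce this, here is a minimal sketch of the
kind of mixed read/write test described above. It is not the exact test
I ran: the function names, the 1 MiB block size and the file paths are
all placeholders, and the file being read needs to be much larger than
RAM (or otherwise uncached) for the read figure to mean anything.

```python
import os
import threading
import time

BLOCK = 1024 * 1024  # 1 MiB per I/O call (assumed value, tune as needed)
MIB = 1024 * 1024


def write_file(path, total_bytes, block=BLOCK):
    """Write total_bytes sequentially, fsync, and return MiB/s."""
    buf = b"\0" * block
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(block, total_bytes - written)
            f.write(buf[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # push dirty pages out to the device
    elapsed = max(time.monotonic() - start, 1e-9)
    return written / MIB / elapsed


def read_file(path, block=BLOCK):
    """Read a file sequentially and return MiB/s."""
    total = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        # Best effort: ask the kernel to drop cached pages for this file
        # so the read hits the device. Only available on some platforms.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            total += len(chunk)
    elapsed = max(time.monotonic() - start, 1e-9)
    return total / MIB / elapsed


def run_mixed(read_path, write_path, write_bytes):
    """Run one reader and one writer at the same time; return both rates."""
    results = {}

    def reader():
        results["read"] = read_file(read_path)

    def writer():
        results["write"] = write_file(write_path, write_bytes)

    threads = [threading.Thread(target=reader),
               threading.Thread(target=writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Running read_file() and write_file() on their own first gives the
baseline figures, and run_mixed() then shows how far each one drops
when the two compete for the same device.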
This read starvation happens regardless of which I/O scheduler or
scheduler settings I use, but only on the RAID controller. If I run the
same tests on a local disk attached directly to the motherboard, reads
are affected but still run at a reasonable throughput.
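
When comparing schedulers it helps to confirm which one is actually
active on the device under test. A small sketch, assuming the usual
Linux sysfs layout where /sys/block/<device>/queue/scheduler lists the
available schedulers with the active one in square brackets; the helper
names and the device name "sda" are only examples:

```python
def parse_scheduler(text):
    """Return (active, available) from a queue/scheduler file's contents."""
    active = None
    available = []
    for name in text.split():
        # The active scheduler is shown in brackets, e.g. "[cfq]".
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read the scheduler setting for a block device such as "sda"."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return parse_scheduler(f.read())
```

For example, read_scheduler("sda") could be called before each test run
and the result logged alongside the throughput figures.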
The only way to stop the starvation appears to be disabling the
write-back cache on the controller, but that hurts write performance
terribly.
Has anyone else seen behaviour like this, or does anyone have a fix for
it? We noticed it recently because our new storage was being filled with
tens of TB of ATLAS data, causing far higher load than expected. Under
normal operations the writes are more spread out, so it's less
noticeable.
We have the same controller on some VM storage, where writing tens of GB
at a time is pretty normal and blocking a load of VM images isn't good.
Cheers,
John
PS: This sounds very similar to the issue seen in this thread:
http://www.spinics.net/lists/target-devel/msg03885.html
--
John Bland
Research Fellow                 office: 220
High Energy Physics Division    tel (int): 42911
Oliver Lodge Laboratory         tel (ext): +44 (0)151 794 2911
University of Liverpool         http://www.liv.ac.uk/physics/hep/
"I canna change the laws of physics, Captain!"