Ewan MacMahon wrote:
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes,
>> On Behalf Of Ewan MacMahon said:
>>> The question is what to do about it, and so far we seem to have:
>>> - Don't run many of these jobs on one node,
>>> - SSDs (ha; as if),
>>> - Go back to direct rfio access to the SEs, but get it right
>>> this time.
>> More than one disk per WN?
>>
> Possible, but probably not going to happen. Retrofitting anything is a
> pain in the neck, it costs money, a lot of WNs don't have many bays
> (the Twins only have two, for example), it wouldn't help all that much.
> A pair would be better than one, but only a bit, and even assuming that
> you could fit more than that in a node you start getting towards needing
> raid cards, and at that point you're back to SSD money.
Is a USB stick big enough to get performance gains?
How much more expensive than SSD is RAM these days?
>
>> xrootd?
>>
> Very possible, but someone needs to pick up what Greig was doing before
> he left. For most of the UK an xrootd enabled DPM would probably be a
> fairly simple upgrade.
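Just to make concrete what direct access would look like from the job's
side, here's a rough sketch using the XRootD Python bindings - the
hostname, port and path are made up, and it assumes the bindings are
available on the WN; it's only meant to show a byte-range read straight
off the SE rather than a whole-file copy:

# Illustrative only: read a 1 MB chunk straight from an SE over xrootd,
# rather than staging the whole file to local disk first.
# Host, port and path below are invented; assumes the XRootD Python bindings.
from XRootD import client
from XRootD.client.flags import OpenFlags

url = "root://se01.example.ac.uk:1094//dpm/example.ac.uk/home/atlas/data/file.root"

f = client.File()
status, _ = f.open(url, OpenFlags.READ)
if not status.ok:
    raise RuntimeError("open failed: %s" % status.message)

# Read only the byte range we actually need - the point of direct access
# is that a sparse read pattern never touches the WN's local disk at all.
status, data = f.read(offset=0, size=1024 * 1024)
f.close()
print("read %d bytes" % len(data))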
>
>> lustre et al?
>>
> I suspect that a well-configured lustre will be a good thing,
Indeed - and upgrading to lustre 1.8.1 seems to have solved the problems
we encountered with 1.8.0.1.
> but I don't
> see a way of migrating the existing DPM sites to that in time to make
> much difference; at least not for the upcoming run.
And in any case, this just shifts the issue to the disk servers.
Seeks take time on an HDD, so large sequential (block) IO is always going
to perform better than lots of small scattered reads. Though maybe having
some SSD cache on the disk servers could be advantageous.
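Rough numbers (assumed typical figures, not measured on our kit) to put
the seek argument in context:

# Back-of-envelope: why many concurrent random readers kill a single HDD.
# Assumed figures for a typical 7200 rpm SATA drive:
SEEK_MS = 10.0          # average seek + rotational latency, milliseconds
STREAM_MBS = 100.0      # sustained sequential throughput, MB/s

def effective_throughput(read_kb):
    """Effective MB/s when every read of read_kb kB costs one seek."""
    transfer_s = (read_kb / 1024.0) / STREAM_MBS
    total_s = SEEK_MS / 1000.0 + transfer_s
    return (read_kb / 1024.0) / total_s

for kb in (64, 256, 1024, 8192):
    print("%5d kB reads -> ~%5.1f MB/s per spindle" % (kb, effective_throughput(kb)))

# 64 kB reads come out at only a few MB/s per spindle, while 8 MB reads
# get most of the streaming rate back - which is the block IO vs seeks
# point above.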
>
> We were discussing this briefly at the storage meeting on Wednesday, and
> I think one upshot was that it would be good to make an effort to get
> all the DPM sites configured to use very small RFIO buffers, then
> run a hammercloud test using direct access and see what happens - AIUI a
> lot of the previous attempts were run using larger buffers (including
> the default) which chewed up excessive network bandwidth.
Duncan has done this for RHUL - http://londongrid.blogspot.com/ - where
he gets better performance using RFIO than he does with copy to the
worker node.
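On the buffer size point, a toy model shows why the big read-ahead
buffers hurt - the read pattern and buffer sizes below are invented, and
the real RFIO behaviour is more subtle, but the gist is that each
application read drags at least a full buffer's worth of data across the
network:

# Toy model of why large read-ahead buffers chew network bandwidth with
# sparse (direct) access. Read pattern and buffer sizes are invented.
import random

random.seed(1)
# Pretend the job does 5000 scattered reads of ~20 kB each out of a 2 GB file
reads = [(random.randrange(0, 2 * 1024**3), 20 * 1024) for _ in range(5000)]
useful = sum(size for _, size in reads)

for buf in (4 * 1024, 64 * 1024, 1024 * 1024, 4 * 1024 * 1024):
    # Assume each application read pulls at least one full buffer off the SE
    fetched = sum(max(size, buf) for _, size in reads)
    print("buffer %5d kB: %6.0f MB over the network for %4.0f MB of useful data"
          % (buf // 1024, fetched / 1e6, useful / 1e6))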
Having said all this, my feeling at the moment is that if data can be
copied to tmpfs (where it is strongly hinted to stay in RAM), then that is
likely to lead to significant performance increases. The only question
is how much RAM is needed for this to be an effective strategy.
How set in stone are the data file sizes?
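For what it's worth, a minimal sketch of how a job wrapper might do the
copy to tmpfs, only staging the file if there's comfortably enough RAM
free - the paths and the headroom figure are invented for illustration:

# Minimal sketch of the copy-to-tmpfs idea: stage the input file into
# /dev/shm (normally a tmpfs mount sized at half of physical RAM) if,
# and only if, there is comfortably enough space left.
import os
import shutil

def stage_to_tmpfs(src, tmpdir="/dev/shm/jobscratch", headroom=2 * 1024**3):
    size = os.path.getsize(src)
    st = os.statvfs("/dev/shm")
    free = st.f_bavail * st.f_frsize
    if free < size + headroom:
        return src   # not enough RAM to spare - read the file where it is
    if not os.path.isdir(tmpdir):
        os.makedirs(tmpdir)
    dst = os.path.join(tmpdir, os.path.basename(src))
    shutil.copy(src, dst)
    return dst

path = stage_to_tmpfs("/scratch/atlas/input/AOD.012345._000123.pool.root")
print("reading from %s" % path)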
Chris