On 4 August 2010 15:54, Alessandra Forti <[log in to unmask]> wrote:
> Hi Jens,
>
> I think the sites that have experienced the SL5/XFS problem were all on
> SL5.3, but this needs further confirmation.
Not at all: Glasgow has been at SL5.5 for its SL5 disk servers since
before the problem.
(Michel was *probably* on SL5.3 when he saw his issues, though...)
> If this is true we should try to see what happens with more recent
> versions, as XFS kernel support was added in SL5.4 and some related
> bugs were corrected in SL5.5.
Ah, yes, I need to email the list about that...!
>
> ext4 allegedly supports filesystems of up to 1 EB, but nobody has used
> it at that scale because the tool to create such a big filesystem is
> currently limited to 16 TiB (~17.6 TB). On top of that, ext4 doesn't
> have full 64-bit support; this might not be a problem in itself, but
> together with the tool limits it tells me it's not yet a mature system.
> Michel seems happy with it (or at least they didn't experience any
> problems with ext4), however their hardware configuration is 10x2TB
> RAID6 with 14TB filesystems on the data servers, so he hasn't hit this
> filesystem size dilemma. In the UK we have all bought, or are about to
> buy, these 36-bay units with a minimum of 60TB usable space, and the
> 16 TiB filesystem size limit is annoying and shaves off usable space.
>
> There is an argument for keeping the filesystem size relatively small
> to give DPM some flexibility when distributing data, but I still have
> to understand the benefit of this when the filesystems are on the same
> machine, and even in that case I'd prefer to be able to choose the
> filesystem size.
>
Fundamentally, it's because DPM only does round-robin balancing of
transactions between filesystems.
So, if you have one 10TB disk server with 1 filesystem (=10TB) and one
30TB disk server with 1 filesystem (=30TB), then the first disk server
will fill up completely (on average) when the second is only 1/3 full.
In addition, this also means that if you have a mix of disk servers
with different I/O performance (and they all only have one fs on them),
then the more performant disk servers will be sitting happily idle
while the less performant ones are melting (which is generally not a
good use of resource capacity).
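To make the fill-rate effect concrete, here's a toy round-robin sketch
(Python, purely illustrative: the 1GB file size and strict alternation
are my assumptions, not actual DPM internals):

  # Toy model of DPM-style round-robin placement between two filesystems.
  # Assumption: fixed 1 GB files, strict alternation between filesystems.
  capacities_gb = [10000, 30000]     # one 10TB fs, one 30TB fs
  used_gb = [0, 0]

  turn = 0
  while used_gb[0] < capacities_gb[0]:
      fs = turn % 2
      if used_gb[fs] < capacities_gb[fs]:
          used_gb[fs] += 1           # write one 1GB file
      turn += 1

  print(f"fs0 full; fs1 at {used_gb[1] / capacities_gb[1]:.0%}")
  # prints: fs0 full; fs1 at 33%

i.e. the 10TB server is completely full while the 30TB one still has
two thirds of its space free, and from then on every new transfer
lands on the big server.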
Although, I do agree with you that the ext4 16 TiB limit (in fact,
IIRC, an ext* limit, since it's the e2fsprogs tools which are currently
limited to 32-bit block numbers here) is unfortunately restrictive. As
we also discussed after the "official" meeting finished, there does
seem to be a 64-bit version of e2fsprogs in existence, and we can test
this to see if it works for making suitably giant filesystems. (It
looks, from the linux filesystem mailing lists, as if it basically
works for filesystem creation, but there are various other e2fsprogs
tools that don't - online filesystem resize, for example.)
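Incidentally, the 16 TiB figure (also the one you quote above) falls
straight out of the 32-bit block numbers with the usual 4 KiB blocks -
trivial arithmetic, Python here just as a calculator:

  # ext3/ext4 with 4 KiB blocks and 32-bit block numbers:
  block_size = 4096             # bytes
  max_blocks = 2 ** 32          # 32-bit block addresses
  limit = block_size * max_blocks
  print(limit / 2 ** 40, "TiB")             # -> 16.0 TiB
  print(round(limit / 10 ** 12, 1), "TB")   # -> 17.6 TB

so either a 64-bit e2fsprogs or a larger block size would be needed,
and ext* block sizes are capped at the page size (4 KiB on x86) anyway.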
Sam
> At the moment, looking at what I wrote, and without any additional
> numbers that demonstrate a performance gain, I'm leaning towards
> installing XFS on SL5.5 and seeing if there is any improvement.
>
> cheers
> alessandra
>
>
> [log in to unmask] wrote:
>>
>> Oops, this should have gone to the list, not to me!
>>
>>
>> Minutes already uploaded! (helps that my 11 o'clock meeting was
>> cancelled) Once again a very useful and productive session, I thought.
>> Lots of good stuff.
>>
>> http://storage.esc.rl.ac.uk/weekly/20100804-minutes.txt
>>
>> New actions include me volunteering more experiments and Matt to work
>> with Sam on testing T2K at Lancaster.
>>
>> So IMHO, the agenda for storage at GridPP could look like this:
>> 16.00-16.45 Experiments presenting (including T2K, perhaps 10 mins each?)
>> 16.45-17.05 Sam reporting on AmDamJam and IC (with input from everyone
>> who went!)
>> 17.05-17.20 Me talking tasks, roadmap and stuff, we can discuss content.
>> 17.20-17.50 Discussion
>>
>> Cheers
>> --jens
>>
>
> --
> The most effective way to do it, is to do it. (Amelia Earhart)
> Northgrid Tier2 Technical Coordinator
> http://www.hep.manchester.ac.uk/computing/tier2
>