On 5 November 2013 14:51, D.Traynor <[log in to unmask]> wrote:
> Ben makes a good point. Horses for courses! There is rarely a
> one-size-fits-all solution.
>
Certainly, but distributed filesystems have, in general, become more
"general purpose" over time.
> This is how we have it set up at QM: NFS for the home filesystem and
> some other shared directories, Lustre for the data.
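>
> For anyone curious about the client side, it's just two fstab
> entries, roughly like the following (hostnames and the Lustre fsname
> are invented here, and mount options vary by site):
>
>   nfs1:/export/home    /home   nfs     defaults,hard,intr  0 0
>   mgs1@tcp0:/lustrefs  /data   lustre  defaults,_netdev    0 0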
>
Although, of course, there you're implicitly admitting that you'd not
use a distributed filesystem for home directories (while Mark would
like to do exactly that...).
> Of course multiple metadata servers mean more cost!
>
Sure, but not that much more - hopefully you'll still have far more
storage servers than metadata servers (otherwise you're not in a
regime where it would be sensible to have a distributed storage system
in the first place...).
Sam
> dan
>
> On 05/11/13 14:29, Ben Waugh wrote:
>> On 05/11/13 14:22, Sam Skipsey wrote:
>>> On 5 November 2013 14:15, Ben Waugh <[log in to unmask]> wrote:
>>>> I have only used Lustre as a user, not installed it myself, but
>>>> unless it has changed drastically in the past year or two, it is
>>>> not really suitable as a general-purpose filesystem. It is
>>>> optimised for parallel access to large data files, and is REALLY
>>>> sluggish if you are accessing large numbers of small files, e.g.
>>>> compiling and linking large applications or even listing
>>>> directories.
>>>
>>> Yes, but that's true of the majority of distributed parallel
>>> filesystems (GPFS has the same issue).
>>
>> If it is true of the majority, are there some that don't have this
>> issue? These systems on the whole are optimised for high-performance
>> access, while for home directories and the like it would be nice to have
>> something that allows flexible provisioning, transparent to users, with
>> "good enough" performance for less demanding workloads.
>>
>>> You can improve on this by distributing the metadata more (Ceph does
>>> have the advantage here, as it's designed to have distributed
>>> metadata, but Lustre 2.x also supports having "metadata clusters" to
>>> improve performance - the FAQ claims that this can provide very good
>>> performance for metadata operations on large directories).
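>>>
>>> (For the curious: with DNE in Lustre 2.4+ you can spread the
>>> namespace across MDTs by hand - a sketch, with an invented path and
>>> MDT index, so check the manual first:
>>>
>>>   lfs mkdir -i 1 /lustre/home/newuser   # create dir on MDT index 1
>>>
>>> Ceph's MDS cluster rebalances the namespace automatically instead.)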
>>
>> Cheers,
>> Ben
>>
>>>
>>> Sam
>>>
>>>>
>>>> UCL's Legion cluster still uses it as a shared workspace for parallel
>>>> jobs,
>>>> but when they tried using it for home directories it was a major
>>>> cause of
>>>> downtime as the metadata (I think) kept getting corrupted.
>>>>
>>>> Cheers
>>>> Ben
>>>>
>>>>
>>>> On 05/11/13 14:07, D.Traynor wrote:
>>>>>
>>>>> Lustre is used by QM and Sussex, but also by many central
>>>>> university HPC sites (it might be an idea to check what your
>>>>> central HPC (or similar) service uses).
>>>>>
>>>>> GPFS from IBM needs server and client licenses (not cheap). The
>>>>> other HTC cluster at QM uses it, but I don't think they are too
>>>>> impressed by the IBM hardware.
>>>>>
>>>>> Lustre (and GPFS) need a metadata server, which adds to the cost.
>>>>>
>>>>> GlusterFS looks fairly simple to set up (http://www.gluster.org/),
>>>>> offers near-POSIX compliance and, I think, can even be mounted as
>>>>> an NFS volume.
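>>>>>
>>>>> Something along these lines, I believe (server and volume names
>>>>> invented, untested):
>>>>>
>>>>>   gluster volume create homevol replica 2 gl1:/brick1 gl2:/brick1
>>>>>   gluster volume start homevol
>>>>>   # native client:
>>>>>   mount -t glusterfs gl1:/homevol /home
>>>>>   # or via the built-in NFSv3 server:
>>>>>   mount -t nfs -o vers=3 gl1:/homevol /home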
>>>>>
>>>>> dan
>>>>>
>>>>> On 05/11/13 13:52, Mark Slater wrote:
>>>>>>
>>>>>> Hi Sam,
>>>>>>
>>>>>> Hmmm.... I think 3x redundancy is a bit much - we're not rolling in
>>>>>> cash
>>>>>> :) To be honest, as long as there's an equivalent of the
>>>>>> 'dpm-drain' command, so I can manually pull out storage when it's
>>>>>> starting to give notice, that would be fine. I was thinking of
>>>>>> Lustre as I'd heard good things
>>>>>> that would be fine. I was thinking of Lustre as I'd heard good things
>>>>>> about it but I have very little experience so wanted to check - though
>>>>>> if other people are using it and it's supported (to some degree
>>>>>> anyway!)
>>>>>> then I'm happy to go with that if people think it's worth a go :)
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> On 05/11/13 13:45, Sam Skipsey wrote:
>>>>>>>
>>>>>>> Except that the POSIX bit of Ceph is considerably less polished than
>>>>>>> the rest of it, at the moment.
>>>>>>>
>>>>>>> Things to rule out: AFS.
>>>>>>>
>>>>>>> Things that might work: the usual suspects, in general (Lustre is
>>>>>>> fine, although it doesn't automatically redistribute storage
>>>>>>> across nodes at present, so removing a volume is harder than
>>>>>>> adding one - but anything that *does* make removal of storage
>>>>>>> trivial also replicates / adds redundant blocks via parity, so
>>>>>>> you actually "lose space" to replicas etc).
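>>>>>>>
>>>>>>> (The manual Lustre "drain" is roughly: deactivate the OST so no
>>>>>>> new files land on it, then migrate what's already there. A sketch
>>>>>>> with an invented fsname and index - check the manual before
>>>>>>> trying it:
>>>>>>>
>>>>>>>   lctl conf_param lustrefs-OST0003.osc.active=0
>>>>>>>   lfs find --obd lustrefs-OST0003_UUID /mnt/lustre | lfs_migrate -y
>>>>>>> )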
>>>>>>>
>>>>>>> So, on that note: do you have money to overprovision storage beyond
>>>>>>> the bare needs (that is, can you afford to buy 3 times as much
>>>>>>> storage
>>>>>>> as you need, so you can run an HDFS system with the standard 3
>>>>>>> replicas of each block)?
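>>>>>>>
>>>>>>> (That factor is just dfs.replication in hdfs-site.xml - the
>>>>>>> default is 3, hence the 3x raw-capacity arithmetic:
>>>>>>>
>>>>>>>   <property>
>>>>>>>     <name>dfs.replication</name>
>>>>>>>     <value>3</value>
>>>>>>>   </property>
>>>>>>> )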
>>>>>>>
>>>>>>> Sam
>>>>>>>
>>>>>>>
>>>>>>> On 5 November 2013 13:08, james Adams <[log in to unmask]> wrote:
>>>>>>>>
>>>>>>>> CephFS
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Dr Ben Waugh Tel. +44 (0)20 7679 7223
>>>> Computing and IT Manager Internal: 37223
>>>> Dept of Physics and Astronomy
>>>> University College London
>>>> London WC1E 6BT
>>>
>>
>
>
> --
> * Dr Daniel Traynor, Grid system administrator
> * Physics, QMUL, London, Tel +44(0)2078826560