Hi Matt, hi Sam,
Just chipping in some comments as well.
On 26/06/18 16:49, Matt Doidge wrote:
> Hi Sam,
>
> Here are my thoughts - it's a bit of a brain-dump so please excuse the
> mess. I'm pretty sure I've spouted a few of these points at people
> before.
>
> "
> I would like a future where Tier-2 storage isn't so specialised for
> grid users. At places like Lancaster, with a shared compute cluster,
> it seems odd to have storage that we cannot share due to the bar for
> entry being too high - in almost all cases it wouldn't be worth a
> non-grid user's time to figure out how to access and use the grid
> storage.
There are solutions that can cater to both (e.g. Bristol case), but that
usually means having two separate paths, one for grid and one for local
users, which makes the setup a bit harder to maintain.
>
> There are steps (dare I say strides) in the right direction, thanks to
> the widespread uptake of webdav. If we could streamline authentication
> we'd be well away.
>
> In my mind the ideal storage solution would be something you could
> just mount on your compute. If I had to build an
> SE from scratch then I would seriously consider some kind of
> "known-standard" clustered filesystem with a middleware shim over the
> top for external access.
>
> That said, there are ${protocol}-fs plugins that might make a
> "posix-like" experience feasible for "regular" SEs, and encouragement
> and optimisation of these should be supported (so far I've only tried
> the davix tools for this with my SE, and they worked quite well, if
> slowly).
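Just to make the "posix-like over WebDAV" point concrete: under the hood, a directory listing is a depth-1 PROPFIND returning a 207 Multistatus body. A minimal sketch of parsing one in Python (the endpoint path here is made up; the davix tools wrap this same exchange):

```python
# Parse a WebDAV PROPFIND "207 Multistatus" response and list the hrefs.
# This is the exchange a ${protocol}-fs plugin performs for a directory
# listing; the sample body below is illustrative, not from a real SE.
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # the WebDAV XML namespace, per RFC 4918

def parse_propfind(body):
    """Return the hrefs (entry paths) in a Multistatus response body."""
    root = ET.fromstring(body)
    return [r.findtext(DAV + "href") for r in root.iter(DAV + "response")]

sample = (
    '<?xml version="1.0"?>'
    '<d:multistatus xmlns:d="DAV:">'
    '<d:response><d:href>/dpm/example/file1</d:href></d:response>'
    '</d:multistatus>'
)
print(parse_propfind(sample))  # ['/dpm/example/file1']
```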
>
> One trouble when you're running a multi-petabyte SE is that you are
> painted into a corner: migrating to a different SE solution is too
> onerous a task for a Tier-2 of any size (due to the lack of resources
> to provide "spare" disk and the smaller amount of effort available).
> Any tools that could make a migration to new backends easier would be
> appreciated, or even considered necessary in the event of a loss of
> support.
> "
This is a very good point. SEs are currently "beasts" when it comes to
the baggage they bring.
Take a DMLite SE (+ HDFS plugin) as an example: the SE still keeps its
own notion of namespace and users on top of HDFS.
In practice I see no reason for this, as the HDFS namespace is fast to
query; all that would be needed is an authentication layer and a base
path on HDFS to achieve the same thing.
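For instance, querying the HDFS namespace directly is a single WebHDFS REST call (LISTSTATUS [1]); a small Python sketch, with a hypothetical namenode host and VO base path:

```python
# Build a WebHDFS URL and parse a LISTSTATUS reply, showing that the
# HDFS namespace is directly queryable over HTTP with no extra SE
# database. Host and path below are hypothetical.
import json
from urllib.parse import urlencode

def webhdfs_url(namenode, path, op, **params):
    """Build a WebHDFS REST URL, e.g. for op=LISTSTATUS."""
    query = urlencode(dict(params, op=op))
    return "http://%s/webhdfs/v1%s?%s" % (namenode, path, query)

def parse_liststatus(body):
    """Extract entry names from a WebHDFS LISTSTATUS JSON response."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [s["pathSuffix"] for s in statuses]

url = webhdfs_url("namenode.example:50070", "/gridpp/atlas", "LISTSTATUS")
# Fetching that URL would return JSON shaped like this sample:
sample = ('{"FileStatuses":{"FileStatus":'
          '[{"pathSuffix":"data.root","type":"FILE","length":42}]}}')
print(parse_liststatus(sample))  # ['data.root']
```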
Having a lot of state gives rise to two problems:
- bugs can cause the file systems to diverge, i.e. the SE's database
no longer matches what is actually on the FS
- switching to a different solution is near impossible (unless the
new solution maps permissions easily and interfaces well with the FS)
I would really love to see a light SE that just does authentication and
forwards the rest to the underlying FS.
If the FS provides HTTP access (e.g. WebHDFS [1]), then the SE could be
as little as a web proxy forwarding (authenticated) requests.
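To illustrate the idea (a sketch only; tokens, hosts and paths are made up), the proxy's whole job reduces to: authenticate, map the logical path onto a per-VO base path in HDFS, and forward to WebHDFS:

```python
# Sketch of the "light SE as authenticating proxy" idea: check a
# credential, prepend the VO's base path, and build the backend
# WebHDFS URL to forward the request to. All names are hypothetical.
from urllib.parse import quote

VALID_TOKENS = {"atlas-prod-token": "/gridpp/atlas"}  # token -> HDFS base path

def route_request(token, logical_path, namenode="namenode.example:50070"):
    """Return the backend WebHDFS URL for an authenticated request,
    or None if the credential is unknown (request rejected)."""
    base = VALID_TOKENS.get(token)
    if base is None:
        return None
    hdfs_path = base.rstrip("/") + "/" + logical_path.lstrip("/")
    return "http://%s/webhdfs/v1%s?op=OPEN" % (namenode, quote(hdfs_path))
```

Everything else (quotas, replication, namespace) stays with the FS, which is the point: there is no second database to drift out of sync.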
Cheers,
Luke
[1]
https://hadoop.apache.org/docs/r2.4.1/hadoop-hdfs-httpfs/index.html
https://blogs.oracle.com/datawarehousing/data-loading-into-hdfs-part1
>
> I'm happy to clean up or clarify any of these thoughts, but I figured
> it better to get it out of the door rather than spend too long
> polishing stuff and never sending anything.
>
> Cheers,
> Matt
>
> On 20/06/18 14:47, Sam Skipsey wrote:
>> That would require me to have written them, Jens :)
>>
>> I will write up some positions this week!
>>
>> On Wed, Jun 20, 2018 at 2:12 PM Jensen, Jens (STFC,RAL,SC)
>> <[log in to unmask] <mailto:[log in to unmask]>> wrote:
>>
>> Thanks, Sam. Is there a means for people to preview your slides,
>> so they
>> can disagree with them? :-)
>>
>>
>> On 20/06/2018 10:58, Sam Skipsey wrote:
>> > Hello everyone in GridPP Storage,
>> >
>> > I was going to bring this up in the Storage Group meeting
>> today, but
>> > as it was cancelled...
>> >
>> > The UK Storage Group talk I have for CHEP 2018 is "Caching
>> > technologies for Tier-2 sites: a UK perspective."
>> >
>> > I have some perspectives on this - and I'm going to be using a
>> slide
>> > to trail Teng's work at ECDF on internal Xrootd Proxy Caches for
>> ATLAS
>> > work - but I definitely want to reflect the UK Tier-2
>> perspectives as
>> > a whole.
>> > So, I'd appreciate any thoughts you have on this topic, and I
>> promise
>> > they'll be reflected [especially if they disagree with my
>> perspective!]
>> >
>> > Sam
>> >
>> >
>> ------------------------------------------------------------------------
>> >
>> > To unsubscribe from the GRIDPP-STORAGE list, click the following
>> link:
>> >
>> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=GRIDPP-STORAGE&A=1
>> >
>>