Dear All
The mail below may be of direct or indirect interest to you. If you have
any specific comments you would like passed back please let me know - or
feel free to enter the debate directly!
Cheers,
Jeremy
-----Original Message-----
From: Jeff Templon [mailto:[log in to unmask]]
Sent: 02 December 2005 18:56
To: Fabio HERNANDEZ
Cc: project-lcg-gdb (LCG - Grid Deployment Board);
[log in to unmask]; [log in to unmask]; Girard Pierre; Markus
Schulz; Peter Kunszt
Subject: Re: Writing files to disk / tape
Hi Fabio,
[ ps Markus in CC since I hear he might be the right guy to alert the
FTS people about all this ... ]
I think this sounds good. The only thing I would like to make clear is
that I would like at least something to happen FAST, more or less NOW,
in order to get us started, and then give us some breathing room to come
up with a more complete solution. To be clear, what I want to happen
more or less now is this:
a) sites publish proper info for SEs according to agreed definitions
b) lcg-xx (and FTS equivs) allow the --requires mechanism
This I would like NOW, in order to assist experiments and not make it
necessary for them to find workarounds that will embed themselves deep
in their code and cause them and us grief later.
So as long as you think that your time scale is compatible with that,
it's OK with me. I was even contemplating making a special trip to CERN
before Christmas if others are available to work on this.
To give more information, here is what I am thinking about, a bit more
fleshed out, in response to a question I got on it last night. Note the
clear split between short-term solutions and longer-term solutions.
==============================
The question was "what do we do for the following three types of SE?"
> (i) MSS - access for both production & general users
this would be in the info system with GlueSEArchitecture=tape
one would write to it via something like
lcg-cr --requires "GlueSEArchitecture==tape && GlueSEUniqueID ~ /nikhef.nl$/" [ args ]
which means execute the lcg-cr command on a nikhef.nl SE that has a tape
backend.
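
To make the intended semantics concrete, here is a minimal Python sketch
of how a tool might evaluate such a --requires string against published
SE records. The function name matches(), the dictionary layout and the
se.example.org entry are made up for illustration; only the == and ~
operators joined by && are handled, and this is not a description of how
lcg-cr is actually implemented.

```python
import re


def matches(se, requires):
    """Return True if the SE record satisfies every '&&'-joined clause.

    se is a dict of published attribute name -> string value (assumed layout).
    """
    for clause in requires.split("&&"):
        m = re.match(r"\s*(\w+)\s*(==|~)\s*(.+?)\s*$", clause)
        if m is None:
            raise ValueError("cannot parse clause: %r" % clause)
        attr, op, val = m.groups()
        actual = se.get(attr, "")
        if op == "==":
            ok = actual == val
        else:
            # '~' is treated as a regex search; the pattern is written /.../
            ok = re.search(val.strip("/"), actual) is not None
        if not ok:
            return False
    return True


# Hypothetical info-system contents, for illustration only.
ses = [
    {"GlueSEUniqueID": "srm1.nikhef.nl", "GlueSEArchitecture": "disk"},
    {"GlueSEUniqueID": "srm2.nikhef.nl", "GlueSEArchitecture": "tape"},
    {"GlueSEUniqueID": "se.example.org", "GlueSEArchitecture": "tape"},
]
req = "GlueSEArchitecture==tape && GlueSEUniqueID ~ /nikhef.nl$/"
selected = [s["GlueSEUniqueID"] for s in ses if matches(s, req)]
```

With these made-up records only the tape-backed nikhef.nl SE is
selected; the disk SE and the SE at the other site are filtered out.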
> (ii) disk-type SE with access of just production team
this would be in the info system with GlueSEArchitecture=disk
one would write to it via something like
lcg-cr --requires "GlueSEArchitecture==disk && GlueSEUniqueID ~ /nikhef.nl$/" [ args ]
which means execute the lcg-cr command on a nikhef.nl SE that has a disk
backend. I don't know, in the short term, how to satisfy the
requirement about production managers in a neat way, but in the medium
term one would use the GlueSAAccessControlBaseRule with a VOMS
prod-manager access control rule.
> (iii) general user disk-type SE - with the ability of quotas :-)
This would go along the same lines as (ii) for the short term. For the
longer term, the VOMS rule for the Storage Area (GlueSA) would match the
entire VO. Quota handling should be done by negotiation via the
transfer tools: a connection is initiated and, once the VOMS cert is
sent, the SE can check the quota for the corresponding group/role/user
and tell the tool whether to start the transfer or to return an
"insufficient space" error. Again, this tool-based negotiation might
have to wait for the longer term.
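
The quota step of that negotiation could look roughly like this on the
SE side. This is only a sketch: the names check_quota and
InsufficientSpace, the quota table keyed by a VOMS group/role string,
and the use of bytes as the unit are all assumptions, not an existing
SRM or FTS interface.

```python
class InsufficientSpace(Exception):
    """Raised when a transfer would exceed the caller's quota."""


def check_quota(quotas, used, identity, file_size):
    """Decide whether a transfer may start for one VOMS group/role/user.

    quotas and used map an identity string to a number of bytes
    (hypothetical bookkeeping kept by the SE).
    Returns the space left after the transfer, or raises InsufficientSpace.
    """
    if identity not in quotas:
        raise KeyError("no quota defined for %s" % identity)
    remaining = quotas[identity] - used.get(identity, 0)
    if file_size > remaining:
        raise InsufficientSpace(
            "%s needs %d bytes, only %d left" % (identity, file_size, remaining))
    return remaining - file_size
```

The transfer tool would call something like this right after the VOMS
cert has been presented, and only start moving data if no error comes
back.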
It's not a perfect solution but it has the potential to put us on the
right path quickly and as well to start actually doing something with
the Glue info, to enable the Glue team (Laurence, Stephen & co) to
figure out whether what is now defined is actually the right stuff or
not.
Oh yeah, I almost forgot, if you want to make sure that the file can
stay pinned on disk for some amount of time, you can put a requirement
on the MaxPinDuration field; this is a hack because it's not really
meant for that, but at least it's better than emailing sysadmins and
asking them to create special directories everywhere that your team will
have to remember. I suspect we really need a "defaultPinDuration" field
that means unless the software has specified to the SE that you need a
different pin duration (impossible right now with the current tools if I
am right) then you get the default one.
Also I see that there is a Quota field in the GlueSAPolicy block, so this
can be used to publish the quotas. You could even use the info system to
make sure you can write by saying
GlueSAPolicyQuota > GlueSEStateUsedSpace + size(file) &&
GlueSEStateAvailableSpace > size(file)
in the requirements. This is just for the short term, this logic and
checking should be built into the file transfer tools ASAF.
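
Spelled out as code, the check in that requirement is just the
following. A minimal Python sketch; the function name can_write is made
up, and it assumes the quota, used space, available space and file size
are all expressed in the same unit.

```python
def can_write(quota, used, available, file_size):
    """Mirror the requirement above: the file must fit within the
    published quota and within the space currently available."""
    return quota > used + file_size and available > file_size
```

For example, with a quota of 100, 50 already used and 80 available, a
file of size 40 passes both conditions; with 70 already used it fails
the quota condition.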
ps glue schema current defs:
http://infnforge.cnaf.infn.it/glueinfomodel/uploads/Spec/GLUEInforModel_1_2_lastcall.pdf
Fabio HERNANDEZ wrote:
> Jeff,
>
> you address an issue that is extremely important and I would like to
> comment.
>
> During the last LCG/EGEE/OSG operations workshop in Culham, it was
> agreed to identify metrics for several services, in particular the
> storage element backed by dCache. Pierre Girard (in copy) took this
> action and has been working on it with our local dCache expert.
>
> They realised that dCache would be a good candidate for identifying the
> key elements of a complex storage element and from this experience
> abstract what information such a component should publish through the
> site information system and how the tools interacting with the storage
> elements could use that information. I think this work may be of
> interest to all sites running MSS behind the storage elements, and
> not only those using dCache.
>
> Your proposal is strongly related to Pierre's current work. I would
> propose that a group of experts exchange ideas on this subject and come
> up with a proposal that could be discussed in the GDB. I see Laurence,
> Pierre, Jeff and Sophie as the core of this group. Once the initial
> proposal is ready, experiments could comment on the suitability for
> their intended way of using the storage elements.
>
> Does this sound reasonable to you all?
>
> My 2 cents.
>
>
> Jeff Templon wrote:
>
>> Hi *,
>>
>> We have been kicking this issue around here today -- the one about how
>> to do writing to disk / tape etc 'right'. We decided that we probably
>> have almost all the tools we need NOW ... we just need agreement on
>> what it all means and a couple small adaptations to software. If we
>> can get that,
>>
>> - the sites will know how to set things up
>> - the software will be able to use the information
>> - the experiments will be able to do what they want without major hacks
>>
>> As far as we can tell, the experiments are interested in the following
>> things:
>>
>> 1. being able to tell the system that a piece of data should or should
>> not eventually be on tape
>>
>> 2. being able to tell the system that a piece of data put onto disk
>> needs to be pinned for a certain amount of time
>>
>> [ if we have missed something let us know!! ]
>>
>> The information system (this is why Laurence Field is in CC) is the
>> place to publish information about the capabilities of storage
>> systems. We should not be doing things like having
>>
>> srm-disk.nikhef.nl
>> and
>> srm-tape.nikhef.nl
>>
>> but having srm1.nikhef.nl publish the Glue info that indicates there
>> is no MSS behind it, and srm2.nikhef.nl publish that there IS tape
>> behind it. Furthermore there are all sorts of Glue info about things
>> like default pinning times, minimum file size (important for tape!!),
>> etc.
>>
>> If we can identify a few crucial pieces of information for SEs and
>> give clear, meaningful definitions to these, and this is followed by
>> rollout onto the prod system, we lack only one thing (and this is why
>> Sophie is in CC): the software. I think someone should think
>> carefully about the long term solution, but in the short term I think
>> this will do it:
>>
>> lcg-xx --requires "attr op val" [ args ]
>>
>> for example to make sure that a file goes to a disk-type SE and not
>> tape-backed, I imagine that we would do this:
>>
>> lcg-cr --requires "GlueSEArchitecture == disk" [ args ]
>>
>> note I am not sure that SEArch is the right thing, but this is part of
>> the work of the Glue task forces that Laurence is heavily involved
>> with. It would help his team enormously to have some real practical
>> use cases with which to help refine the Glue definitions.
>>
>> I think we could get this all together in a matter of a month if the
>> priority is high enough. And I think it is ... otherwise the hacks
>> will continue to proliferate.
>>
>> What do others think??
>>
>> JT
>
>
>