John, Henry, friends, Romans,
I'll just make a couple of points.
[sorry, I am a bit behind on tb-support and BTW I agree with Jeremy
that emails should say on the tin what they are]
It is of course true that anyone can upload files to an SE, but they
will always be tagged as belonging to a VO. Thus, if an SE is taken
off the Grid, the admins should (at least in principle) contact the VO
and say "Here is a list of your files, copy them off within 60 days or
lose them". If users have uploaded data and it's not in the VO's
catalogues, well, tough. The alternative is to trawl through the DNs
of the owners, and somehow magically figure out the email address, and
then contact them, and then wait for them to get back to you, and check
that you have found the owners of all the files and and and.
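As a rough illustration of the per-VO approach, here is a minimal Python
sketch. It assumes the SE can produce a list of (VO, path) pairs; that
record layout, the function name build_vo_notices and the example paths
are all made up for illustration and are not taken from any real SE.

```python
from collections import defaultdict
from datetime import date, timedelta


def build_vo_notices(files, shutdown_announced, grace_days=60):
    """Group file paths by owning VO and attach a copy-off deadline.

    `files` is an iterable of (vo, path) pairs. This layout is a
    hypothetical stand-in for whatever the SE actually records.
    """
    deadline = shutdown_announced + timedelta(days=grace_days)
    by_vo = defaultdict(list)
    for vo, path in files:
        by_vo[vo].append(path)
    # One notice per VO: the deadline plus a sorted list of its files.
    return {
        vo: {"deadline": deadline, "files": sorted(paths)}
        for vo, paths in by_vo.items()
    }


# Hypothetical example data.
example = [
    ("atlas", "/data/atlas/run1.root"),
    ("cms", "/data/cms/evts.dat"),
    ("atlas", "/data/atlas/run0.root"),
]
notices = build_vo_notices(example, date(2005, 10, 7))
```

Each VO gets one list and one deadline; files a user never registered in
the VO's catalogues still show up here because the grouping is by the VO
tag on the file, not by catalogue entry.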
DPM keeps the DN in the file metadata (Jiri just double checked for
me), but I don't think dCache does.
An SE being taken off should refuse to ingest more files; we talked
about this even in EDG but I don't think anyone ever built that.
Taking it out of the BDII will of course not work. Perhaps it can
advertise that it has 0 bytes available.
Now about filetypes. The type advertised in the schema is what you
get for files. If it says permanent, files become permanent. For SRM1,
that is.
In the SRM context, permanent means that the file does not expire. It
does not mean that the file is backed up, nor that the file will never
ever vanish from the Grid. The file can be deleted by the user. Or
by a sysadmin, obviously; sysadmins can do everything.
Volatile means that the file has a lifetime associated with it, and
the SE is allowed to delete the file if it feels like it, provided
that (1) the lifetime has expired, and (2) there aren't any pins on
the file. It may do so if it needs the space.
Durable is a compromise between the two. The file has a lifetime
associated with it, but the SE is not allowed to delete the file when
it expires. It must warn someone (supposedly a sysadmin), who is then
supposed to delete the file. Durable is an optional part of SRM2.
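The three file types can be summarised as a small decision function. This
is only a sketch of the rules as described above, not code from any SRM
implementation; the names FileType, StoredFile and se_action, and the
return strings, are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FileType(Enum):
    PERMANENT = "permanent"
    VOLATILE = "volatile"
    DURABLE = "durable"


@dataclass
class StoredFile:
    file_type: FileType
    expires_at: Optional[float] = None  # seconds since epoch; None = no lifetime
    pin_count: int = 0


def se_action(f, now):
    """Return what the SE itself may do: 'keep', 'may_delete' or 'warn_admin'."""
    if f.file_type is FileType.PERMANENT:
        return "keep"  # never expires (users and sysadmins can still delete it)
    expired = f.expires_at is not None and now >= f.expires_at
    if not expired:
        return "keep"
    if f.file_type is FileType.VOLATILE:
        # Expired and unpinned: the SE is allowed, not obliged, to delete.
        return "may_delete" if f.pin_count == 0 else "keep"
    # Durable: the SE must not delete by itself, only warn someone.
    return "warn_admin"
```

Note that 'may_delete' is a permission, not an obligation: the SE may
leave an expired volatile file alone until it actually needs the space.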
Note that all of this is distinct from the lifetime of the copy of the
file in the disk cache which may itself have a lifetime associated
with it, even for a permanent file.
In SRM2, it gets more complicated still, because *space* (where you
write files) also has lifetime associated with it, i.e. can be
permanent, volatile, durable. Volatile files can go into permanent
space, but not the other way round, IIRC.
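One way to picture the file-into-space rule is as an ordering on
retention strength. The sketch below is an assumption built on the IIRC
above: only "volatile file into permanent space is fine, permanent file
into volatile space is not" comes from the text; where durable sits in
the ordering, and the names STRENGTH and file_fits_space, are guesses
for illustration.

```python
# Retention strength, weakest first. Placing "durable" between the other
# two is an assumption, not something stated by the spec discussion above.
STRENGTH = {"volatile": 0, "durable": 1, "permanent": 2}


def file_fits_space(file_type, space_type):
    """True if a file of `file_type` may be written into `space_type` space,
    under the assumed rule that the space must be at least as strong."""
    return STRENGTH[file_type] <= STRENGTH[space_type]
```

Under this assumed rule, file_fits_space("volatile", "permanent") is true
and file_fits_space("permanent", "volatile") is false.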
And all types are optional, in SRM1 and 2. In SRM1, it is implementation
dependent whether the implementation supports volatile or permanent
or both. In SRM2, support for the space and file types is also
optional, although implementations are recommended to support
permanent (IIRC).
The semantics in SRM2 are even more complicated when space is released
with files in it, etc. Oh what fun we will have testing this.
So all of this is theory of course. Actual implementations have been
known to diverge from the spec...
Personally I'd recommend setting things to permanent, as the default
setting for non-adventurous sysadmins, and letting the SEs run out of
space when they run out of space. Fewer surprises for everyone
concerned.
There is nothing in the schema to imply a quality of service. If you
are storing files in the Tier 1 dCache that writes to tape, files will
be written to tape sooner or later. That's about as good as it gets.
As Tim says elsewhere, in the absence of a getFileMetadata that will
tell you whether the file is on tape, you have to do arcane stuff to
query the tape system to find out whether the file is really on tape.
--jens
Gordon, JC (John) wrote on 07 October 2005 00:00:
> Henry, if you insist on thinking that all possible functionality that is
> not explicitly excluded by rules that you know about MUST be provided,
> you will go mad. But since you seem to thrive on this here are some more
> comments,
>
> John
>
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes
>> [mailto:[log in to unmask]] On Behalf Of Henry Nebrensky
>> Sent: 06 October 2005 23:02
>> To: [log in to unmask]
>> Subject: Storage Types (was Re: [TB-SUPPORT] Reminder of the
>> next UKI meeting)
>>
>> On Thu, 6 Oct 2005, Gordon, JC (John) wrote:
>> >
>> > My point is that this needs to be a wider discussion. If LT2 ends up
>> > offering a service level that no-one wants it is a waste of many
>> > things including the discussion. We need to match experiment
>> > requirements with service provision. One might think this would have
>> > been done pre-MoU but obviously not down to this level.
>>
>> I think the service we provide does generally match what
>> experiments require from a Tier*2*
>
> That's a contentious statement since
>
> a) experiments want different things from a Tier2
>
> b) I don't believe they have fully expressed what they want.
>
>>
>> > I think you are reading too much into 'permanent'. You can always
>> > close down or rename an SE but you have to do it together with the VO.
>> > For the reasons you mention, they have to move or rename/recatalogue
>> > data.
>>
>> No, because anyone can globus-url-copy a file into an SE and
>> store the TURL in a catalog. It might be an LCG catalog or
>> the VO's catalog, or it could be their own in an Excel
>> spreadsheet. Or a Post-It(TM) note.
>> So you'd have to trawl years' worth of logs and contact each
>> individual uploader:(
>
> No! Not ANYONE. Your authorisation should only allow individuals access
> based on their membership of a VO. Thus any use they make or files they
> create are the business of the VO - see the AUP. If you tell a VO to
> remove their files then they have to contact all their members. More
> likely they will only maintain catalogues belonging to the VO, not the
> sum of all private ones belonging to members.
>
> I am sure your university library allows itself to scrap books and
> remove them from its catalogue even though you might have your own
> private list of their books that you have previously referenced.
>
>>
>> The point that in principle classic SEs would need to be
>> maintained indefinitely has certainly been made in ROLLOUT
>> (somewhere close to "why do all these Tier2 sites bother with
>> an SE, anyway").
>>
>> (In practice I think we're saved by so far having few
>> end-users doing strange and random things.)
>
> If you pay attention to everyone on ROLLOUT or expect to meet all
> requirements of strange and random end-users - see my first sentence:-)
>
>>
>>
>> Separately, for an SRM "volatile" etc. refers to the attributes of an
>> *individual file*. I think using the same vocabulary in things like
>> GlueSAType and GlueSAPolicyFileLifeTime which apply to entire storage
>> systems is probably a bad idea.
>
> I think you will find that an SRM puts volatile files into volatile
> storage space, etc, and it is these storage areas that the Glue Schema
> describes. I cannot find anything that states this in a spec but I
> remember the use cases from the designing. There is a possibility that
> the specs of SRM and Glue have diverged but they are supposed to have the
> same meanings.
>
> John
>