On Thu, 5 Sep 1996, Terry Allen wrote:
> I think the IETF is the place to talk
> about MIME types.
Yep, that's what ietf-types is for! I think what some of us are
objecting to is bringing metadata into a new IETF WG or lumbering it onto
the URN bods and slowing URNs down yet again (just as we've got regexp
sorted out!).
> But is it a MIME type for DC or WF that's wanted?
DCES is SGML encoded and so we can use text/sgml I guess. I'd say we
need to think about the relationship semantics for the containers and
packages in WF, which is related (no pun intended) to multipart/related.
> The theme of the conference was "identifying and resolving impediments to
> deployment of a Dublin Core style record for resource description," which
> I applaud.
>
> The result was the specification of "a container architecture for aggregating
> metadata objects for interchange."
A result. Warwick Framework was only one outcome from the Warwick meeting
and in some ways the least important, at least in the short term. Other
results were the SGML concrete representation for DCES, the need to embed
DCES in HTML 2.0, the need to think about some standards for sub-element
names and values and the need to document/promote what we've got so far.
Certainly the embedding of DCES in HTML 2.0 is needed badly now and I
think we've more or less reached consensus on that (hopefully!). WF is a
more long term thing in my mind.
> I have not been able to discover in what
> way this differs from the more homely "packaging A, B, and C, and saying
> that they're all somehow about X." I think that's too vague to be useful.
One impediment to the original Dublin Core was that it was itself a bit
vague, more or less by its very nature. WF lets us concentrate on
defining the existing 13 DC elements and what we expect to see in
them (and how to encode it!) by removing the need to consider extending
the set to cover other types of metadata. This other metadata can go
into other packages in a WF structure. It also gives us a way of
thinking about how we can use existing techniques such as MIME and SGML
to transport the packages and containers. Just like the original DC,
packages and containers are an abstract concept that we can choose to
make concrete in a number of encodings. MIME and SGML are just two
possibilities; there are likely to be many others.
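
To make the abstract idea a bit more tangible, here is a rough sketch of
containers and packages as plain data structures, before any MIME or SGML
encoding is chosen. The class names, the "kind" labels and the payload
strings are all made up for illustration; nothing here is an agreed WF
definition.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Package:
    """One typed lump of metadata, e.g. a Dublin Core record."""
    kind: str     # hypothetical label such as "dublin-core" or "usmarc"
    payload: str  # the metadata itself, in whatever encoding the package uses


@dataclass
class Container:
    """An aggregate of packages; a container may also hold other containers."""
    members: List[Union["Package", "Container"]] = field(default_factory=list)

    def kinds(self) -> List[str]:
        """List the kind of every package, descending into nested containers."""
        found = []
        for member in self.members:
            if isinstance(member, Container):
                found.extend(member.kinds())
            else:
                found.append(member.kind)
        return found


# Invented example payloads; only the structure matters here.
dc = Package("dublin-core", "title: An example resource")
marc = Package("usmarc", "245 10 $a An example resource")
record = Container([dc, Container([marc])])
print(record.kinds())
```

The DC package stays small and well defined, and anything else simply goes
into a sibling package or a nested container.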
> We already possess many architectures for aggregating objects; the WF
> notion of containers that contain packages, but that may have meta attached
> to them so that they would in turn have to be contained in containers,
> is no advance on any of them,
No, it's not supposed to be an advance on MIME or SGML or whatever. WF is
in essence an abstract concept in the same way that DC is. The fact
that MIME and SGML can encode the WF concept in a concrete way shows that
we're on the right track. WF is more intended in my mind to prevent the
DC exploding with lots of new elements and also to grandfather in
existing metadata alongside DC. A nice side effect is that a concrete
implementation of the WF concept means we _can_ transport both resources
and their metadata with the relations between them shown.
That's handy I think.
> and explicitly permits infinite regress
> without practical advantage.
Seeing as MIME can encode WF's abstract concepts and MIME seems to have a
number of practical advantages (otherwise why would so many people use
it?) then I'd say that allowing an arbitrary level of nesting is a good
thing. I still remember crappy BASIC interpreters that limited nesting
levels and that sucked big time. Let's not make our metadata
transportation suck, eh?
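
As a quick illustration that MIME itself puts no fixed limit on nesting,
here is a small sketch using Python's standard email package. The helper
names nest() and max_depth() and the placeholder body text are invented
for this example; the depth of 4 is arbitrary.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def nest(depth: int) -> MIMEMultipart:
    """Build a multipart/related container nested `depth` levels deep."""
    container = MIMEMultipart("related")
    container.attach(MIMEText(f"package at level {depth}", "plain"))
    if depth > 1:
        container.attach(nest(depth - 1))
    return container


def max_depth(part) -> int:
    """Count how many multipart levels sit at or below `part`."""
    if not part.is_multipart():
        return 0
    return 1 + max(max_depth(child) for child in part.get_payload())


outer = nest(4)
# Serialise to text and parse it back, as a receiving system would.
parsed = email.message_from_string(outer.as_string())
print(max_depth(parsed))
```

The nesting survives the trip through the wire format, so a container
inside a container costs nothing extra in the encoding.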
> It's enough for me that the information
> is collocated by some process my system understands (e.g., MIME's
> multipart/related); that process doesn't need any WF containment syntax.
If it's using MIME multipart/related, it is a WF encoding. By emphasising
WF we're showing people that relating metadata to the resources and
transporting the whole lot over the network is something that's doable.
That's different to just sending USMARC records on a tape or using an
HTTP META method to just get the metadata.
> But totally aside from regress and the question of whether specifying
> a containment architecture is useful (I think of this as the "wrong stick"),
> the WF makes no advance on the issue of the semantics of the "packages"
> (thus the "wrong end").
Yep, relationship semantics are missing. Now that most of us have agreed upon
the container/package abstract concept and some toy concrete
representations have tested the water, the time has come to tackle this.
> I really don't care what wrapper the bibliographic data comes in (my
> application can either use it or it can't), but I need to have the content
> labelled as bibliographic data, as distinct from discursive literary reviews
> or ISO 9000 evaluations or interpretive essays (such as may be included
> in finding aids). I want to know that this multipart/related stuff,
> or that tar file, contains one kind of thing or another.
Yep.
> I don't need the trivial formalism of containers and packages;
The formalism took, ooh, about an hour to come up with and I think it's
useful because some people dealing with metadata _don't_ know about
things like MIME. You and I might not need it, but others might.
> I need
> refinement of the semantics "these are somehow about X," so that when
> I search for "stuff about X" I can extract from the results "bibliographic
> data" for one process (e.g., locating and retrieving the information so as to
> show my professor that I actually found it) and "interpretive essays" for
> another (e.g., plagiarizing them for my term paper).
Exactly, that's the next stage. But we've been rather bogged down with
other things in meta2 (concrete representations of WF, embedding DCES in
HTML 2.0, defining the subelement names and values, etc, etc). So what
should we have for the relationships between the packages and
containers? Any offers?
> The issue is how to attach these semantics to or encapsulate them within
> existing and deployed container mechanisms; the MIME implementation
> sketched for the WF (though a nice piece of work in itself) ignores this
> issue entirely, so far as I can see.
I was basing it on existing MIME where the relationships are pretty vague.
We definitely need a way of saying what the relationship semantics are for
the multipart/related. I thought maybe a Relationship: header but Ron
came up with the idea of having a package at the start that laid out what
the relationships are between the other packages in a multipart/related
container. Shamelessly stealing a teeny bit of one of Ron's emails to me
on this, we were thinking of a package that contained stuff like:
    <package w/ ID foo> is-bibliographic-info-for <package bar>
    <package huh> is-critical-review-of <package bar>
    <package bar> is-target-resource
    <package baz> is-revision-history-of <package bar>
    <package gleep> is-revision-history-of <whole metadata thing, which
        unfortunately includes package gleep>
Alter syntax to taste (no format wars please folks!); the important thing
is that it gets the semantics across. I'll let Ron elaborate if he wants
to... :-)
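
For what it's worth, the package-at-the-start idea can be sketched with
Python's standard email package. Treat everything specific here as an
assumption made for illustration: the one-relation-per-line syntax, the use
of Content-ID to name packages, the function names and the placeholder
bodies. None of it is an agreed format.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical syntax: "<subject> relation <object>", one per line.
RELATIONS = """\
<foo> is-bibliographic-info-for <bar>
<huh> is-critical-review-of <bar>
<bar> is-target-resource
<baz> is-revision-history-of <bar>
"""

# Placeholder package bodies, keyed by an invented package ID.
PACKAGES = {
    "foo": "bibliographic record goes here",
    "huh": "critical review goes here",
    "bar": "the resource itself goes here",
    "baz": "revision history goes here",
}


def build_container() -> MIMEMultipart:
    """Build multipart/related with the relationship package first."""
    container = MIMEMultipart("related")
    container.attach(MIMEText(RELATIONS, "plain"))
    for package_id, body in PACKAGES.items():
        part = MIMEText(body, "plain")
        part["Content-ID"] = f"<{package_id}>"
        container.attach(part)
    return container


def parse_relations(text):
    """Turn relationship lines into (subject, relation, object) triples."""
    triples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            subject, relation, target = fields
            triples.append((subject.strip("<>"), relation, target.strip("<>")))
        elif len(fields) == 2:
            # A relation with no object, e.g. is-target-resource.
            triples.append((fields[0].strip("<>"), fields[1], None))
    return triples


def packages_with_relation(container, relation, target):
    """Return the packages that stand in `relation` to package `target`."""
    parts = container.get_payload()
    triples = parse_relations(parts[0].get_payload())
    wanted = {s for s, r, o in triples if r == relation and o == target}
    return [p for p in parts[1:] if p["Content-ID"].strip("<>") in wanted]


reviews = packages_with_relation(build_container(), "is-critical-review-of", "bar")
print([p["Content-ID"] for p in reviews])
```

A client that only wants the bibliographic data for bar would ask for
is-bibliographic-info-for instead and never need to open the other packages.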
> What am I missing? Or, how can we attach semantics about content or
> relatedness to such MIME types as multipart/*?
Any other suggestions welcome.
Tatty bye,
Jim'll
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Jon "Jim'll" Knight, Researcher, Sysop and General Dogsbody, Dept. Computer
Studies, Loughborough University of Technology, Leics., ENGLAND. LE11 3TU.
* I've found I now dream in Perl. More worryingly, I enjoy those dreams. *