Terry Allen writes:
> Ah, something like
>
> <dl><dt>Author</dt><dd>Albrecht Noth</dd> ?
>
> That would not be automatically processable except by a human, as there
> are many ways it could be done. Unless you want to specify a particular
> style of markup.
I've been suggesting something like that, but with a few extensions.
I hadn't written this up yet, hoping that someone else would be
interested enough to help. I'll give a rough description now.
The premise is that metadata is fairly indistinguishable from general
data, and we know how to represent the structure of general data.
HTML provides sufficient data structuring capability, and using HTML
in this way also provides a reasonable default way of presenting the
information to the user, in case the browser cannot deal with the data
in a more appropriate way.
A record structure can be represented with a DL definition list, where
the DTs are the field names and the DDs are the corresponding values.
In the most restricted use of DLs, each DT is strictly paired with one
DD. We could relax this: multiple DTs would mean these are alternative
names for the following value, and multiple DDs would mean this is an
unordered list of values for the same field. We could also use UL and
OL for values that are unordered or ordered lists.
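
As a rough sketch of how client software might read such a record, the
following uses Python's standard html.parser module. It assumes a single,
non-nested DL; the names DLRecordParser and parse_record, and the
alternative field name "Creator", are hypothetical and only for
illustration.

```python
from html.parser import HTMLParser


class DLRecordParser(HTMLParser):
    """Collect (names, values) groups from a single, non-nested DL."""

    def __init__(self):
        super().__init__()
        self.fields = []      # finished (names, values) groups
        self._names = []      # DT texts of the group being built
        self._values = []     # DD texts of the group being built
        self._current = None  # "dt", "dd", or None
        self._buf = []        # text pieces of the open DT or DD

    def _flush_text(self):
        text = "".join(self._buf).strip()
        self._buf = []
        if self._current == "dt":
            self._names.append(text)
        elif self._current == "dd":
            self._values.append(text)
        self._current = None

    def _flush_group(self):
        if self._names or self._values:
            self.fields.append((self._names, self._values))
        self._names, self._values = [], []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            # Also covers DT/DD elements whose end tags were omitted.
            self._flush_text()
            # A DT that follows one or more DDs starts a new field.
            if tag == "dt" and self._values:
                self._flush_group()
            self._current = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._flush_text()
        elif tag == "dl":
            self._flush_text()
            self._flush_group()

    def handle_data(self, data):
        if self._current:
            self._buf.append(data)


def parse_record(html):
    """Return a list of (alternative names, values) pairs."""
    parser = DLRecordParser()
    parser.feed(html)
    parser.close()
    return parser.fields


example = """
<DL>
<dt>Author</dt><dt>Creator</dt>
<dd>Albrecht Noth</dd>
<dt>Subject</dt>
<dd>URCs</dd>
<dd>metadata</dd>
</DL>
"""
fields = parse_record(example)
```

Here the two DTs are read as alternative names for one value, and the two
DDs under Subject as two values of the same field.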
Values could be other DL structures or lists, to provide arbitrarily
complex recursive data structures, but at some point we need primitive
values such as text strings or numbers. These primitive values can be
represented with regular text content. Another very useful kind of
value is a reference to another object located elsewhere. We
can support references using A anchors - the HREF provides the URI and
the content of the anchor provides a label for the user.
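
The mapping from general data to HTML can be sketched in the other
direction as well. The following is only an illustration under stated
assumptions: Ref, Unordered, and to_html are hypothetical names, a Python
dict stands in for a record, and a plain list is treated as ordered.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Ref:
    """A reference to an object located elsewhere."""
    uri: str
    label: str


class Unordered(list):
    """Marks a list whose order carries no meaning (rendered as UL)."""


def to_html(value):
    """Render a nested value using DL, UL, OL, and A elements."""
    if isinstance(value, Ref):
        # The HREF carries the URI; the content is a label for the user.
        return '<A HREF="%s">%s</A>' % (escape(value.uri), escape(value.label))
    if isinstance(value, dict):
        # A record: DTs are field names, DDs are the corresponding values.
        rows = "".join(
            "<dt>%s</dt><dd>%s</dd>" % (escape(name), to_html(item))
            for name, item in value.items()
        )
        return "<DL>%s</DL>" % rows
    if isinstance(value, list):
        tag = "UL" if isinstance(value, Unordered) else "OL"
        items = "".join("<LI>%s</LI>" % to_html(item) for item in value)
        return "<%s>%s</%s>" % (tag, items, tag)
    # Primitive values (text strings, numbers) become regular text content.
    return escape(str(value))


record = {
    "Title": "URC Simplification",
    "Author": Ref("http://union.ncsa.uiuc.edu/~liberte/", "Daniel LaLiberte"),
    "Subject": Unordered(["URCs", "metadata"]),
}
print(to_html(record))
```

Because values are rendered recursively, a record may contain lists,
references, or further records to any depth, and a browser that knows
nothing about the scheme still displays something readable.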
The generic data structuring described above allows us to build
different types of structures that may be presented in a way that is
understandable to humans, but how does client software know how to
interpret a particular structure? It needs to be self-describing to
some extent, so we need a way to identify the type of data that
follows.
I suggest we add a TYPE tag for this purpose, which precedes any value
that has a type that is not otherwise expected. The type is analogous
to the "Scheme" as used in the description of the Dublin Core. So the
whole DL structure should probably be preceded by a TYPE tag, which
will allow the client to know what fields to expect. Any value that
has a different type than expected should be preceded by another TYPE
tag. The TYPE tag would have an HREF attribute to reference a
description of the type, or perhaps we can define a new namespace for
types with an implicit URI prefix for locating the description. I
could also imagine defining types anonymously using the content of the
type tag, but that would require a </TYPE> to terminate the
definition. The description of the type, whether referenced by URI or
immediate, might be formal or informal. A formal type description
could be much like the data structure itself, with DLs to define
name-value pairs, but the values would simply have TYPE tags to define
the expected type of the value, and no value (or maybe a default
value).
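
The rule that a TYPE tag is needed only where a value's type differs from
the expected one can be sketched as follows. Everything named here is an
assumption for illustration: the example.org type identifiers, the
EXPECTED table (which supposes every field defaults to plain text), and
the render_field helper are all hypothetical and not part of any
specification.

```python
from html import escape

# Hypothetical type identifiers; no real ones are defined yet.
TEXT = "http://example.org/types/text"
MEDIA_TYPE = "http://example.org/types/media-type"

# A formal type description reduced to its essentials:
# field name -> type expected for that field's value.
EXPECTED = {"Title": TEXT, "Form": TEXT, "Identifier": TEXT}


def render_field(name, value, value_type, expected=EXPECTED):
    """Render one DT/DD pair, adding a TYPE tag only for unexpected types."""
    prefix = ""
    if expected.get(name) != value_type:
        # The value's type is not the expected one, so announce it.
        prefix = '<TYPE HREF="%s">' % escape(value_type)
    return "<dt>%s</dt>\n<dd>%s%s</dd>" % (escape(name), prefix, escape(value))


print(render_field("Title", "URC Simplification", TEXT))
print(render_field("Form", "text/html", MEDIA_TYPE))
```

The first call emits no TYPE tag because the value has the expected type;
the second emits one because the value's type differs from the expected
one.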
Here is an example using a DC type to describe the slides for my
presentation in the URC panel session at the WWW5 conference.
<TYPE
HREF="http://www.oclc.org:5046/oclc/research/conferences/metadata/dublin_core_report.html">
<DL>
<dt>Title</dt>
<dd>URC Simplification</dd>
<dt>Author</dt>
<dd><A HREF="http://union.ncsa.uiuc.edu/~liberte/">Daniel LaLiberte</A></dd>
<dt>Subject</dt>
<dd>URCs</dd>
<dd>metadata</dd>
<dt>Date</dt>
<dd>8 May 1996</dd>
<dt>ObjectType</dt>
<dd>Slides for panel presentation</dd>
<dt>Form</dt>
<dd><TYPE HREF="...IMT">text/html</dd>
<dt>Identifier</dt>
<dd><TYPE
HREF="...URI">http://union.ncsa.uiuc.edu/~liberte/www/URC-slides.html</dd>
<!-- The identifier could instead be an anchor on the title, but there might
be multiple identifiers -->
<dt>Relation</dt>
<dd> <DL>
<dt>In Panel Session</dt>
<dd><TYPE
HREF="...URI">http://union.ncsa.uiuc.edu/~liberte/www/URCpanel.html</dd>
</DL>
<!-- Another way of representing relations: -->
<A HREF="http://union.ncsa.uiuc.edu/~liberte/www/URCpanel.html">
URC Panel Session</A>
</dd>
</DL>
--
Daniel LaLiberte
National Center for Supercomputing Applications
http://union.ncsa.uiuc.edu/~liberte/