Eric,
Thanks again for the informative responses. I've extracted one key
passage of your text below to try to refocus the discussion on what I
see as the key issue: the usefulness of the applications profile
proposal put forth by Rachel [1].
Rachel proposes a rather simple abstraction (and my love of simple is
well known [2]!) that permits communities to define "record formats"
that mix metadata elements from various namespaces. As Jane Hunter
pointed out, one technology, XML Schema, is well suited to doing this.
The examples given in Rachel's paper are all oriented towards a very
flat structure, e.g. records where elements from various namespaces
co-exist, as in the following fictional description of a book:
<dc:creator>Carl Lagoze</dc:creator>
<dc:title>Metadata and me</dc:title>
<foo:pageCount>10</foo:pageCount>
<bar:font>timesRoman</bar:font>
My point has been that the jigsaw puzzle characteristics (just snap
together the various pieces) are a nice ideal but will not be possible
with intermixing of elements from several metadata vocabularies. For
example, one vocabulary may have an element that is "similar" to a DC
element but has greater structure, or there may be elements in some
vocabularies whose semantics overlap multiple DC elements, etc.
The result will be metadata records in which various semantic entities
are repeated or even obscured.
You have demonstrated, and I agree, that tools exist for manipulating
and interpreting this more complicated intermixing of metadata. However,
and this is the fundamental question from my point of view, what is the
tradeoff in terms of interoperability?
A major goal ultimately of all of our metadata work in DCMI, I think, is
to promote some level of cross domain, cross application
interoperability. Restricting the discussion to resource discovery,
that translates to using metadata from various providers to build
indexes that users can search on. If DCES is indeed an acceptable set
of elements for cross domain discovery, then we want to make it as
simple as possible for clients (e.g., consuming services) to get a "dc
record" for a resource (item, collection, whatever) and add that
information to some query-able index so users can search on these
semantic buckets.
The more difficult we make it for clients and services to understand the
record format, by creating a more complex record format, the more we
interfere with interoperability across communities (as I see the
ultimate goal and advantage of our work with DCES). Looking back over a
series of DCMI discussions we can characterize increasing complexity of
records as follows:
1) A mixture of simple unqualified DC elements; the client simply
indexes the tokens of the element values
2) A mixture of qualified and unqualified DC elements; the client must
dumb-down community specific qualifiers to accommodate cross-community
interoperability
3) Rachel's examples of flat application profiles; the client must
pick out the dc elements from the community specific elements.
4) More complex intermixing at various tree-depth levels of elements
(which will inevitably occur as I have tried to point out); the client
must use a parser (e.g., SAX based) to pick out the DC elements
5) Various higher levels of complexity; the client must use
transformation technologies such as XSLT or procedural methods to
extract a basic "dc record"
The more complex we get on the scale, the more it is necessary for the
client to have some auxiliary information (e.g., schema information) to
perform the ultimate task of extracting what it wants.
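To make the step from level 3 to level 4 concrete, here is a minimal
Python sketch. It is only an illustration under assumptions: the foo and
bar namespace URIs, the <record> wrapper, and the nested layout are all
invented, the DC namespace URI is assumed rather than taken from this
thread, and it uses the standard library's ElementTree where level 4
above mentions a SAX-style parser.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the DC element set; not stated in this thread.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Level 3: flat record. The foo/bar namespaces and <record> are hypothetical.
FLAT_RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:foo="http://example.org/foo/"
        xmlns:bar="http://example.org/bar/">
  <dc:creator>Carl Lagoze</dc:creator>
  <dc:title>Metadata and me</dc:title>
  <foo:pageCount>10</foo:pageCount>
  <bar:font>timesRoman</bar:font>
</record>"""

# Level 4: the same DC elements buried at different tree depths
# (hypothetical structure).
NESTED_RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:foo="http://example.org/foo/">
  <foo:description>
    <foo:authorship>
      <dc:creator>Carl Lagoze</dc:creator>
    </foo:authorship>
    <dc:title>Metadata and me</dc:title>
  </foo:description>
  <foo:pageCount>10</foo:pageCount>
</record>"""


def is_dc(elem):
    """True if the element belongs to the DC namespace."""
    return elem.tag.startswith("{" + DC_NS + "}")


def local_name(elem):
    """Strip the '{namespace}' prefix ElementTree puts on tags."""
    return elem.tag.split("}", 1)[1]


def dc_from_flat(xml_text):
    """Level 3 client: inspects only the direct children of the record."""
    root = ET.fromstring(xml_text)
    return [(local_name(e), e.text) for e in root if is_dc(e)]


def dc_from_tree(xml_text):
    """Level 4 client: has to walk the whole tree to find DC elements."""
    root = ET.fromstring(xml_text)
    return [(local_name(e), e.text) for e in root.iter() if is_dc(e)]
```

The flat client finds both DC elements in FLAT_RECORD but nothing in
NESTED_RECORD; only the tree-walking client recovers them there. Even
then it has thrown away the context (what does it mean for dc:creator to
sit under foo:authorship?), which is where the schema information
mentioned above comes in.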
In my original comment on the applications profile paper [3] I offered
an alternative to the "jigsaw puzzle": thinking of a DC record as a
rather pure projection or view of a more complex description. Such a
model distinguishes between the actual descriptions stored by
providers, which will ultimately be more complex than those that can be
formed by qualifying DC elements or even with the types of applications
profiles proposed by Rachel, and the views available to clients.
This is the type of thing we are trying to do in the Open Archives
initiative [4]. In this model we have a harvesting protocol that
permits a dialog between a client and server such as:
[client] tell me what metadata vocabularies you support?
[service] Dublin Core, FGDC, MARC
[client] show me the Dublin Core record for document xxx
[service] <dcRecord>
Under the covers the server may do all sorts of transformations from
internal descriptive models to the dc record; the client is relieved of
any burden and can consume simple dc records. Such a model similarly
allows services to support individual community needs by projecting
other "metadata records" that conform to community requirements.
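The projection idea can be sketched in a few lines of Python. This is
not the Open Archives protocol itself; the function names, the internal
record layout, and the "layout" vocabulary are all hypothetical, chosen
only to show the server doing the transformation so the client does not
have to.

```python
# Hypothetical internal description, richer than any DC record.
INTERNAL = {
    "xxx": {
        "authors": [{"family": "Lagoze", "given": "Carl", "role": "author"}],
        "title": {"main": "Metadata and me", "sub": None},
        "layout": {"pageCount": 10, "font": "timesRoman"},
    }
}


def to_dc(desc):
    """Project the internal description down to a simple DC view."""
    creators = [a["given"] + " " + a["family"] for a in desc["authors"]]
    return {"creator": creators, "title": [desc["title"]["main"]]}


def to_layout(desc):
    """Project a community-specific view (hypothetical 'layout' vocabulary)."""
    return dict(desc["layout"])


# One projection function per vocabulary the server is willing to expose.
PROJECTIONS = {"dc": to_dc, "layout": to_layout}


def list_vocabularies():
    """[client] tell me what metadata vocabularies you support?"""
    return sorted(PROJECTIONS)


def get_record(identifier, vocabulary):
    """[client] show me the <vocabulary> record for document <identifier>."""
    return PROJECTIONS[vocabulary](INTERNAL[identifier])
```

A client would call list_vocabularies() and then, for example,
get_record("xxx", "dc"). It never sees the internal structure, so the
provider is free to change it without affecting what the client
consumes.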
In closing, I find that the discussion returns to the issue of whether
DCES should be thought of as the foundation for native descriptions or
as a projection and interchange format to facilitate cross-domain
discovery. As I say here and in [2], increasing complexity (e.g.
intermixing dc elements with others in various ways in so-called
applications profiles or qualifying with highly structured values) will
interfere with the latter goal.
Carl
[1] http://www.mailbase.ac.uk/lists/dc-general/2000-08/0000.html
[2] http://www.mailbase.ac.uk/lists/dc-general/2000-08/0018.html
[3] http://www.mailbase.ac.uk/lists/dc-general/2000-08/0028.html
[4] http://www.openarchives.org (note that the content of these pages is
subject to revision after an Open Archives technical meeting in early
September. Of particular interest is the revision of the open archives
core metadata record to be conformant with DCES).
> -----Original Message-----
> From: Miller,Eric [mailto:[log in to unmask]]
> Sent: Monday, August 21, 2000 9:15 PM
> To: 'Carl Lagoze'; Miller,Eric; 'Jane Hunter'
> Cc: [log in to unmask]; [log in to unmask]
> Subject: RE: Applications profiles
>
>
>
>
>
> Hmmm... I might be jumping ahead here (Dan correct me if I'm
> wrong) but
> after talking with the XQL and Quilt people, it seems to me
> the answer is
> "sure this can be done if we know a priori the syntactic
> structure". The
> syntactic structure that the open archives group might choose
> might not be
> the same as other communities. Therefore integrating data from an open
> archives project with other initiatives that choose a
> different structure
> is far more difficult.
>