On Fri, May 29, 2009 at 10:29:32AM +0100, Sybille Peters wrote:
> We would like to define a specific Dublin Core XML format (or better still:
> use an existing Dublin Core format) to be used within a digital library
> (non-Dublin Core metadata already exists in various formats from 3rd parties
> and would need to be converted). We would like the new format to use the
> latest Dublin Core specifics and not have to be changed every couple of months.
>
> Here are the questions:
>
> 1) Is it recommended to use the newer DC-DS-XML format instead of the 2003 DC-XML
> format (since the DC-DS-XML supports the description set as described in the
> abstract model)? I don't really see any improvement for us. We will probably
> not be using more than one description within a description set. Also the
> XSD (http://dublincore.org/schemas/xmls/2008/09/01/dc-ds-xml/dcds.xsd) is
> very generic and does not make it possible to validate very much (the
> propertyUri, vesURI, sesUri etc. are all specified as xs:anyURI). In the
> DC-XML format the properties and encoding schemes are more or less specified
> in the XSD (e.g. DCMIType). Will the DC-DS-XML schema be further specified?
> Am I missing something?
>
> 2) Can you point me to more practical examples that implement Dublin Core in
> XML?
>
> 3) I don't understand the concept of a description set containing one or
> more descriptions. What is the joint context of these descriptions? Can you
> point me to use cases for this?
Hi Sybille,

Thank you for your questions [9]!

I think Andy answered the third question well [10]. In effect,
metadata "records" have traditionally contained descriptions of
more than one thing -- an author and a book, a manifestation and
a work. The notion of a "description set" simply makes this
practice explicit and separates the description of the book from
the description of an author in a machine-actionable way -- and
within the context of a single metadata record (i.e.,
description set).
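
As a rough illustration of the idea above, a description set can be
pictured as a container holding one description per resource. The
sketch below is a hypothetical in-memory model, not anything defined
by DCMI: the class names, the example.org URIs, and the choice of
properties are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One property/value pair about the described resource."""
    property_uri: str
    value: str                  # literal text, or a URI if value_is_uri is True
    value_is_uri: bool = False


@dataclass
class Description:
    """All statements about exactly one resource."""
    resource_uri: str
    statements: list = field(default_factory=list)


@dataclass
class DescriptionSet:
    """One metadata "record": one or more related descriptions."""
    descriptions: list = field(default_factory=list)

    def describing(self, resource_uri):
        """Return the description of the given resource, or None."""
        for description in self.descriptions:
            if description.resource_uri == resource_uri:
                return description
        return None


DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
BOOK = "http://example.org/book/1"        # hypothetical identifiers
AUTHOR = "http://example.org/person/7"

# One record, two descriptions: the book and its author are kept apart,
# and the creator statement links them by URI.
record = DescriptionSet([
    Description(BOOK, [
        Statement(DCTERMS + "title", "An Example Book"),
        Statement(DCTERMS + "creator", AUTHOR, value_is_uri=True),
    ]),
    Description(AUTHOR, [
        Statement(FOAF + "name", "A. N. Author"),
    ]),
])

for description in record.descriptions:
    print(description.resource_uri, len(description.statements))
```

The "joint context" is simply the record itself: software can tell
which statements are about the book and which are about the author
without guessing from element names.
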
For questions 1 and 2, the more general point is that we are
moving from a world in which metadata records have been managed
within specific, known contexts (e.g., a database or catalog) to
a "Web world" where the data from your system needs to be
exported to, linked up with, or integrated with data from many
other sources. The W3C defined the RDF model to provide a
generic form for data so that it can be easily integrated with
other data based on that form, and this has provided the basis
for the movement known as "Linked Data" (http://linkeddata.org).

In the Web world, the specific format of your data matters less
than the convertibility (or not) from that format into the
common generic form, RDF. XML formats such as RDF/XML are
designed for the serialization of RDF data. It is also possible
to use other XML formats in association with transformation
algorithms (GRDDL) in order to express the data in RDF, though
where such algorithms are retro-fitted to existing XML formats,
that process may be messy or lossy.

Specifications like DC-DS-XML are designed to be transformable
into RDF cleanly and automatically. Indeed, a major motive
for specifying the DCMI Abstract Model has been to help design
metadata records with well-defined mappings to RDF and whose
contents can therefore be straightforwardly merged into a
landscape of Linked Data. In terms of the diagram in
"Interoperability Levels for Dublin Core Metadata" [1], data
that is expressible as Linked Data is interoperable at "Level
2".
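
To make the point about merging concrete, here is a minimal sketch
that treats RDF data as a plain set of (subject, predicate, object)
triples. It is a deliberate simplification: real RDF merging also has
to deal with blank nodes, datatypes, and language tags, which are
ignored here. The example.org URIs and both data sources are
hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
BOOK = "http://example.org/book/1"        # hypothetical identifiers
AUTHOR = "http://example.org/person/7"

# Triples exported from a (hypothetical) library catalogue.
catalogue = {
    (BOOK, DCTERMS + "title", "An Example Book"),
    (BOOK, DCTERMS + "creator", AUTHOR),
}

# Triples published independently by some other source.
other_source = {
    (AUTHOR, FOAF + "name", "A. N. Author"),
    (BOOK, DCTERMS + "title", "An Example Book"),   # duplicate, collapses on merge
}

# With URIs as shared identifiers, merging two graphs is a set union.
merged = catalogue | other_source


def objects(graph, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def creator_names(graph, book):
    """Follow dcterms:creator, then foaf:name, within a single graph."""
    names = []
    for creator in objects(graph, book, DCTERMS + "creator"):
        names.extend(objects(graph, creator, FOAF + "name"))
    return names


print(creator_names(catalogue, BOOK))   # the catalogue alone has no name
print(creator_names(merged, BOOK))      # the merged graph can answer
```

No format-specific conversion code is needed at merge time; the only
thing the two sources had to agree on was the URI of the author.
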
To evaluate the XML format options in this framework:

-- Most formats that use Dublin Core in XML, such as the
   oai_dc XML format defined by the OAI-PMH specification
   [2], most uses of "Dublin Core elements" as extension
   elements in the Atom Syndication Format [3], and most of
   the formats based on the 2003 DCMI guidelines [4], are
   interoperable at "Level 1" -- i.e., they are represented
   using an XML format with no well-defined mapping to the
   RDF model, and hence are not easily exposable as Linked Data.

-- For Level 2 interoperability, any syntax for serializing
   RDF (e.g., RDF/XML) can be used. Alternatively, an
   application-specific XML format can be defined with a
   mapping to the RDF model, ideally using the W3C GRDDL
   specification [5].

The DC-DS-XML specification can be used to make just such a
format, as it does have a well-defined mapping to RDF, which
is made available in a machine-actionable form as a GRDDL
Namespace Transformation [6]. In addition to supporting
multiple descriptions, DC-DS-XML distinguishes URIs used as
resource identifiers from other text strings used as
literals, and it distinguishes between vocabulary encoding
schemes and syntax encoding schemes (data types). The
DC-DS-XML specification is currently being finalized as a
DCMI Recommendation (only minor changes are expected, e.g.,
in attribute names). DC-DS-XML is intended to be easily
usable with technologies like XPath and not to be tied to any
single XML schema technology. Like the description set model
of the DCMI Abstract Model, it is not limited to the use of
any specific set of "terms", and that is reflected in the
dcds.xsd W3C XML Schema that you point to above - as you say,
the attribute values are typed to be xsd:anyURI. With its
additional support for the description set model, DC-DS-XML
corresponds to Level 3 in the Interoperability Levels.
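
The distinctions mentioned above (URI versus literal, vocabulary
encoding scheme versus syntax encoding scheme) matter because they
decide what kind of RDF term a value becomes. The sketch below writes
statements as N-Triples lines. It is an illustration under stated
assumptions, not the GRDDL transform [6]: the function names and the
example.org URI are hypothetical, and the use of a dcam:memberOf
triple for the vocabulary encoding scheme reflects my reading of
DCMI's RDF guidelines and should be checked against them.

```python
DCTERMS = "http://purl.org/dc/terms/"
DCAM_MEMBER_OF = "http://purl.org/dc/dcam/memberOf"   # assumed mapping for VES
DOC = "http://example.org/doc/1"                      # hypothetical identifier


def escape(text):
    """Escape the characters that N-Triples string literals require."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def object_term(value_uri=None, value_string=None, ses_uri=None):
    """Return the N-Triples object for a value: URI, plain or typed literal."""
    if value_uri is not None:
        return "<%s>" % value_uri                 # resource identifier
    if value_string is None:
        raise ValueError("need a value URI or a value string")
    literal = '"%s"' % escape(value_string)
    if ses_uri is not None:
        literal += "^^<%s>" % ses_uri             # syntax encoding scheme as datatype
    return literal


def statement_triples(subject_uri, property_uri, value_uri=None,
                      value_string=None, ses_uri=None, ves_uri=None):
    """Return the N-Triples lines for one statement."""
    lines = ["<%s> <%s> %s ." % (
        subject_uri, property_uri,
        object_term(value_uri, value_string, ses_uri))]
    if ves_uri is not None and value_uri is not None:
        # Vocabulary encoding scheme: say which vocabulary the value belongs to.
        lines.append("<%s> <%s> <%s> ." % (value_uri, DCAM_MEMBER_OF, ves_uri))
    return lines


for line in statement_triples(DOC, DCTERMS + "issued",
                              value_string="2009-05-29",
                              ses_uri=DCTERMS + "W3CDTF"):
    print(line)

for line in statement_triples(DOC, DCTERMS + "type",
                              value_uri="http://purl.org/dc/dcmitype/Text",
                              ves_uri=DCTERMS + "DCMIType"):
    print(line)
```

A format that stores every value as undifferentiated element text
cannot make these choices automatically, which is one reason a
retro-fitted transform tends to be lossy.
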
So to return to your specific question "Is it recommended to use
the newer DC-DS-XML format?", it depends what you want to
achieve. If one has no immediate requirement for Linked Data
compatibility, the 2003 DC-XML specification may serve the
purpose at hand. However, if convertibility of the data might
become important in the future, one could plan for this by
designing an XML format with a GRDDL transform or by using
DC-DS-XML.

Pete Johnston is preparing a more detailed technical response
with examples of formats at different interoperability levels.

Tom

[1] http://dublincore.org/documents/2009/05/01/interoperability-levels/
[2] http://www.openarchives.org/OAI/openarchivesprotocol.html#dublincore
[3] http://www.ietf.org/rfc/rfc4287.txt
[4] http://dublincore.org/documents/2003/04/02/dc-xml-guidelines/
[5] http://www.w3.org/TR/grddl/
[6] http://purl.org/dc/transform/dc-ds-xml-20080901-grddl/dcds2rdfxml.xsl
[7] http://iesr.ac.uk/metadata/
[8] http://www.ukoln.ac.uk/repositories/digirep/index/SWAP
[9] http://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind0905&L=DC-GENERAL&P=8654
[10] http://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind0905&L=DC-GENERAL&P=9381
--
Thomas Baker