JISCMail - DC-ARCHITECTURE Archives

----------------------------------------------------------------------
NOTE: This text has also been posted to the new DCMI wiki as a proposed
answer to a Frequently Asked Question, 
http://wiki.dublincore.org/index.php/FAQ/DCMI_Abstract_Model.  It is 
posted here to stimulate discussion of its interpretation and of its 
conclusions.
----------------------------------------------------------------------

What is the DCMI Abstract Model, and what is its current status?

The DCMI Abstract Model (DCAM) [1] is a specification which defines an abstract
syntax for metadata records that is independent of, but mappable to, a
diversity of concrete implementation syntaxes such as HTML/XHTML, XML, and any
of the concrete syntaxes defined for RDF.  The DCAM was developed as a basis
for defining validatable metadata records, in a variety of popular
implementation syntaxes, whose contents could straightforwardly be exposed as
RDF triples. 

=== A technical summary of DCAM ===

Most of the components of DCAM's abstract syntax correspond unambiguously to
components of the RDF abstract syntax [7].  However, the DCAM and RDF were designed
for different purposes.  Whereas the purpose of RDF is to "describe reality",
the purpose of DCAM is to specify metadata record structures. To do this, DCAM 
defines several grouping constructs not covered by the RDF standards as of 2011,
notably: 

* Description Sets - the basis for "records" describing more than one resource
* Descriptions - sets of statements about a single resource
* DCAM Statements - like RDF statements, but with additional context about
  each statement's value
* Value Surrogates - alternative slots for information describing
  string values versus values identified by URIs or blank nodes

These grouping constructs are used to cluster, at various levels of
granularity, sets of syntactic slots holding the URIs and string literals used in
instance metadata -- the URIs used for identifying described resources, values,
properties, vocabulary encoding schemes, or RDF datatypes, and the literals
used as language tags or text strings -- in short, the components of a metadata
record that can be tested or validated.
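
The nesting of these constructs can be sketched as a handful of Python data
classes.  This is a simplified illustration rather than a rendering of the
normative model in [1]: the class and field names are hypothetical, the
example.org URIs are placeholders, and iter_slots() is an invented helper that
simply lists the slots a validator would have to inspect.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple, Union


@dataclass(frozen=True)
class ValueString:
    """A text string, optionally qualified by a language tag or datatype URI."""
    text: str
    language: Optional[str] = None
    datatype_uri: Optional[str] = None


@dataclass(frozen=True)
class LiteralValueSurrogate:
    """Slot for a value that is itself a string."""
    value_string: ValueString


@dataclass(frozen=True)
class NonLiteralValueSurrogate:
    """Slot for a value identified by a URI, or left unnamed like a blank node."""
    value_uri: Optional[str] = None
    vocabulary_encoding_scheme_uri: Optional[str] = None
    value_strings: Tuple[ValueString, ...] = ()


@dataclass(frozen=True)
class Statement:
    """One property URI paired with one value surrogate."""
    property_uri: str
    value_surrogate: Union[LiteralValueSurrogate, NonLiteralValueSurrogate]


@dataclass
class Description:
    """Statements about a single resource."""
    statements: List[Statement]
    described_resource_uri: Optional[str] = None


@dataclass
class DescriptionSet:
    """A 'record': one or more descriptions grouped together."""
    descriptions: List[Description] = field(default_factory=list)


def iter_slots(description_set: DescriptionSet) -> Iterator[Tuple[str, str]]:
    """Yield (slot name, content) pairs for every filled slot in the record."""
    for description in description_set.descriptions:
        if description.described_resource_uri:
            yield ("described resource URI", description.described_resource_uri)
        for statement in description.statements:
            yield ("property URI", statement.property_uri)
            surrogate = statement.value_surrogate
            if isinstance(surrogate, LiteralValueSurrogate):
                strings = (surrogate.value_string,)
            else:
                if surrogate.value_uri:
                    yield ("value URI", surrogate.value_uri)
                if surrogate.vocabulary_encoding_scheme_uri:
                    yield ("vocabulary encoding scheme URI",
                           surrogate.vocabulary_encoding_scheme_uri)
                strings = surrogate.value_strings
            for value_string in strings:
                yield ("value string", value_string.text)
                if value_string.language:
                    yield ("language tag", value_string.language)
                if value_string.datatype_uri:
                    yield ("datatype URI", value_string.datatype_uri)


# Illustrative record: one description with a literal and a non-literal value.
example = DescriptionSet([
    Description(
        described_resource_uri="http://example.org/doc/1",
        statements=[
            Statement("http://purl.org/dc/terms/title",
                      LiteralValueSurrogate(
                          ValueString("DCMI Abstract Model", language="en"))),
            Statement("http://purl.org/dc/terms/subject",
                      NonLiteralValueSurrogate(
                          value_uri="http://example.org/subjects/metadata",
                          vocabulary_encoding_scheme_uri="http://example.org/subjects")),
        ],
    )
])
slots = list(iter_slots(example))
```

Each pair in slots names one syntactic slot and its content, which is the
level at which a record format could constrain or test a record.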

The DCMI Abstract Model was designed to be used together with a constraint
language [3] for specifying the contents of application-specific metadata
record formats in a form independent of particular concrete encoding syntaxes,
as exemplified in an application profile [4].  The record formats thus
specified were intended to be implemented using concrete encoding syntaxes. Two
concrete syntaxes for the representation of DCAM description sets were defined:
an HTML/XHTML metadata profile [5] and an XML format [6]. In addition, a
mapping between the DCAM abstract syntax and the RDF abstract syntax was
defined, allowing DCAM to be used with any concrete syntax for RDF [7].
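
As a rough illustration of what such a mapping involves, the sketch below
flattens a description set into RDF-style triples.  It loosely follows the
approach of [7] but is not the normative mapping: the dictionary layout, the
key names, the to_triples() function, and the example.org URIs are all
assumptions made for this example, and language tags and datatypes are left
out for brevity.

```python
import itertools

# Property URIs assumed here for linking a value to its vocabulary encoding
# scheme and to its value strings.
DCAM_MEMBER_OF = "http://purl.org/dc/dcam/memberOf"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"


def to_triples(description_set):
    """Flatten a list of description dicts into (subject, predicate, object) tuples.

    Literal objects are represented as ("literal", text); everything else is a
    URI string or a generated blank node label such as "_:b0".
    """
    blank_ids = (f"_:b{n}" for n in itertools.count())
    triples = []
    for description in description_set:
        # A description without a resource URI describes an unnamed resource.
        subject = description.get("resource_uri") or next(blank_ids)
        for statement in description["statements"]:
            prop = statement["property_uri"]
            if "literal" in statement:
                # Literal value surrogate: the string itself is the object.
                triples.append((subject, prop, ("literal", statement["literal"])))
                continue
            # Non-literal value surrogate: the object is a URI or blank node.
            value = statement.get("value_uri") or next(blank_ids)
            triples.append((subject, prop, value))
            if "ves_uri" in statement:
                triples.append((value, DCAM_MEMBER_OF, statement["ves_uri"]))
            for text in statement.get("value_strings", []):
                triples.append((value, RDF_VALUE, ("literal", text)))
    return triples


record = [
    {
        "resource_uri": "http://example.org/doc/1",
        "statements": [
            {"property_uri": "http://purl.org/dc/terms/title",
             "literal": "DCMI Abstract Model"},
            {"property_uri": "http://purl.org/dc/terms/subject",
             "value_uri": "http://example.org/subjects/metadata",
             "ves_uri": "http://example.org/subjects",
             "value_strings": ["Metadata"]},
        ],
    }
]
triples = to_triples(record)
```

Note that the output is a flat list: the boundaries of the description set
and of its descriptions do not survive the flattening, which is the kind of
grouping information DCAM adds on top of RDF.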

=== History of DCAM ===

The DCMI Abstract Model and its family of related applications [8] were in
active development between 2003 and 2008.  Work on the DCAM was originally
undertaken in response to a proliferation of "Dublin Core" implementations
among which interoperability was problematic due to an uncontrolled diversity
of underlying models and syntaxes, negating many of the potential advantages of
using a common vocabulary.  

RDF, standardized by W3C and revised in a more robust second version in 2004,
was recognized by the DCAM authors as an obvious basis for their abstract model.
At the time, however, RDF was seen by a large part of the Dublin Core community
as a research project -- less as a fundamentally new way of conceptualizing
metadata than as an ordinary XML format, though one perceived as over-complex.
Instead of adopting RDF outright, therefore, the DCAM authors framed their
project as one of clarifying and formalizing the native-DCMI model of metadata
that had emerged from early Dublin Core workshops [9], in a form that could be
aligned with RDF over time.

By the late 2000s, the technological landscape had significantly changed.  The
idea of Linked Data, introduced in 2006 as a more accessible and focused
variant of the Semantic Web vision of 2000, had acquired significant momentum
-- a trend which validated DCAM's grounding in RDF.  However, new developments
in the mainstream Semantic Web community overlapped with some of the more
innovative aspects of the DCAM approach.  Notable examples include the notion
of a SKOS Concept Scheme in W3C's Simple Knowledge Organization System, which
provided a widely understood near-equivalent to the DCMI-specific notion of a
Vocabulary Encoding Scheme [10]; the renewed effort to clarify the semantics of
Named Graphs, which as of 2011 promises eventually to provide an RDF construct
analogous to the Description Set [11]; and the development by W3C of RDFa, a
specification for embedding RDF metadata unobtrusively into normal Web pages,
which provided an alternative to the DCAM-based HTML/XHTML encoding guidelines
[12,13].

=== Future development of DCAM ===

It is difficult to track the use of freely available specifications once they
are released on the Web, but as of 2010, the DCAM-related specifications, with
the possible exception of specific syntax guidelines, appear not to have been
widely implemented.  Rather than building a bridge from more traditional
metadata communities to the Semantic Web, the Abstract Model seems to have
fallen between two stools -- its use of the Description Set abstraction
perplexing to users more accustomed to metadata specifications defined in terms
of a concrete syntax, and its added layer of DCMI-specific terminology
confusing to users already comfortable with RDF.  

In 2010, DCMI undertook a critical review of the DCAM approach [14].
Discussion of the review at a joint meeting of the DCMI Architecture Forum and
the W3C Library Linked Data Incubator Group at DC-2010 [15] revealed a striking
lack of consensus about the meaning -- and value -- of the DCAM approach.  Some
discussants agreed with its authors in seeing the DCAM as an abstract syntax
for metadata records based on, and thus mappable to, RDF. Others, however, saw
the potential value of DCAM as a "meta-model" describing the components of
metadata descriptions at a high level of abstraction, independently of any
basis in RDF [16].

As written, the DCAM is clearly framed as the former -- i.e., as the basis for
automating the creation of validatable metadata records whose contents can
straightforwardly be exposed as RDF triples.  Attaining such degrees of
interoperability and automation implies a well-defined modeling basis, and the
authors of DCAM saw RDF as the only candidate model with any traction.

Proponents of the "DCAM as meta-model" view, in contrast, felt that the model
had value independently of RDF -- i.e., that the notions of "statements" grouped
into "descriptions" and enclosed within a "description set" remain valid in the
absence of RDF's grammar of properties, classes, datatypes, and statements.
Whatever the merits of this latter view, it was clear that to define DCAM as a
meta-model independently of RDF, the base specification would need to be
extensively re-written.

As of 2011, the DCMI Abstract Model retains the status of a DCMI
Recommendation, reflecting the process of public comment periods and revisions
through which it has passed.  However, its authors have moved on to other
projects, and active development of the specification has ceased.  

[Disclaimer: the following paragraph is still under discussion by DCMI.]

In the absence of a strong set of reference implementations, the DCAM should be
viewed as a largely theoretical specification.  Regaining momentum on the DCAM
as a basis for specifying RDF-compatible metadata records would require more
resources -- authors, editors, and implementors with clear requirements -- than
DCMI can currently bring to bear.  Re-casting the DCAM as a meta-model on a
higher level of abstraction, on the other hand, would require just as much
effort, together with a well-grounded story for the function and value of such
a meta-model in today's metadata landscape.  For now, DCMI has chosen to leave
the specifications in place -- both for their value as historical contributions
and potentially as sources of requirements for future projects -- with the
clarifications offered here.

[1] http://dublincore.org/documents/abstract-model
[2] http://wiki.dublincore.org/index.php/Glossary/DCMI_Abstract_Model
[3] http://wiki.dublincore.org/index.php/Glossary/Description_Set_Profile
[4] http://wiki.dublincore.org/index.php/Glossary/Application_Profile
[5] http://dublincore.org/documents/dc-html/ 
[6] http://dublincore.org/documents/dc-ds-xml/
[7] http://dublincore.org/documents/dc-rdf/
[8] http://wiki.dublincore.org/index.php/Glossary/DCAM_Family_of_Specifications
[9] http://wiki.dublincore.org/index.php/Glossary/Dublin_Core_Grammatical_Principles
[10] http://www.w3.org/TR/skos-reference/
[11] http://w3.org/2011/rdf-wg/
[12] http://www.w3.org/TR/rdfa-syntax/
[13] http://dublincore.org/documents/dc-html/
[14] http://wiki.dublincore.org/index.php/Review_of_DCMI_Abstract_Mode
[15] http://www.w3.org/2005/Incubator/lld/wiki/F2F_Pittsburgh
[16] http://lists.w3.org/Archives/Public/public-lld/2010Oct/0098.html