Dear all,
Some sources of confusion with the existing DCAM have been terminological in
nature:
-- overlap of DCAM with RDF terms, some of which are "false friends" [1]
(e.g., an "RDF statement" is not the same as a "DCAM statement");
-- presence in the UML model of grouping mechanisms using "unusual" terminology
(e.g., "surrogate");
-- confusing distinction between entities representing "things in the data"
(syntactic "slots" in a data record) and "things in the world" (conceptual
things to which the information in the syntactic slots refers);
-- local identifiers for subjects and objects (in RDF terms, for blank nodes),
which are out of scope of the DCAM Description Set Model per se, though
present in DC-TEXT;
-- subtle differences between the entities of DCAM's Description Set Model
and entities of the DC-TEXT language.
I have mapped the terminologies of the DCAM Description Set Model, DC-TEXT, and
RDF in a table in the wiki [2] and would like to propose a radical
simplification.
I propose to limit the entities of the DCMI Abstract Model to the following,
and to use the same names for a DC-TEXT-inspired notation (which could be
folded into the "technical" DCAM document and used for examples):
    Record
    DescriptionSet
    Description
    StatementSet - alternative suggestions welcome - this will require discussion!!
    DescribedResourceURI
    DescribedResourceID - new!
    PropertyURI
    ValueURI
    ValueID - new!
    ValueString - re-defined with more restrictive meaning
    ValueLabel
    LanguageTag
    SyntaxEncodingSchemeURI
    SKOSConceptSchemeURI - proposed as equivalent replacement for VocabularyEncodingSchemeURI
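To make the proposed entity list more concrete, here is a minimal Python sketch of how the slots might nest. The nesting (Record > DescriptionSet > Description > StatementSet), the choice of which slots are optional, and the example values are all assumptions made for illustration; the proposal itself only names the entities.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StatementSet:
    # Slot names follow the proposed entity list.
    # Assumption: only PropertyURI is mandatory; every other slot is optional.
    property_uri: str
    value_uri: Optional[str] = None
    value_id: Optional[str] = None
    value_string: Optional[str] = None
    value_label: Optional[str] = None
    language_tag: Optional[str] = None
    syntax_encoding_scheme_uri: Optional[str] = None
    skos_concept_scheme_uri: Optional[str] = None


@dataclass
class Description:
    # Assumption: a Description may carry a URI, a local ID, or neither.
    described_resource_uri: Optional[str] = None
    described_resource_id: Optional[str] = None
    statement_sets: List[StatementSet] = field(default_factory=list)


@dataclass
class DescriptionSet:
    descriptions: List[Description] = field(default_factory=list)


@dataclass
class Record:
    # Assumption: a Record wraps exactly one DescriptionSet.
    description_set: DescriptionSet


# Hypothetical example data: one Description with one StatementSet.
record = Record(
    DescriptionSet(
        [
            Description(
                described_resource_uri="http://example.org/doc/1",
                statement_sets=[
                    StatementSet(
                        property_uri="http://purl.org/dc/terms/title",
                        value_string="An example title",
                        language_tag="en",
                    )
                ],
            )
        ]
    )
)
```

Nothing here carries semantics: every field is a "slot" holding a string, which is the point of limiting the model to "things in the data".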
I propose to _drop_ the following as entities of the DCAM Description Set Model and
DC-TEXT language and to use them simply in illustrative patterns:
    Value Surrogate
    Non-Literal Value Surrogate
    Literal Value Surrogate
    Plain Value String
    Typed Value String
The Descriptive Patterns (or Design Patterns) could illustrate selected
combinations of DCAM "slots":
    Minimal DescriptionSet
        One Description with one StatementSet and no DescribedResourceURI
    DescriptionSet with multiple Descriptions
        Connected via DescribedResourceURI and ValueURI
        Connected via DescribedResourceID and ValueID
    Statements of the "Literal Value Pattern"
        PropertyURI + ValueString
        PropertyURI + ValueString + LanguageTag
        PropertyURI + ValueString + SyntaxEncodingSchemeURI
    Statements of the "Non-Literal Value Pattern" (pick three of the following):
        PropertyURI + ValueURI
        PropertyURI + ValueURI + SKOSConceptSchemeURI
        PropertyURI + ValueURI + SKOSConceptSchemeURI + ValueLabel
        PropertyURI + ValueURI + SKOSConceptSchemeURI + ValueLabel + LanguageTag
        PropertyURI + ValueURI + SKOSConceptSchemeURI + ValueLabel + SyntaxEncodingSchemeURI
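The statement patterns listed above can be read as sets of filled slots. The following Python sketch checks a statement against exactly those listed combinations. The function name `classify`, the dict-based representation, and the decision to treat anything not listed as unrecognized are assumptions for illustration, not part of the proposal.

```python
# Slot combinations copied from the pattern lists above.
LITERAL_PATTERNS = [
    {"PropertyURI", "ValueString"},
    {"PropertyURI", "ValueString", "LanguageTag"},
    {"PropertyURI", "ValueString", "SyntaxEncodingSchemeURI"},
]

NON_LITERAL_PATTERNS = [
    {"PropertyURI", "ValueURI"},
    {"PropertyURI", "ValueURI", "SKOSConceptSchemeURI"},
    {"PropertyURI", "ValueURI", "SKOSConceptSchemeURI", "ValueLabel"},
    {"PropertyURI", "ValueURI", "SKOSConceptSchemeURI", "ValueLabel",
     "LanguageTag"},
    {"PropertyURI", "ValueURI", "SKOSConceptSchemeURI", "ValueLabel",
     "SyntaxEncodingSchemeURI"},
]


def classify(slots):
    """Return "literal", "non-literal", or None for a dict of slot values.

    Hypothetical helper: a slot counts as filled when its value is not None,
    and only the combinations listed above are recognized.
    """
    filled = {name for name, value in slots.items() if value is not None}
    if filled in LITERAL_PATTERNS:
        return "literal"
    if filled in NON_LITERAL_PATTERNS:
        return "non-literal"
    return None


# Hypothetical example statements.
title = {
    "PropertyURI": "http://purl.org/dc/terms/title",
    "ValueString": "An example title",
    "LanguageTag": "en",
}
subject = {
    "PropertyURI": "http://purl.org/dc/terms/subject",
    "ValueURI": "http://example.org/concept/1",
    "SKOSConceptSchemeURI": "http://example.org/scheme",
}
```

Because the patterns are illustrative rather than formal entities, a real application could accept more combinations than these; the exact-match check simply mirrors the lists as written.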
I also propose to _drop_ all of the "thing-in-the-world" entities in the DCAM
Description Set Model: Described Resource, Property, Non-literal Value, Literal
Value, Vocabulary Encoding Scheme, and Syntax Encoding Scheme. The idea here
is to limit the DCAM Description Set Model to "syntactic slots" ("things in the
data") and point off to RDF, with its semantics, for the underlying theory
about the "things in the world" to which metadata refers.
I also propose to make dcam:memberOf equivalent to skos:inScheme and
dcam:VocabularyEncodingScheme equivalent to skos:ConceptScheme -- then use the
latter. We should consider using rdfs:label instead of rdf:value depending on
what the current RDF Working Group decides re: best practice guidelines [3].
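If the two equivalences were adopted, existing data could be rewritten mechanically to "use the latter". The Python sketch below does this over plain (subject, predicate, object) tuples. The function name `prefer_skos`, the tuple representation, and the example.org URIs are assumptions for illustration; the rdfs:label versus rdf:value question is left out because it is still undecided.

```python
DCAM = "http://purl.org/dc/dcam/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The two equivalences proposed above, DCAM term -> SKOS term.
EQUIVALENTS = {
    DCAM + "memberOf": SKOS + "inScheme",
    DCAM + "VocabularyEncodingScheme": SKOS + "ConceptScheme",
}


def prefer_skos(triples):
    """Replace each DCAM term with its proposed SKOS equivalent.

    Hypothetical helper: triples are (subject, predicate, object) tuples of
    URI strings; terms without a listed equivalent pass through unchanged.
    """
    return [
        tuple(EQUIVALENTS.get(term, term) for term in triple)
        for triple in triples
    ]


# Hypothetical example data.
old = [
    ("http://example.org/concept/1", DCAM + "memberOf",
     "http://example.org/scheme"),
    ("http://example.org/scheme", RDF_TYPE,
     DCAM + "VocabularyEncodingScheme"),
]
new = prefer_skos(old)
```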
Note that overall, the idea would be to keep the core model bone-simple and
push some of the things out of the current core model into "descriptive
patterns" that could be used as the basis of examples. The idea would also be
to align the DCAM Description Set Model perfectly with DC-TEXT, which would
be used for the descriptive patterns.
Also note that this proposal would mean changing the "interface" of the current
DCAM by pushing the Literal/Non-Literal Value Surrogates out of the core model
and replacing them with less formalized "descriptive patterns". See, for
contrast, Mikael's 2008 proposal for revising DCAM in a way that would maintain
its existing "syntactic" entities (while dropping the "semantic" entities in
favor of RDF) [4].
Tom
[1] http://en.wikipedia.org/wiki/False_friend
[2] http://wiki.dublincore.org/index.php/DCAM_Revision_Scratchpad#Terminology_compared
[3] http://www.w3.org/2011/rdf-wg/track/issues/27
[4] http://dublincore.org/architecturewiki/DCAM-2.0
--
Tom Baker