On Thu, Oct 20, 2011 at 10:51:07AM +0200, Paul Groth wrote:
> I was wondering in the Dublin Core view of the world whether being a
> domain model is a good thing or not?
In the "Dublin Core view" -- i.e., the Singapore Framework -- a Domain
Model specifies what descriptive metadata is about. It specifies the "things
in the world" that are being described and relationships between those things.
A Domain Model is an essential part of an Application Profile. Looking at the
Singapore Framework diagram, the Domain Model is informed by Functional
Requirements (to its left), and it is "decorated" with descriptive properties
and constraints (such as cardinality, value spaces, and the like) in a
Description Set Profile (to its right). A Description Set Profile, then, is a
full specification of the "content" of metadata descriptions, to be realized in
By asking whether the PROV-DM model is any different from a Domain Model, I was
wondering whether it could straightforwardly be slotted into the Singapore
Framework view and used as the basis of an Application Profile. More
precisely, I was thinking of the model as a Community Domain Model (on the
"Domain standards" level of ) -- a model which might get adapted or extended
in the Domain Model of a particular Application Profile (one level above in the
A somewhat closer examination of the Provenance drafts, however, reveals
> A note on complexity: I believe the core PROV model is fairly
> straightforward but there is complexity that helps to address a
> number of other use cases.
Generally speaking, one problem with complexity that I have seen in different
contexts is that increasing the complexity of a model also makes it more
specific (and accordingly more "brittle" to implement). Increasing specificity
can make some people very happy while turning off others who may only have
needed something simple -- or who may have wanted to complexify things too,
In the Semantic Web Deployment Working Group, for example, we tried very hard
to keep the base specification of SKOS as simple as possible, and we invoked
the principle of simplicity to keep some features out of scope -- while
explicitly inviting users to extend the model as needed.
I have no basis for judging whether the core PROV model is specified at the
"optimal" degree of complexity, given provenance requirements, for optimizing
applicability and uptake, but based on a superficial reading it feels like a lot
of the complexity that could be considered part of an extension or application
of the model is being put into the core model.
> I think as a working group we can do a
> better job of presenting the PROV-DM in a simplified fashion. We're
> working on it but obviously would love the DC's input on ensuring
> that it is simple and that it interacts well with Dublin Core.
In general, it feels like the PROV-DM document is trying to do too many things:
describe a data model and its semantics, provide illustrations, introduce new
notational and graphical conventions, and enumerate rules for inferencing
(though without actually declaring a vocabulary, see below).
My impulse would be to pare this long specification down into something more
focused by moving a lot of its contents to related specifications, and to
present users with pathways to adoption that do not implicitly require them to
understand -- and buy into, up-front -- a highly engineered model of provenance
in order to make any statements about provenance at all (arguably the message
implied by PROV-DM in its current form).
My instinct would be to "layer" this work differently by distinguishing between
an underlying vocabulary and domain model -- described as simply as possible
and defined with minimal semantic commitment -- and a more-specific
implementation of the model that adds a layer of specific constraints and
Such a distinction seems to be implicit in the existence of the PROV Ontology
Model specification alongside PROV-DM. But if it is the role of PROV-O to
define "a set of provenance-specific inference rules", it is unclear how these
inference rules relate to the "constraints" already defined in PROV-DM. For
example, one finds no reference in PROV-O to the PROV-DM constraint
"start-precedes-end". Could one perhaps remove this confusing duplication and
overlap -- and simplify PROV-DM in the process -- by simply moving all
treatment of inferences and constraints to PROV-O?
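To illustrate the layering suggested above, here is a minimal sketch in Python of a constraint kept apart from the bare statements it applies to. Only the name "start-precedes-end" comes from PROV-DM; the record shape, the function name, and the reading of the constraint as "a known start is not later than a known end" are assumptions made for illustration.

```python
from datetime import datetime
from typing import Optional


def start_precedes_end(start: Optional[datetime], end: Optional[datetime]) -> bool:
    """Hypothetical check: a known start must not be later than a known end."""
    if start is None or end is None:
        # With a missing timestamp there is nothing to contradict.
        return True
    return start <= end


# Statement layer: plain data, with no constraint attached (hypothetical record).
execution = {
    "start": datetime(2011, 10, 18, 9, 0),
    "end": datetime(2011, 10, 18, 8, 0),
}

# Constraint layer: applied only by implementations that opt in to it.
is_valid = start_precedes_end(execution["start"], execution["end"])
```

A consumer that only wants to record statements never calls the check; a consumer that wants validation applies it as a separate step.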
The Abstract Syntax Notation (PROV-ASN), then, also seems related more
closely to the model of constraints and entailments layered over the
vocabulary. Could it, too, be moved to PROV-O (or perhaps to its own
The remaining PROV-DM spec, then, could explicitly declare the RDF vocabulary
-- basic terms such as prov-dm:wasComplementOf -- something that seems to be
curiously missing from PROV-DM in its current state (for comparison, see ).
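As a rough sketch of what an explicit declaration could look like, the Python snippet below builds a small Turtle fragment that declares wasComplementOf as an RDF property. It assumes that the PROV-DM namespace is meant as an RDF namespace, which the draft does not confirm; the helper name and the label text are hypothetical.

```python
PROV_DM = "http://www.w3.org/ns/prov-dm/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def declare_property(local_name: str, label: str) -> str:
    """Return a Turtle snippet declaring one property in the PROV-DM namespace."""
    return (
        f"<{PROV_DM}{local_name}> a <{RDF}Property> ;\n"
        f'    <{RDFS}label> "{label}" .\n'
    )


# The label is an illustrative guess, not text taken from the specification.
turtle = declare_property("wasComplementOf", "was complement of")
print(turtle)
```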
I'm assuming here that the vocabulary of PROV-DM is an RDF vocabulary, though
the wording in Section 1.2 makes the nature of the namespace
http://www.w3.org/ns/prov-dm/ seem ambiguous. The explanation -- "All the
elements, relations, reserved names and attributes introduced in this
specification belong to the PROV-DM namespace" -- with its language of
"elements", "relations", and "attributes" -- seems to imply that this is _not_
a "namespace" of RDF or OWL properties and classes. If it is not, the reader
wonders, what sort of things are these "elements" and "attributes"? Is the
PROV-DM specification really based on RDF -- or is it deliberately avoiding RDF
in order to work at a different level of abstraction?
This ambiguity about RDF is reinforced by the special style of the diagrams.
The different shapes of nodes and types of dotted lines make the Graphical
Illustration in 4.3 look "busy" and require the reader to consult an
Appendix. Is there a reason, the reader wonders, that the model is not
depicted simply as an RDF graph?
The pared-down PROV-DM, as I envision it, would be something much more familiar
from a Dublin Core perspective. It would essentially consist of a Domain Model
and an RDF vocabulary -- both defined, ideally, with minimal semantic
commitment. A creator of a Dublin Core-style application profile might then
adapt the Domain Model and draw on the provenance properties and classes to
make a few simple provenance-related statements -- without needing to
understand and buy into the full set of constraints and entailments defined in
PROV-O. Implementors with more sophisticated requirements could, of course,
still turn to the more richly specified constructs of PROV-O.
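The kind of lightweight use described above might look like the following Python sketch, which mixes one Dublin Core property with one provenance property and serializes the result as N-Triples. The example.org resources and the title are hypothetical, and treating prov-dm:wasComplementOf as an ordinary RDF property is an assumption; no constraints or entailments are involved.

```python
DCTERMS = "http://purl.org/dc/terms/"
PROV_DM = "http://www.w3.org/ns/prov-dm/"
EX = "http://example.org/"  # hypothetical namespace for the described resources

# Each statement is a (subject, predicate, object) tuple; literals carry quotes.
triples = [
    (EX + "report-v2", DCTERMS + "title", '"Annual Report"'),
    (EX + "report-v2", PROV_DM + "wasComplementOf", EX + "report-v1"),
]


def to_ntriples(statements):
    """Serialize the tuples as N-Triples lines, wrapping IRIs in angle brackets."""
    lines = []
    for subject, predicate, obj in statements:
        rendered = obj if obj.startswith('"') else f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


ntriples = to_ntriples(triples)
print(ntriples)
```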
These comments are based on a very cursory perusal -- no time for a close
reading today...! -- so my apologies if I am overlooking some obvious points.
Please take these comments as "first impressions" -- though first impressions
are perhaps useful to know because they can determine whether an interested
user will keep reading.
> -- Paul Groth, co-chair W3C Provenance Working Group
> Thomas Baker wrote:
> >Dear all,
> >The W3C Provenance WG has published a working draft:
> > The PROV Data Model and Abstract Syntax Notation
> > W3C Working Draft 18 October 2011
> >Ivan Herman, in a blog post , characterizes this as "a core data model for
> >provenance for building representations of the entities, people and processes
> >involved in producing a piece of data or thing in the world".
> >To me, the model looks pretty complicated and specific -- the sort of model
> >one would design for a particular application profile... In terms of the
> >Singapore Framework, I'm wondering if this model is fundamentally different
> >from a "domain model" defining the basic entities of an application profile?
> > http://www.w3.org/TR/prov-dm/
> > http://www.w3.org/blog/SW/2011/10/18/first-draft-of-a-provenance-data-model-published/
> > http://www.w3.org/TR/prov-dm/overview.png
> > http://dublincore.org/documents/singapore-framework/
Tom Baker