Please update my email address to be [log in to unmask]
Thanks
-----Original Message-----
From: DCMI Architecture Forum [mailto:[log in to unmask]] On Behalf Of DC-ARCHITECTURE automatic digest system
Sent: Wednesday, February 15, 2012 7:03 PM
To: [log in to unmask]
Subject: DC-ARCHITECTURE Digest - 14 Feb 2012 to 15 Feb 2012 (#2012-23)
There are 14 messages totaling 2954 lines in this issue.
Topics of the day:
1. DCAM - where we stand (2)
2. DCAM telecon - additional links to Gap Analysis and Son of Dublin Core
3. DCAM - collecting requirements and examples (9)
4. Just some food for thought... (2)
----------------------------------------------------------------------
Date: Wed, 15 Feb 2012 10:12:45 +0000
From: "Greenberg, Jane" <[log in to unmask]>
Subject: Re: DCAM - where we stand
Tom, all ...
I really like this initial stab at a general message, and my sense is that the use of the word 'slots' will resonate with folks. At least this is what I think at the moment, and it made the description easy to understand for me. This is what is needed in the user-facing documentation: it can reach beyond those immersed in DCAM, and not scare those who are new to this information.
Two brief comments --
~ In sentence one, I wanted to say "connected" slots, but I'm not sure it's necessary. The indication of a 'defined structure' at the end can stand for this. I'm just thinking that it is the linking (connecting) of slots that is key.
~ Perhaps a given, but I think it would be good to list a few examples beyond book. My bias perhaps, but could we list things like a dataset, an image, and I would like to list person and event? (Do 'Things' like person or event cause a problem? They may be seen as unorthodox in DCAM... I'm not sure, or in conflict w/ FRBR?)
Best wishes, jane
-----Original Message-----
From: DCMI Architecture Forum [mailto:[log in to unmask]] On Behalf Of Thomas Baker
Sent: Wednesday, February 15, 2012 12:05 AM
To: [log in to unmask]
Subject: Re: DCAM - where we stand
On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:
> And there was some support for the idea that a clear "punchline" for
> DCAM, along the lines of the one-liner summarizing SKOS in plain
> language, would be helpful.
A bit longer than the one-liner for SKOS, but here's my stab at a "general message" for DCAM:
Seen as data artifacts, Metadata Records consist of slots holding
information items in a defined structure. A Metadata Record may describe a
single Thing of interest (such as a Book) or a cluster of closely related
Things (such as a Book and its Author). More abstractly, a Metadata Record
may be seen as a Description Set encompassing just one Description (i.e.,
about the Book) or multiple Descriptions (about both the Book and the
Author).
A Description consists of one or more Statements about the Thing Described
(e.g., stating the Name and Birthdate of an Author). The Thing Described
by a Description may be identified using a URI. A Statement about the
Thing Described has one slot for an Attribute (Property) and one slot for a
Value. Attribute slots are filled with names of attributes (properties);
in DCAM, attributes are "named" using URIs. Value slots are filled with
Value Strings, URIs, or blank Value Placeholders. A Value String may be
stated as belonging to a named set of strings (known as a Syntax Encoding
Scheme). A Value URI may be stated as belonging to a named set of URIs
(known as a Vocabulary Encoding Scheme). In practice, Statements may be
viewed in the context of Statement Sets. Statement Sets may follow common
patterns.
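As a rough sketch of the structures named above (Description Sets, Descriptions, and Statements with attribute and value slots), one might model them as plain data; all class and field names here are invented for illustration and come from no spec:

```python
# Hypothetical sketch of the DCAM structures described above; class and
# field names are illustrative only, not taken from the DCAM specification.
from dataclasses import dataclass, field
from typing import Optional, Union

@dataclass
class ValueString:
    value: str
    syntax_encoding_scheme: Optional[str] = None  # URI of a named set of strings

@dataclass
class ValueURI:
    uri: str
    vocabulary_encoding_scheme: Optional[str] = None  # URI of a named set of URIs

@dataclass
class ValuePlaceholder:  # a "blank" value slot
    pass

@dataclass
class Statement:
    attribute: str  # attribute slot: the property, "named" with a URI
    value: Union[ValueString, ValueURI, ValuePlaceholder]  # value slot

@dataclass
class Description:
    thing_described: Optional[str] = None  # optional URI for the Thing Described
    statements: list = field(default_factory=list)

@dataclass
class DescriptionSet:  # the abstract view of a Metadata Record
    descriptions: list = field(default_factory=list)

# A record describing a Book and its Author -- two closely related Things.
book = Description(
    thing_described="http://example.org/book/1",
    statements=[Statement("http://purl.org/dc/terms/title",
                          ValueString("Moby-Dick"))],
)
author = Description(
    statements=[Statement("http://xmlns.com/foaf/0.1/name",
                          ValueString("Herman Melville"))],
)
record = DescriptionSet(descriptions=[book, author])
print(len(record.descriptions))  # 2
```

The point of the sketch is only that a "record" decomposes into two Descriptions, each a bag of attribute/value slot pairs, with the Thing URI optional.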
The Dublin Core Abstract Model (DCAM) provides a language for representing
the structure of specific Metadata Records -- put more abstractly, to
specify a Description Set Profile -- in a form that is independent of
particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
RelaxNG, relational databases, Schematron, or JSON.
In order to provide compatibility with Semantic Web and Linked Data
applications, however, DCAM is fully aligned with the Model and Abstract
Syntax of RDF. (Note that the RDF abstract model is the basis for -- thus
distinct from -- concrete RDF encoding technologies such as RDF/XML,
N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for
understanding DCAM on an informal level.
DCAM provides a language for expressing common patterns of Statements --
patterns that may be partially or fully encoded using specific Concrete
Encoding Technologies. Indeed, some readers may find the example patterns
used in designing DCAM more accessible and useful, as models and templates
for implementation, than the formal specification of DCAM itself.
Details aside, this text illustrates the sort of high-level description I think we would need to have -- both as an explanation to our intended audience and as a guide to ourselves in the design phase. I'm not sure whether the mixing of references to syntax ("slots", "Value URI") and semantics ("Thing Described") in this draft is a bug -- or a feature. I also wonder how DCAM can close the gap to expressing the constraints of real application profiles without introducing DC-DSP-like notions such as "templates" and "constraints" [1].
For discussion...
Tom
[1] http://dublincore.org/documents/dc-dsp/
--
Tom Baker <[log in to unmask]>
------------------------------
Date: Wed, 15 Feb 2012 08:34:54 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM telecon - additional links to Gap Analysis and Son of Dublin Core
I was always very impressed by SoDC. It's incomplete, but the notion of a
concrete syntax that can be used to express semantics in one branch and
constraints in another with an exemplar processing facility has always made
enormous sense to me and would seem to be a good use case for a DCAP/DCAM
spec.
Jon
On Tuesday, February 14, 2012, Thomas Baker <[log in to unmask]> wrote:
> On Tue, Feb 14, 2012 at 01:32:58PM -0500, Tom Baker wrote:
>> Date: 2012-02-15 Wednesday, 1100 EST
>> Expected: Tom Baker (chair), Mark Matienzo, Antoine Isaac, Stuart
>> Sutton, Aaron Rubinstein, Jon Phipps, Gordon Dunsire, Kai Eckert
>> Regrets: Corey Harper, Richard Urban
>
> On tomorrow's call I'd like to continue the discussion, even though
> we will have some important absences.
>
> I noticed just after posting that I hadn't added the link to the
> (beginnings of a) gap analysis [1].
>
> At Corey's request, I included a link to Alistair Miles's Son of Dublin
> Core (below).
>
> Tom
>
> [1] http://wiki.dublincore.org/index.php/DCAM_Revision_Gap_Analysis
>
>> -- Alistair Miles's Son of Dublin Core (SoDC)
>> http://aliman.googlecode.com/svn/trunk/sodc/SoDC-0.2/index.html
>> http://aliman.googlecode.com/svn/trunk/sodc/SoDC-0.2/release/SoDC-0_2.zip - everything, zipped
>
> --
> Tom Baker <[log in to unmask]>
>
--
Jon
I check email just a couple of times daily; to reach me sooner, click here:
http://awayfind.com/jonphipps
------------------------------
Date: Wed, 15 Feb 2012 08:49:32 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
I've been doing some wandering around in JSON land for the last few days
and, as part of a continuing observation that RDF is an implementation
detail rather than a core requirement, I'd like to point to this post from
James Snell
http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
And the JSON Schema spec: http://json-schema.org/
Jon,
who may someday get his act together and pay attention to these meetings
more than a couple of hours before the meeting.
On Tuesday, February 14, 2012, Thomas Baker <[log in to unmask]> wrote:
> On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:
>> -- that DCAM should be developed using a test-driven approach, with
>> effective examples and test cases that can be expressed in various
>> concrete syntaxes.
>
> Jon suggested that we take Gordon's requirements for metadata record
> constructs [1] as a starting point. As I understand them, these are:
>
> -- the ability to encode multicomponent things (which in the cataloging
> world happen to be called "statements", as in "publication statement"
> and "classification statement") either:
>
> -- as unstructured strings, or
> -- as strings structured according to a named Syntax Encoding Scheme, or
> -- as Named Graphs with individual component triples
>
> -- the ability to express the repeatability of components in such
> "statements"
>
> -- the ability to designate properties as "mandatory", or "mandatory if
> applicable", and the like
>
> -- the ability to constrain the cardinality of "subsets of properties"
> within a particular context, such as the FRBR model
>
> -- the ability to express mappings between properties in different
> namespaces.
>
> It has also been suggested that we find examples of real metadata instance
> records from different communities and contexts -- e.g., libraries,
> government, industry, and biomed -- for both testing and illustrating DCAM
> constructs.
>
> Tom
>
> [1] https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1202&L=dc-architecture&P=6405
>
> --
> Tom Baker <[log in to unmask]>
>
--
Jon
I check email just a couple of times daily; to reach me sooner, click here:
http://awayfind.com/jonphipps
------------------------------
Date: Wed, 15 Feb 2012 06:47:02 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
What *does* seem to be core in this blog post is the use of http URIs
for values. I'd add to that: properties defined with http URIs, so you
know what you are describing. Although you can serialize all of this in
JSON if you wish, it means that you have started with LD concepts, not
the usual JSON application. Underneath it all you still have to have
something that expresses valid triples, n'est-ce pas?
kc
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 09:38:37 -0500
From: Richard Urban <[log in to unmask]>
Subject: Re: DCAM - where we stand
Hi everyone,
Sorry I won't be able to join the call today, so I thought I'd send a few comments to the list.
Tom, I believe "slots" are a new introduction to the DCAM (at least from my reading), but it does seem to introduce the possibility of confusion with how "slots" are used in frame-based languages (http://en.wikipedia.org/wiki/Frame_language). Is the concept of slots we are introducing here equivalent to those kinds of slots, or are we introducing a DCAM-specific concept? If the latter, it may help to spell out what the features of these slots are (beyond properties/values).
> I'm not sure whether the mixing of
> references to syntax ("slots", "Value URI") and semantics ("Thing Described")
> in this draft is a bug -- or a feature.
So "slots" are a syntactical concept? If we are basing DCAM on the RDF model, it seems that "semantics" also incorporates the abstract construction of grammatical features like "statements." I would therefore expect "slots" to be at that abstract level.
Following that thread, are DCAM Statements still just property/value slots, or are they now more like a triple, with a (for lack of a better term) slot that holds a URI that refers to the Thing Described? An important part of preserving the intuitive sense of colloquial records is that such URIs are optional. I think this is equivalent to RDF's concept of blank nodes, but I don't think that connection has been explicitly drawn in DCAM. (Kai?)
If DCAM is standing slightly apart from RDF to accommodate colloquial XML records [1], does our sense of statement/described-resource URIs align with this concept? The Linked Data community is discouraging blank nodes, so we could imagine a DCAM that also requires URIs for all Things. (I suspect that this is not what we want to do, but I'm putting it out there for discussion; see "In order to provide compatibility with Semantic Web and Linked Data applications, however, DCAM is fully aligned with the Model and Abstract Syntax of RDF".) It also seems that Linked Data's requirement to include Thing URIs is the kind of constraint that we might be modeling at the level of DCAM, and it might afford some useful real-world examples of how it's done. (How is this similar to, or different from, the kinds of constraints we need to express at the DSP level?)
Lastly, since we are also trying to work backwards from XML, we frequently use the term "records" in discussing DCAM, i.e.
> The Dublin Core Abstract Model (DCAM) provides a language for representing
> the structure of specific Metadata Records -- put more abstractly, to
> specify a Description Set Profile -- in a form that is independent of
> particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
> RelaxNG, relational databases, Schematron, or JSON.
I don't think DCAM is after Description Set Profiles directly; rather, it models our intuitive sense of "records" as an abstract "Description Set" (which then enables us to specify a DSP). I think it does pretty well at this, but I'm curious if there are objections to "Description Sets" that should go into our gap analysis. Jon et al., are there things about the kinds of records you work with that don't fit the current DCAM model?
Richard J. Urban, Visiting Professor
School of Library and Information Studies
College of Communication and Information
Florida State University
[log in to unmask]
@musebrarian
[1] Sperberg-McQueen, C.M. and Miller, E. On mapping from colloquial XML to RDF using XSLT. Proceedings of Extreme Markup Languages 2004. http://conferences.idealliance.org/extreme/html/2004/Sperberg-McQueen01/EML2004Sperberg-McQueen01.html
------------------------------
Date: Wed, 15 Feb 2012 09:50:03 -0500
From: Richard Urban <[log in to unmask]>
Subject: Re: Just some food for thought...
Corey/Karen,
Are there any good summaries of the conversations in Seattle (relevant to this discussion) for those of us who didn't make it to #c4lib?
Thanks,
Richard
On Feb 5, 2012, at 3:25 PM, Karen Coyle wrote:
> On 2/2/12 10:14 AM, Corey A Harper wrote:
>
>>
>> I'm open to other suggestions about where we can reach out to for some
>> additional perspective.
>
> It's not only a matter of "where"; it's a matter of "how." We've all been in on the lengthy conversations about terminology (and some of us went through that again at length at a meeting in Seattle last week). You can't expect much when you invite Russian speakers to a discussion taking place only in Latin. The DCAM terminology is a barrier. You can claim that
> 1) the terminology is necessary
> 2) people need to make the effort to learn it
>
> but that approach may not lead to success, as I believe is the case with the current version of DCAM. Reaching out should mean at least meeting people half way and doing all that is possible to bring them along. "Sink or swim" isn't an invitation.
>
> I actually believe that the utility of DCAM must be and can be expressed in terms of things people know and need to accomplish in their own environments. Examples and use cases will be a big help. That may even be a good place to start on this "round 2" effort: looking at what DCAM gives us as practitioners could reveal what else is needed, if anything, from such a model.
>
> kc
>
>>
>> On Thu, Feb 2, 2012 at 8:17 AM, Bruce D'Arcus<[log in to unmask]> wrote:
>>> On Thu, Feb 2, 2012 at 10:14 AM, Jon Phipps<[log in to unmask]> wrote:
>>>> This post represents an interesting perspective from the scientific data
>>>> community on some of the challenges to implementing semantic web solutions
>>>> and integrating them into existing system architectures and programming
>>>> models. This certainly looks to me like a place where the DCAP/DCAM
>>>> architecture coupled with some concrete implementation examples could be of
>>>> benefit...
>>>
>>> FWIW, I think the issue is much less about models than it is about the
>>> other stuff.
>>>
>>> Bruce
>>
>>
>>
>
> --
> Karen Coyle
> [log in to unmask] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 10:55:01 -0500
From: Thomas Baker <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
On Wed, Feb 15, 2012 at 08:49:32AM -0500, Jon Phipps wrote:
> I've been doing some wandering around in JSON land for the last few days
> and, as part of a continuing observation that RDF is an implementation
> detail rather than a core requirement, I'd like to point to this post from
> James Snell
> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
> And the JSON Scema spec: http://json-schema.org/
It looks to me like he considers RDF to be a "format" and, as such,
comparable to JSON. Commenting on [1], he writes:
Reading on a little further, the document goes on to expand on that third
point, "In order to enable a wide range of different applications to
process Web content, it is important to agree on standardized content
formats. The agreement on HTML as a dominant document format was an
important factor that made the Web scale. The third Linked Data principle
therefore advocates use of a single data model for publishing structured
data on the Web – the Resource Description Framework (RDF), a simple
graph-based data model that has been designed for use in the context of the
Web [70]. The RDF data model is explained in more detail later in this
chapter."
I can absolutely agree with the first part -- that standardized content
formats are critical. But the "single data model" bit makes me twitch. We
don't need a single data model.. what we need are common conventions for
pulling out the bits of information we need regardless of the specific
format used.
...i.e., in my reading, he is equating "data model" with a "specific format".
As I proposed yesterday, I think it is important to distinguish between RDF
"the model and abstract syntax" and RDF/XML "the concrete serialization syntax,
or format" -- not to mention other concrete RDF syntaxes such as N-Triples and
Turtle -- in DCAM's general message:
The Dublin Core Abstract Model (DCAM) provides a language for representing
the structure of specific Metadata Records -- put more abstractly, to
specify a Description Set Profile -- in a form that is independent of
particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
RelaxNG, relational databases, Schematron, or JSON.
In order to provide compatibility with Semantic Web and Linked Data
applications, however, DCAM is fully aligned with the Model and Abstract
Syntax of RDF. (Note that the RDF abstract model is the basis for -- thus
distinct from -- concrete RDF encoding technologies such as RDF/XML,
N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for
understanding DCAM on an informal level.
It would help if we could agree on a way to characterize this distinction
(e.g., "Concrete Encoding Technologies" versus "Model and Abstract Syntax").
Unless I'm missing the point of his argument, I do not think James Snell is
proposing JSON Activity Streams as a generic abstract syntax -- something which
would compete with RDF as a "grammatical" basis for interoperability in Linked
Data. He emphasizes his point that "If you're familiar with Activity Streams
and the linking extensions, then you'll know exactly what to do with this."
That seems consistent with what we want to do with DCAM -- with the added
distinction that if a JSON format is aligned with DCAM, and DCAM is aligned
with RDF, then one would in principle be able to express the contents of a JSON
format using an RDF concrete syntax. Indeed, James's formulation that "what we
need are common conventions for pulling out the bits of information we need
regardless of the specific format used" could almost be used verbatim in a
description of the DCAM we are discussing.
Jon writes:
> as part of a continuing observation that RDF is an implementation
> detail rather than a core requirement...
I am coming around to the idea that DCAM (or at any rate, "DCAM 2") might be
presented informally without emphasizing RDF, and that some people might find
such a DCAM useful as a very high-level way to conceptualize metadata (i.e.,
Statements, composed of Slots for information and grouped into Descriptions and
Description Sets, following common design patterns, etc...) I still do not see
the value of specifying a DCAM that is anything less than perfectly aligned
with the RDF Model and Abstract Syntax. That people may take inspiration from
such an RDF-grounded model, ignoring the RDF basis, is not something we should
worry about. But RDF, such as it is, is the only common _grammatical_ basis
for data that we currently have, and not to ground DCAM in RDF would make it
useless for the purposes of RDF-based interoperability.
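The distinction between RDF "the model and abstract syntax" and its concrete serializations can be made tangible with a small sketch (the URIs and strings below are illustrative only): the same abstract triple written out in two formats, with the triple itself as the invariant.

```python
# One abstract statement (subject, predicate, object), rendered in two
# concrete RDF syntaxes. The triple is the invariant; only the
# serialization differs.
triple = (
    "http://example.org/book/1",
    "http://purl.org/dc/terms/title",
    "Moby-Dick",
)
s, p, o = triple

# N-Triples: one fully spelled-out triple per line.
n_triples = f'<{s}> <{p}> "{o}" .'

# Turtle: the same triple, with a prefix abbreviating the predicate.
turtle = (
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
    f'<{s}> dcterms:title "{o}" .'
)

print(n_triples)
print(turtle)
```

Any format that preserves the (subject, predicate, object) structure, JSON included, can in principle round-trip through either serialization, which is the sense in which DCAM can be "aligned with" the abstract syntax without privileging one format.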
Tom
[1] http://linkeddatabook.com/editions/1.0/
--
Tom Baker <[log in to unmask]>
------------------------------
Date: Wed, 15 Feb 2012 10:58:53 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Re: "Underneath it all you still have to have something that expresses
valid triples, n'est-ce pas?"
Actually, my point here is that there are many data serializations, models,
and use cases for creating, validating, and distributing metadata, and many
of them don't include a notion of triples (e.g., NoSQL), although many of
them do include a notion of domain-specific validity and some form of
distribution. RDF is extremely useful for distributing metadata in an Open
World context, but it's hardly the only data model and hardly the only
method of distributing useful metadata.
We need to provide, or at least try to provide, a specification that makes
it possible for an organization to describe how they expect the 'things'
they know about to be described: which properties are valid or not, what
constitutes valid data, and what each property means. In the old days,
this model used to be called a 'data dictionary', and it's an incredibly
useful concept in a world of distributed heterogeneous data. Providing a
way for someone to create a single 'data dictionary' that can be used
(preferably by a machine) to create validations for domain-specific data
and that can be used by anyone (preferably a machine) in the organization,
or alternatively in the world, to understand the meaning of that data
across departmental, organizational, or national boundaries would be
incredibly and fundamentally useful.
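A minimal sketch of such a machine-usable data dictionary (the property URIs are real DC terms, but the rule format and the validate function are invented for illustration):

```python
# A toy "data dictionary" in the sense described above: an organization
# declares, per property URI, what the property means and what counts as
# valid. The rule vocabulary here is hypothetical.
DATA_DICTIONARY = {
    "http://purl.org/dc/terms/title": {
        "definition": "A name given to the resource.",
        "mandatory": True,
        "repeatable": False,
    },
    "http://purl.org/dc/terms/subject": {
        "definition": "A topic of the resource.",
        "mandatory": False,
        "repeatable": True,
    },
}

def validate(record: dict) -> list:
    """Return a list of human-readable violations; empty means valid."""
    errors = []
    for prop, rules in DATA_DICTIONARY.items():
        values = record.get(prop, [])
        if rules["mandatory"] and not values:
            errors.append(f"missing mandatory property {prop}")
        if not rules["repeatable"] and len(values) > 1:
            errors.append(f"property {prop} is not repeatable")
    for prop in record:
        if prop not in DATA_DICTIONARY:
            errors.append(f"unknown property {prop}")
    return errors

good = {"http://purl.org/dc/terms/title": ["Moby-Dick"]}
bad = {"http://purl.org/dc/terms/subject": ["whales", "obsession"]}
print(validate(good))  # []
print(validate(bad))   # title is missing; subject is repeatable, so allowed
```

The same dictionary serves both uses named above: a machine can run it as a validator, and a human (or machine) can read the definitions to understand what the data means across organizational boundaries.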
If we say that RDF is the ONLY useful way to do this, then we might as well
go back to "DCAM is just RDF".
Jon
I check email just a couple of times daily; to reach me sooner, click here:
http://awayfind.com/jonphipps
------------------------------
Date: Wed, 15 Feb 2012 11:07:49 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: Just some food for thought...
On Sun, Feb 5, 2012 at 3:25 PM, Karen Coyle <[log in to unmask]> wrote:
> I actually believe that the utility of DCAM must be and can be expressed
> in terms of things people know and need to accomplish in their own
> environments. Examples and use cases will be a big help. That may even been
> a good place to start on this "round 2" effort: looking at what DCAM gives
> us as practitioners could reveal what else is needed, if anything, from
> such a model.
+1 :-)
------------------------------
Date: Wed, 15 Feb 2012 08:26:12 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Hmm. In terms of analogies, I would equate DCAP with a data dictionary,
not DCAM. To me a data dictionary lists the actual metadata elements you
will use, not an abstract definition of the possible structures. DCAM
seems to be closer to the idea of "design patterns."
I don't see how something can be linked data if it doesn't have certain
characteristics:
- HTTP URIs
- subjects, predicates, and objects (whether or not they are serialized
as triples -- RDF/XML and Turtle are examples of "not")
- subjects and predicates constrained to URIs; objects constrained
differently (which DCAM would address)
It's possible that the JSON examples in that blog post met these
criteria (I didn't perceive URIs for the predicates, but maybe I don't
read JSON well).
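Karen's criteria can be given a rough executable reading. A minimal Python sketch, assuming statements are modeled as (subject, predicate, object) tuples; the URIs are invented and this is not actual DCAM machinery:

```python
# Check Karen's linked-data criteria against a set of
# (subject, predicate, object) statements.

def is_http_uri(term):
    # Criterion 1: the term is an HTTP URI.
    return isinstance(term, str) and term.startswith(("http://", "https://"))

def meets_ld_criteria(statements):
    """Subjects and predicates must be HTTP URIs; objects may be URIs
    or literals (the part a model like DCAM would constrain further)."""
    return all(is_http_uri(s) and is_http_uri(p) for (s, p, o) in statements)

linked = [("http://example.org/book/1",
           "http://purl.org/dc/terms/title",
           "Moby Dick")]
plain_json_style = [("book1", "title", "Moby Dick")]  # bare keys, no URIs
```

The second example is the shape of a typical JSON record: same information, but without URI predicates there is no shared way to know what "title" means.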
kc
On 2/15/12 7:58 AM, Jon Phipps wrote:
> Re: "Underneath it all you still have to have something that expresses
> valid triples, n'est-ce pas?"
>
> Actually, my point here is that there are many data serializations, models
> and use cases for creating, validating, and distributing metadata and many
> of them don't include a notion of triples (e.g., NoSQL), although many of
> them do include a notion of domain-specific validity and some form of
> distribution. RDF is extremely useful for distributing metadata in an Open
> World context, but it's hardly the only data model and hardly the only
> method of distributing useful metadata.
>
> We need to provide, or at least try to provide, a specification that makes
> it possible for an organization to describe how they expect the 'things'
> they know about to be described: which properties are valid or not, what
> constitutes valid data, and what does each property mean. In the old days,
> this model used to be called a 'data dictionary' and it's an incredibly
> useful concept in a world of distributed heterogeneous data. Providing a
> way for someone to create a single 'data dictionary' that can be used
> (preferably by a machine) to create validations for domain-specific data
> and that can be used by anyone (preferably a machine) in the organization,
> or alternatively in the world, to understand the meaning of that data
> across departmental, organizational, or national boundaries would be
> incredibly and fundamentally useful.
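The machine-usable "data dictionary" Jon describes can be sketched in a few lines. This is a toy illustration under invented assumptions (the property names, obligation flags, and datatype rules are all made up), not a proposal for DCAM syntax:

```python
# A toy machine-readable 'data dictionary': it names the valid
# properties, says what each means, and defines what counts as valid
# data, so a machine can validate records against it.

DATA_DICTIONARY = {
    "title":  {"meaning": "name given to the resource", "required": True,  "type": str},
    "issued": {"meaning": "year of publication",        "required": False, "type": int},
}

def validate(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    # Obligation: every required property must be present.
    for name, rule in dictionary.items():
        if rule["required"] and name not in record:
            problems.append("missing required property: " + name)
    # Validity: every supplied property must be known and well-typed.
    for name, value in record.items():
        if name not in dictionary:
            problems.append("unknown property: " + name)
        elif not isinstance(value, dictionary[name]["type"]):
            problems.append("bad value for " + name)
    return problems
```

Because the dictionary itself is data, the same artifact serves both purposes Jon names: a machine can generate validations from it, and a human (or machine) can read the "meaning" entries to understand records that cross organizational boundaries.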
>
> If we say that RDF is the ONLY useful way to do this, then we might as well
> go back to "DCAM is just RDF".
>
> Jon
>
> I check email just a couple of times daily; to reach me sooner, click here:
> http://awayfind.com/jonphipps
>
>
> On Wed, Feb 15, 2012 at 9:47 AM, Karen Coyle<[log in to unmask]> wrote:
>
>> What *does* seem to be core in this blog post is the use of http URIs for
>> values. I'd add to that: properties defined with http URIs, so you know
>> what you are describing. Although you can serialize all of this in JSON if
>> you wish, it means that you have started with LD concepts, not the usual
>> JSON application. Underneath it all you still have to have something that
>> expresses valid triples, n'est-ce pas?
>>
>> kc
>>
>>
>> On 2/15/12 5:49 AM, Jon Phipps wrote:
>>
>>> I've been doing some wandering around in JSON land for the last few days
>>> and, as part of a continuing observation that RDF is an implementation
>>> detail rather than a core requirement, I'd like to point to this post from
>>> James Snell
>>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
>>> And the JSON Schema spec: http://json-schema.org/
>>>
>>> Jon,
>>> who may someday get his act together and pay attention to these meetings
>>> more than a couple of hours before the meeting.
>>>
>>> On Tuesday, February 14, 2012, Thomas Baker<[log in to unmask]> wrote:
>>> [quoted text trimmed -- Tom's message appears in full earlier in this digest]
>
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 18:30:09 +0100
From: Kai Eckert <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Hi all,
as a strong promoter of the RDF basis for DCAM, I would like to
emphasize, too, that RDF is only the formal model and should not be
seen as a concrete syntax. JSON is a syntax; it has no semantics. I
like it very much, and I like simple, pragmatic implementations, but
that's not what we need in our current context.
In the W3C provenance WG, we have just had the experience that it is
much easier to discuss a model defined in a formal language than one
defined in plain English, which led to endless discussions before. We
now focus on the formal PROV ontology, written in OWL, to reach a
consensus about the model. Additionally, we of course create documents
in plain English (at least) that hopefully explain and demonstrate what
can be done with the model. But those drafts cannot be used to define
the model in the first place.
I think the only formal language that we all speak is {RDF, RDFS, OWL};
that's why I want to focus on defining everything that we are talking
about in DCAM in this language. In that respect, it is more a side
effect that the result would actually be RDF. If we face limitations in
this formal language that we cannot accept, then of course we should
not restrict ourselves to RDF. But only then.
Cheers,
Kai
Am 15.02.2012 16:55, schrieb Thomas Baker:
> On Wed, Feb 15, 2012 at 08:49:32AM -0500, Jon Phipps wrote:
>> I've been doing some wandering around in JSON land for the last few days
>> and, as part of a continuing observation that RDF is an implementation
>> detail rather than a core requirement, I'd like to point to this post from
>> James Snell
>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
>> And the JSON Schema spec: http://json-schema.org/
>
> It looks to me like he considers RDF to be a "format" and, as such,
> comparable to JSON. Commenting on [1], he writes:
>
> Reading on a little further, the document goes on to expand on that third
> point, "In order to enable a wide range of different applications to
> process Web content, it is important to agree on standardized content
> formats. The agreement on HTML as a dominant document format was an
> important factor that made the Web scale. The third Linked Data principle
> therefore advocates use of a single data model for publishing structured
> data on the Web – the Resource Description Framework (RDF), a simple
> graph-based data model that has been designed for use in the context of the
> Web [70]. The RDF data model is explained in more detail later in this
> chapter."
>
> I can absolutely agree with the first part -- that standardized content
> formats are critical. But the "single data model" bit makes me twitch. We
> don't need a single data model... what we need are common conventions for
> pulling out the bits of information we need regardless of the specific
> format used.
>
> ...i.e., in my reading, he is equating "data model" with a "specific format".
> As I proposed yesterday, I think it is important to distinguish between RDF
> "the model and abstract syntax" and RDF/XML "the concrete serialization syntax,
> or format" -- not to mention other concrete RDF syntaxes such as N-Triples and
> Turtle -- in DCAM's general message:
>
> The Dublin Core Abstract Model (DCAM) provides a language for representing
> the structure of specific Metadata Records -- put more abstractly, to
> specify a Description Set Profile -- in a form that is independent of
> particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
> RelaxNG, relational databases, Schematron, or JSON.
>
> In order to provide compatibility with Semantic Web and Linked Data
> applications, however, DCAM is fully aligned with the Model and Abstract
> Syntax of RDF. (Note that the RDF abstract model is the basis for -- thus
> distinct from -- concrete RDF encoding technologies such as RDF/XML,
> N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for
> understanding DCAM on an informal level.
>
> It would help if we could agree on a way to characterize this distinction
> (e.g., "Concrete Encoding Technologies" versus "Model and Abstract Syntax").
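Tom's distinction -- one abstract model, many concrete encodings -- can be sketched in Python. A rough illustration only: the triples are modeled as plain tuples, the URIs are invented, and the serializers below stand in for real format libraries:

```python
import json

# One abstract description: a set of (subject, predicate, object)
# triples. The triples are the 'model and abstract syntax'; the
# N-Triples-style text and the JSON below are merely concrete formats.

triples = [
    ("http://example.org/book/1",
     "http://purl.org/dc/terms/title",
     "Moby Dick"),
]

def to_ntriples(triples):
    # Literal objects only, for brevity; real N-Triples also allows
    # URI and blank-node objects.
    return "\n".join('<%s> <%s> "%s" .' % t for t in triples)

def to_json(triples):
    # Group predicate/object pairs under each subject.
    out = {}
    for s, p, o in triples:
        out.setdefault(s, {})[p] = o
    return json.dumps(out)
```

Both outputs carry the same information; a model in the DCAM sense is what licenses reading either one back to the same abstract triples, independent of the "Concrete Encoding Technology" chosen.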
>
> Unless I'm missing the point of his argument, I do not think James Snell is
> proposing JSON Activity Streams as a generic abstract syntax -- something which
> would compete with RDF as a "grammatical" basis for interoperability in Linked
> Data. He emphasizes his point that "If you're familiar with Activity Streams
> and the linking extensions, then you'll know exactly what to do with this."
> That seems consistent with what we want to do with DCAM -- with the added
> distinction that if a JSON format is aligned with DCAM, and DCAM is aligned
> with RDF, then one would in principle be able to express the contents of a JSON
> format using an RDF concrete syntax. Indeed, James's formulation that "what we
> need are common conventions for pulling out the bits of information we need
> regardless of the specific format used" could almost be used verbatim in a
> description of the DCAM we are discussing.
>
> Jon writes:
>> as part of a continuing observation that RDF is an implementation
>> detail rather than a core requirement...
>
> I am coming around to the idea that DCAM (or at any rate, "DCAM 2") might be
> presented informally without emphasizing RDF, and that some people might find
> such a DCAM useful as a very high-level way to conceptualize metadata (i.e.,
> Statements, composed of Slots for information and grouped into Descriptions and
> Description Sets, following common design patterns, etc...) I still do not see
> the value of specifying a DCAM that is anything less than perfectly aligned
> with the RDF Model and Abstract Syntax. That people may take inspiration from
> such an RDF-grounded model, ignoring the RDF basis, is not something we should
> worry about. But RDF, such as it is, is the only common _grammatical_ basis
> for data that we currently have, and not to ground DCAM in RDF would make it
> useless for the purposes of RDF-based interoperability.
>
> Tom
>
> [1] http://linkeddatabook.com/editions/1.0/
>
>
--
Kai Eckert
Universitätsbibliothek Mannheim
Deputy Head, Digital Library Services Department
Schloss Schneckenhof West / 68131 Mannheim
Tel. 0621/181-2946 Fax 0621/181-2918
------------------------------
Date: Wed, 15 Feb 2012 15:30:35 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Are we only talking about Linked Data or are we talking about information
modeling? DCAP is a documentation model for describing an information
ecosystem and DCAM is its formal abstract 'domain' model, or should be.
Whether or not that model results in Linked Data is beside the point, isn't
it?
Jon,
who just found this and had to paste it here:
Endless invention, endless experiment,
Brings knowledge of motion, but not of stillness;
Knowledge of speech, but not of silence;
Knowledge of words, and ignorance of the Word...
Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?
-- T. S. Eliot, Choruses from 'The Rock'
On Wed, Feb 15, 2012 at 11:26 AM, Karen Coyle <[log in to unmask]> wrote:
> [quoted text trimmed -- Karen's message of 11:26 appears in full earlier in this digest]
------------------------------
Date: Wed, 15 Feb 2012 14:03:20 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
On 2/15/12 12:30 PM, Jon Phipps wrote:
> Are we only talking about Linked Data or are we talking about information
> modeling?
I doubt if it makes sense to take on the entirety of information
modeling within the DCAM. My impression was that the goal was
information modeling for the Semantic Web/linked data environment. If
it's broader than that, then we move from the abstract to something so
far out that it may never be finished. I'd say we should stick with an
abstraction of something useful, not a pure abstraction.
> DCAP is a documentation model for describing an information
> ecosystem and DCAM is its formal abstract 'domain' model, or should be.
DCAP to me is narrower than that. It describes a coherent set of
statements for a particular metadata activity. It verges on being a
record format, although it is a "record format" in a data environment
that is more flexible than, say, a relational database with a set data
format. I'd equate the Singapore framework's domain model with an
"information ecosystem." That to me is the general model before you
start adding constraints, and perhaps even before you define your set of
properties.
And, as I said before, DCAM to me defines the design patterns that are
available to you. It is plausible to me that DCAM's patterns have some
universality, but I wouldn't want to embark on a task of making sure
that DCAM covers every single metadata possibility, known today or to be
discovered in the future. That would probably prevent DCAM from having
such specifics as "property URIs" or "literal values." It's going to be
hard enough to come up with a model that functions well within the
semantic web universe.
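One of the "specifics" Karen names -- the split between property/value URIs and literal values -- is easy to make concrete. A minimal sketch, with invented URIs and values, of the kind of design-pattern distinction DCAM draws for the object of a statement:

```python
# Classify the object of a statement as a value URI or a literal --
# the DCAM-flavored distinction Karen refers to. Values illustrative.

def classify_object(obj):
    if isinstance(obj, str) and obj.startswith(("http://", "https://")):
        return "value URI"   # points at a described resource
    return "literal"         # a string (or number) used directly

creator_by_uri = classify_object("http://example.org/person/melville")
creator_by_name = classify_object("Melville, Herman")
```

A model that lacks even this distinction would indeed be too abstract to be useful, which is Karen's point about grounding DCAM in the semantic web environment.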
kc
> [remainder of Jon's message and earlier quoted text trimmed -- see above]
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 14:06:13 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Oh, I meant to follow Jon's quoted poem with some words from Frank Zappa:
Information is not knowledge.
Knowledge is not wisdom.
Wisdom is not truth.
Truth is not beauty.
Beauty is not love.
Love is not music.
Music is the best.
On 2/15/12 12:30 PM, Jon Phipps wrote:
> Are we only talking about Linked Data or are we talking about information
> modeling? DCAP is a documentation model for describing an information
> ecosystem and DCAM is its formal abstract 'domain' model, or should be.
> Whether or not that model results in Linked Data is beside the point, isn't
> it?
>
> Jon,
> who just found this and had to paste it here:
>
> Endless invention, endless experiment,
> Brings knowledge of motion, but not of stillness;
> Knowledge of speech, but not of silence;
> Knowledge of words, and ignorance of the Word...
>
> Where is the Life we have lost in living?
> Where is the wisdom we have lost in knowledge?
> Where is the knowledge we have lost in information?
> -- T. S. Eliot, Choruses from 'The Rock'
>
>
> On Wed, Feb 15, 2012 at 11:26 AM, Karen Coyle<[log in to unmask]> wrote:
>
>> Hmm. In terms of analogies, I would equate DCAP with a data dictionary,
>> not DCAM. To me a data dictionary is the actual metadata elements you will
>> use, not an abstract definition of the possible structures. DCAM seems to
>> be closer to the idea of "design patterns."
>>
>> I don't see how something can be linked data if it doesn't have certain
>> characteristics:
>> - http uris
>> - subjects, predicates, objects (whether serialized as triples or not, and
>> RDF/XML and turtle are examples of not)
>> - subjects and predicates constrained as URIs; objects constrained
>> differently (which DCAM would address)
>>
>> It's possible that the JSON examples in that blog post met these criteria
>> (I didn't perceive URIs for the predicates, but maybe I don't read JSON
>> well).
>>
>> kc
>>
>>
>> On 2/15/12 7:58 AM, Jon Phipps wrote:
>>
>>> Re: "Underneath it all you still have to have something that expresses
>>> valid triples, n'est pas?"
>>>
>>> Actually, my point here is that there are many data serializations, models
>>> and use cases for creating, validating, and distributing metadata and many
>>> of them don't include a notion of triples, (e.g. nosql) although many of
>>> them do include a notion of domain-specific validity and some form of
>>> distribution. RDF is extremely useful for distributing metadata in an Open
>>> World context, but it's hardly the only data model and hardly the only
>>> method of distributing useful metadata.
>>>
>>> We need to provide, or at least try to provide, a specification that makes
>>> it possible for an organization to describe how they expect the 'things'
>>> they know about to be described: which properties are valid or not, what
>>> constitutes valid data, and what does each property mean. In the old days,
>>> this model used to be called a 'data dictionary' and it's an incredibly
>>> useful concept in a world of distributed heterogeneous data. Providing a
>>> way for someone to create a single 'data dictionary' that can be used
>>> (preferably by a machine) to create validations for domain-specific data
>>> and that can be used by anyone (preferably a machine) in the organization,
>>> or alternatively in the world, to understand the meaning of that data
>>> across departmental, organizational, or national boundaries would be
>>> incredibly and fundamentally useful.
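A minimal sketch of such a 'data dictionary', assuming a simple rule set of required/repeatable flags plus a human-readable meaning (all property names and rules here are invented for illustration):

```python
# A toy 'data dictionary': each property records whether it is required,
# whether it may repeat, and what it means. Invented for illustration.
DATA_DICTIONARY = {
    "title":   {"required": True,  "repeatable": False,
                "meaning": "Name given to the resource"},
    "creator": {"required": False, "repeatable": True,
                "meaning": "Agent responsible for the resource"},
}

def validate(record):
    """Return a list of violations of the dictionary's rules.

    A record is a dict mapping property names to lists of values."""
    errors = []
    for prop, rule in DATA_DICTIONARY.items():
        values = record.get(prop, [])
        if rule["required"] and not values:
            errors.append(f"missing required property: {prop}")
        if not rule["repeatable"] and len(values) > 1:
            errors.append(f"property not repeatable: {prop}")
    for prop in record:
        if prop not in DATA_DICTIONARY:
            errors.append(f"unknown property: {prop}")
    return errors

print(validate({"title": ["Moby Dick"],
                "creator": ["Melville", "Hawthorne"]}))   # → []
print(validate({"creator": ["Melville"]}))
# → ['missing required property: title']
```

The same dictionary can drive machine validation locally and, published with its "meaning" fields, document the data for anyone else who receives it.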
>>>
>>> If we say that RDF is the ONLY useful way to do this, then we might as
>>> well
>>> go back to "DCAM is just RDF".
>>>
>>> Jon
>>>
>>> I check email just a couple of times daily; to reach me sooner, click
>>> here:
>>> http://awayfind.com/jonphipps
>>>
>>>
>>> On Wed, Feb 15, 2012 at 9:47 AM, Karen Coyle<[log in to unmask]> wrote:
>>>
>>> What *does* seem to be core in this blog post is the use of http URIs for
>>>> values. I'd add to that: properties defined with http URIs, so you know
>>>> what you are describing. Although you can serialize all of this in JSON
>>>> if
>>>> you wish, it means that you have started with LD concepts, not the usual
>>>> JSON application. Underneath it all you still have to have something that
>>>> expresses valid triples, n'est-ce pas?
>>>>
>>>> kc
>>>>
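The idea of "serializing LD concepts in JSON" can be sketched like this, loosely following the JSON-LD `@id`/`@value` convention (the record itself is invented; the two property URIs are real DCMI terms):

```python
import json

# A record serialized as JSON but built on LD concepts: property names
# are http URIs, and URI-valued objects are kept distinct from literals.
record_json = """{
  "@id": "http://example.org/book/1",
  "http://purl.org/dc/terms/title": {"@value": "Moby Dick"},
  "http://purl.org/dc/terms/creator": {"@id": "http://example.org/person/melville"}
}"""

def to_triples(doc):
    """Recover (subject, predicate, object) triples from the JSON form."""
    data = json.loads(doc)
    subject = data.pop("@id")
    triples = []
    for predicate, obj in data.items():
        value = obj.get("@id") or obj.get("@value")
        triples.append((subject, predicate, value))
    return triples

for t in to_triples(record_json):
    print(t)
```

Because the keys are already URIs, the JSON is a surface syntax: the triples fall straight out of it, which is Karen's point about starting from LD concepts rather than the usual JSON application.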
>>>>
>>>> On 2/15/12 5:49 AM, Jon Phipps wrote:
>>>>
>>>> I've been doing some wandering around in JSON land for the last few days
>>>>> and, as part of a continuing observation that RDF is an implementation
>>>>> detail rather than a core requirement, I'd like to point to this post
>>>>> from
>>>>> James Snell
>>>>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
>>>>>
>>>>> And the JSON Schema spec: http://json-schema.org/
>>>>>
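To make the JSON Schema idea concrete without pulling in a full implementation, here is a toy validator for a tiny subset of the spec (just `required` and `type: string`); real use would rely on a complete implementation of the json-schema.org drafts:

```python
# A tiny subset of JSON Schema validation, for illustration only:
# handles "required" and "type": "string" and nothing else.
schema = {
    "type": "object",
    "required": ["title"],
    "properties": {
        "title":   {"type": "string"},
        "creator": {"type": "string"},
    },
}

def check(instance, schema):
    """Return a list of violations of the (toy) schema."""
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in instance and rule.get("type") == "string" \
                and not isinstance(instance[key], str):
            errors.append(f"not a string: {key}")
    return errors

print(check({"title": "Moby Dick"}, schema))   # → []
print(check({"creator": 42}, schema))
# → ['missing: title', 'not a string: creator']
```

The interesting parallel with DCAM is that the schema is itself a JSON document: the validation rules travel in the same format as the data they constrain.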
>>>>> Jon,
>>>>> who may someday get his act together and pay attention to these meetings
>>>>> more than a couple of hours before the meeting.
>>>>>
>>>>> On Tuesday, February 14, 2012, Thomas Baker<[log in to unmask]> wrote:
>>>>>
>>>>> On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:
>>>>>>
>>>>>> -- that DCAM should be developed using a test-driven approach, with
>>>>>>> effective examples and test cases that can be expressed in various
>>>>>>> concrete syntaxes.
>>>>>>>
>>>>>>>
>>>>>> Jon suggested that we take Gordon's requirements for metadata record
>>>>>> constructs [1] as a starting point. As I understand them, these are:
>>>>>>
>>>>>> -- the ability to encode multicomponent things (which in the cataloging
>>>>>>    world happen to be called "statements", as in "publication statement"
>>>>>>    and "classification statement") either:
>>>>>>
>>>>>>    -- as unstructured strings, or
>>>>>>    -- as strings structured according to a named Syntax Encoding
>>>>>>       Scheme, or
>>>>>>    -- as Named Graphs with individual component triples
>>>>>>
>>>>>> -- the ability to express the repeatability of components in such
>>>>>>    "statements"
>>>>>>
>>>>>> -- the ability to designate properties as "mandatory", or "mandatory if
>>>>>>    applicable", and the like
>>>>>>
>>>>>> -- the ability to constrain the cardinality of "subsets of properties"
>>>>>>    within a particular context, such as the FRBR model
>>>>>>
>>>>>> -- the ability to express mappings between properties in different
>>>>>>    namespaces.
>>>>>
>>>>>> It has also been suggested that we find examples of real metadata
>>>>>> instance records from different communities and contexts -- e.g.,
>>>>>> libraries, government, industry, and biomed -- for both testing and
>>>>>> illustrating DCAM constructs.
>>>>>> Tom
>>>>>>
>>>>>> [1]
>>>>>>
>>>>>> https://www.jiscmail.ac.uk/****cgi-bin/webadmin?A2=ind1202&L=****<https://www.jiscmail.ac.uk/**cgi-bin/webadmin?A2=ind1202&L=**>
>>>>> dc-architecture&P=6405<https:/**/www.jiscmail.ac.uk/cgi-bin/**
>>>>> webadmin?A2=ind1202&L=dc-**architecture&P=6405<https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1202&L=dc-architecture&P=6405>
>>>>>>
>>>>>
>>>>>
>>>>>> --
>>>>>> Tom Baker<[log in to unmask]>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Karen Coyle
>>>> [log in to unmask] http://kcoyle.net
>>>> ph: 1-510-540-7596
>>>> m: 1-510-435-8234
>>>> skype: kcoylenet
>>>>
>>>>
>>>
>> --
>> Karen Coyle
>> [log in to unmask] http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
>>
>
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
End of DC-ARCHITECTURE Digest - 14 Feb 2012 to 15 Feb 2012 (#2012-23)
*********************************************************************