Please update my email address to be [log in to unmask]
Thanks
-----Original Message-----
From: DCMI Architecture Forum [mailto:[log in to unmask]] On Behalf Of DC-ARCHITECTURE automatic digest system
Sent: Wednesday, February 15, 2012 7:03 PM
To: [log in to unmask]
Subject: DC-ARCHITECTURE Digest - 14 Feb 2012 to 15 Feb 2012 (#2012-23)
There are 14 messages totaling 2954 lines in this issue.
Topics of the day:
1. DCAM - where we stand (2)
2. DCAM telecon - additional links to Gap Analysis and Son of Dublin Core
3. DCAM - collecting requirements and examples (9)
4. Just some food for thought... (2)
----------------------------------------------------------------------
Date: Wed, 15 Feb 2012 10:12:45 +0000
From: "Greenberg, Jane" <[log in to unmask]>
Subject: Re: DCAM - where we stand
Tom, all ...
I really like this initial stab at a general message, and my sense is that the use of the word 'slots' will resonate with folks. At least this is what I think at the moment, and it made the description easy to understand for me. This is what is needed in the user-facing documentation: it can reach beyond those immersed in DCAM, and not scare those who are new to this information.
Two brief comments --
~ In sentence one, I wanted to say "connected" slots, but I'm not sure it's necessary. The indication of a 'defined structure' at the end can stand for this. I'm just thinking that it is the linking (connecting) of slots that is key.
~ Perhaps a given, but I think it would be good to list a few examples beyond book. My bias perhaps, but could we list things like a dataset, an image, and I would like to list person and event? (Do 'Things' like person or event cause a problem? They may be seen as unorthodox in DCAM... I'm not sure, or in conflict w/ FRBR?)
Best wishes, jane
-----Original Message-----
From: DCMI Architecture Forum [mailto:[log in to unmask]] On Behalf Of Thomas Baker
Sent: Wednesday, February 15, 2012 12:05 AM
To: [log in to unmask]
Subject: Re: DCAM - where we stand
On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:
> And there was some support for the idea that a clear "punchline" for
> DCAM, along the lines of the one-liner summarizing SKOS in plain
> language, would be helpful.
A bit longer than the one-liner for SKOS, but here's my stab at a "general message" for DCAM:
Seen as data artifacts, Metadata Records consist of slots holding
information items in a defined structure. A Metadata Record may describe a
single Thing of interest (such as a Book) or a cluster of closely related
Things (such as a Book and its Author). More abstractly, a Metadata Record
may be seen as a Description Set encompassing just one Description (i.e.,
about the Book) or multiple Descriptions (about both the Book and the
Author).
A Description consists of one or more Statements about the Thing Described
(e.g., stating the Name and Birthdate of an Author). The Thing Described
by a Description may be identified using a URI. A Statement about the
Thing Described has one slot for an Attribute (Property) and one slot for a
Value. Attribute slots are filled with names of attributes (properties);
in DCAM, attributes are "named" using URIs. Value slots are filled with
Value Strings, URIs, or blank Value Placeholders. A Value String may be
stated as belonging to a named set of strings (known as a Syntax Encoding
Scheme). A Value URI may be stated as belonging to a named set of URIs
(known as a Vocabulary Encoding Scheme). In practice, Statements may be
viewed in the context of Statement Sets. Statement Sets may follow common
patterns.
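As a rough sketch of the structures named above (Description Sets, Descriptions, and Statements with attribute and value slots), one might model them as plain data; all class and field names here are invented for illustration and come from no spec:

```python
# Hypothetical sketch of the DCAM structures described above; class and
# field names are illustrative only, not taken from the DCAM specification.
from dataclasses import dataclass, field
from typing import Optional, Union

@dataclass
class ValueString:
    value: str
    syntax_encoding_scheme: Optional[str] = None  # URI of a named set of strings

@dataclass
class ValueURI:
    uri: str
    vocabulary_encoding_scheme: Optional[str] = None  # URI of a named set of URIs

@dataclass
class ValuePlaceholder:  # a "blank" value slot
    pass

@dataclass
class Statement:
    attribute: str  # attribute slot: the property, "named" with a URI
    value: Union[ValueString, ValueURI, ValuePlaceholder]  # value slot

@dataclass
class Description:
    thing_described: Optional[str] = None  # optional URI for the Thing Described
    statements: list = field(default_factory=list)

@dataclass
class DescriptionSet:  # the abstract view of a Metadata Record
    descriptions: list = field(default_factory=list)

# A record describing a Book and its Author -- two closely related Things.
book = Description(
    thing_described="http://example.org/book/1",
    statements=[Statement("http://purl.org/dc/terms/title",
                          ValueString("Moby-Dick"))],
)
author = Description(
    statements=[Statement("http://xmlns.com/foaf/0.1/name",
                          ValueString("Herman Melville"))],
)
record = DescriptionSet(descriptions=[book, author])
print(len(record.descriptions))  # 2
```

The point of the sketch is only that a "record" decomposes into two Descriptions, each a bag of attribute/value slot pairs, with the Thing URI optional.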
The Dublin Core Abstract Model (DCAM) provides a language for representing
the structure of specific Metadata Records -- put more abstractly, to
specify a Description Set Profile -- in a form that is independent of
particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
RelaxNG, relational databases, Schematron, or JSON.
In order to provide compatibility with Semantic Web and Linked Data
applications, however, DCAM is fully aligned with the Model and Abstract
Syntax of RDF. (Note that the RDF abstract model is the basis for -- thus
distinct from -- concrete RDF encoding technologies such as RDF/XML,
N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for
understanding DCAM on an informal level.
DCAM provides a language for expressing common patterns of Statements --
patterns that may be partially or fully encoded using specific Concrete
Encoding Technologies. Indeed, some readers may find the example patterns
used in designing DCAM more accessible and useful, as models and templates
for implementation, than the formal specification of DCAM itself.
Details aside, this text illustrates the sort of high-level description I think we would need to have -- both as an explanation to our intended audience and as a guide to ourselves in the design phase. I'm not sure whether the mixing of references to syntax ("slots", "Value URI") and semantics ("Thing Described") in this draft is a bug -- or a feature. I also wonder how DCAM can close the gap to expressing the constraints of real application profiles without introducing DC-DSP-like notions such as "templates" and "constraints" [1].
For discussion...
Tom
[1] http://dublincore.org/documents/dc-dsp/
--
Tom Baker <[log in to unmask]>
------------------------------
Date: Wed, 15 Feb 2012 08:34:54 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM telecon - additional links to Gap Analysis and Son of Dublin Core
I was always very impressed by SoDC. It's incomplete, but the notion of a
concrete syntax that can be used to express semantics in one branch and
constraints in another with an exemplar processing facility has always made
enormous sense to me and would seem to be a good use case for a DCAP/DCAM
spec.
Jon
On Tuesday, February 14, 2012, Thomas Baker <[log in to unmask]> wrote:
> On Tue, Feb 14, 2012 at 01:32:58PM -0500, Tom Baker wrote:
>> Date: 2012-02-15 Wednesday, 1100 EST
>> Expected: Tom Baker (chair), Mark Matienzo, Antoine Isaac, Stuart
>> Sutton, Aaron Rubinstein, Jon Phipps, Gordon Dunsire, Kai Eckert
>> Regrets: Corey Harper, Richard Urban
>
> On tomorrow's call I'd like to continue the discussion, even though
> we will have some important absences.
>
> I noticed just after posting that I hadn't added the link to the
> (beginnings of a) gap analysis [1].
>
> At Corey's request, I included a link to Alistair Miles's Son of Dublin
> Core (below).
>
> Tom
>
> [1] http://wiki.dublincore.org/index.php/DCAM_Revision_Gap_Analysis
>
>> -- Alistair Miles's Son of Dublin Core (SoDC)
>> http://aliman.googlecode.com/svn/trunk/sodc/SoDC-0.2/index.html
>> http://aliman.googlecode.com/svn/trunk/sodc/SoDC-0.2/release/SoDC-0_2.zip - everything, zipped
>
> --
> Tom Baker <[log in to unmask]>
>
--
Jon
I check email just a couple of times daily; to reach me sooner, click here:
http://awayfind.com/jonphipps
------------------------------
Date: Wed, 15 Feb 2012 08:49:32 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
I've been doing some wandering around in JSON land for the last few days
and, as part of a continuing observation that RDF is an implementation
detail rather than a core requirement, I'd like to point to this post from
James Snell
http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
And the JSON Schema spec: http://json-schema.org/
Jon,
who may someday get his act together and pay attention to these meetings
more than a couple of hours before the meeting.
On Tuesday, February 14, 2012, Thomas Baker <[log in to unmask]> wrote:
> On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:
>> -- that DCAM should be developed using a test-driven approach, with
>> effective examples and test cases that can be expressed in various
>> concrete syntaxes.
>
> Jon suggested that we take Gordon's requirements for metadata record
> constructs [1] as a starting point. As I understand them, these are:
>
> -- the ability to encode multicomponent things (which in the cataloging
> world happen to be called "statements", as in "publication statement"
> and "classification statement") either:
>
> -- as unstructured strings, or
> -- as strings structured according to a named Syntax Encoding Scheme, or
> -- as Named Graphs with individual component triples
>
> -- the ability to express the repeatability of components in such
> "statements"
>
> -- the ability to designate properties as "mandatory", or "mandatory if
> applicable", and the like
>
> -- the ability to constrain the cardinality of "subsets of properties"
> within a particular context, such as the FRBR model
>
> -- the ability to express mappings between properties in different
> namespaces.
>
> It has also been suggested that we find examples of real metadata instance
> records from different communities and contexts -- e.g., libraries,
> government, industry, and biomed -- for both testing and illustrating DCAM
> constructs.
>
> Tom
>
> [1] https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1202&L=dc-architecture&P=6405
>
> --
> Tom Baker <[log in to unmask]>
>
--
Jon
I check email just a couple of times daily; to reach me sooner, click here:
http://awayfind.com/jonphipps
------------------------------
Date: Wed, 15 Feb 2012 06:47:02 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
What *does* seem to be core in this blog post is the use of http URIs
for values. I'd add to that: properties defined with http URIs, so you
know what you are describing. Although you can serialize all of this in
JSON if you wish, it means that you have started with LD concepts, not
the usual JSON application. Underneath it all you still have to have
something that expresses valid triples, n'est-ce pas?
kc
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 09:38:37 -0500
From: Richard Urban <[log in to unmask]>
Subject: Re: DCAM - where we stand
Hi everyone,
Sorry I won't be able to join the call today, so I thought I'd send a few comments to the list.
Tom, I believe "slots" are a new introduction to the DCAM (at least from my reading), but it does seem to introduce the possibility of confusion with how "slots" are used in frame-based languages (http://en.wikipedia.org/wiki/Frame_language). Is the concept of slots we are introducing here equivalent to those kinds of slots, or are we introducing a DCAM-specific concept? If the latter, it may help to spell out what the features of these slots are (beyond properties/values).
> I'm not sure whether the mixing of
> references to syntax ("slots", "Value URI") and semantics ("Thing Described")
> in this draft is a bug -- or a feature.
So "slots" are a syntactical concept? If we are basing DCAM on the RDF model, it seems that "semantics" also incorporates the abstract construction of grammatical features like "statements." I would therefore expect "slots" to be at that abstract level.
Following that thread, are DCAM Statements still just property/value slots, or are they now more like a triple, with a (for lack of a better term) slot that holds a URI that refers to the Thing Described? An important part of preserving the intuitive sense of colloquial records is that such URIs are optional. I think this is equivalent to RDF's concept of blank nodes, but I don't think that connection has been explicitly drawn in DCAM. (Kai?)
If DCAM is standing slightly apart from RDF to accommodate colloquial XML records [1], does our sense of statement/described-resource URIs align with this concept? The Linked Data community is discouraging blank nodes, so we could imagine a DCAM that also requires URIs for all Things. (I suspect that this is not what we want to do, but I'm putting it out there for discussion; see "In order to provide compatibility with Semantic Web and Linked Data applications, however, DCAM is fully aligned with the Model and Abstract Syntax of RDF".) It also seems that Linked Data's requirement to include Thing URIs is the kind of constraint that we might be modeling at the level of DCAM, and it might afford some useful real-world examples of how it's done. (How is this similar to, or different from, the kinds of constraints we need to express at the DSP level?)
Lastly, since we are also trying to work backwards from XML, we frequently use the term "records" in discussing DCAM, i.e.
> The Dublin Core Abstract Model (DCAM) provides a language for representing
> the structure of specific Metadata Records -- put more abstractly, to
> specify a Description Set Profile -- in a form that is independent of
> particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
> RelaxNG, relational databases, Schematron, or JSON.
I don't think DCAM is after Description Set Profiles directly; rather, it models our intuitive sense of "records" as an abstract "Description Set" (which then enables us to specify a DSP). I think it does pretty well at this, but I'm curious if there are objections to "Description Sets" that should go into our gap analysis. Jon et al., are there things about the kinds of records you work with that don't fit the current DCAM model?
Richard J. Urban, Visiting Professor
School of Library and Information Studies
College of Communication and Information
Florida State University
[log in to unmask]
@musebrarian
[1] Sperberg-McQueen, C.M. and Miller, E. On mapping from colloquial XML to RDF using XSLT. Proceedings of Extreme Markup Languages 2004. http://conferences.idealliance.org/extreme/html/2004/Sperberg-McQueen01/EML2004Sperberg-McQueen01.html
------------------------------
Date: Wed, 15 Feb 2012 09:50:03 -0500
From: Richard Urban <[log in to unmask]>
Subject: Re: Just some food for thought...
Corey/Karen,
Are there any good summaries of the conversations in Seattle (relevant to this discussion) for those of us who didn't make it to #c4lib?
Thanks,
Richard
On Feb 5, 2012, at 3:25 PM, Karen Coyle wrote:
> On 2/2/12 10:14 AM, Corey A Harper wrote:
>
>>
>> I'm open to other suggestions about where we can reach out to for some
>> additional perspective.
>
> It's not only a matter of "where"; it's a matter of "how." We've all been in on the lengthy conversations about terminology (and some of us went through that again at length at a meeting in Seattle last week). You can't expect much when you invite Russian speakers to a discussion taking place only in Latin. The DCAM terminology is a barrier. You can claim that
> 1) the terminology is necessary
> 2) people need to make the effort to learn it
>
> but that approach may not lead to success, as I believe is the case with the current version of DCAM. Reaching out should mean at least meeting people half way and doing all that is possible to bring them along. "Sink or swim" isn't an invitation.
>
> I actually believe that the utility of DCAM must be and can be expressed in terms of things people know and need to accomplish in their own environments. Examples and use cases will be a big help. That may even be a good place to start on this "round 2" effort: looking at what DCAM gives us as practitioners could reveal what else is needed, if anything, from such a model.
>
> kc
>
>>
>> On Thu, Feb 2, 2012 at 8:17 AM, Bruce D'Arcus<[log in to unmask]> wrote:
>>> On Thu, Feb 2, 2012 at 10:14 AM, Jon Phipps<[log in to unmask]> wrote:
>>>> This post represents an interesting perspective from the scientific data
>>>> community on some of the challenges to implementing semantic web solutions
>>>> and integrating them into existing system architectures and programming
>>>> models. This certainly looks to me like a place where the DCAP/DCAM
>>>> architecture coupled with some concrete implementation examples could be of
>>>> benefit...
>>>
>>> FWIW, I think the issue is much less about models than it is about the
>>> other stuff.
>>>
>>> Bruce
>>
>>
>>
>
> --
> Karen Coyle
> [log in to unmask] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 10:55:01 -0500
From: Thomas Baker <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
On Wed, Feb 15, 2012 at 08:49:32AM -0500, Jon Phipps wrote:
> I've been doing some wandering around in JSON land for the last few days
> and, as part of a continuing observation that RDF is an implementation
> detail rather than a core requirement, I'd like to point to this post from
> James Snell
> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
> And the JSON Scema spec: http://json-schema.org/
It looks to me like he considers RDF to be a "format" and, as such,
comparable to JSON. Commenting on [1], he writes:
Reading on a little further, the document goes on to expand on that third
point, "In order to enable a wide range of different applications to
process Web content, it is important to agree on standardized content
formats. The agreement on HTML as a dominant document format was an
important factor that made the Web scale. The third Linked Data principle
therefore advocates use of a single data model for publishing structured
data on the Web – the Resource Description Framework (RDF), a simple
graph-based data model that has been designed for use in the context of the
Web [70]. The RDF data model is explained in more detail later in this
chapter."
I can absolutely agree with the first part -- that standardized content
formats are critical. But the "single data model" bit makes me twitch. We
don't need a single data model.. what we need are common conventions for
pulling out the bits of information we need regardless of the specific
format used.
...i.e., in my reading, he is equating "data model" with a "specific format".
As I proposed yesterday, I think it is important to distinguish between RDF
"the model and abstract syntax" and RDF/XML "the concrete serialization syntax,
or format" -- not to mention other concrete RDF syntaxes such as N-Triples and
Turtle -- in DCAM's general message:
The Dublin Core Abstract Model (DCAM) provides a language for representing
the structure of specific Metadata Records -- put more abstractly, to
specify a Description Set Profile -- in a form that is independent of
particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
RelaxNG, relational databases, Schematron, or JSON.
In order to provide compatibility with Semantic Web and Linked Data
applications, however, DCAM is fully aligned with the Model and Abstract
Syntax of RDF. (Note that the RDF abstract model is the basis for -- thus
distinct from -- concrete RDF encoding technologies such as RDF/XML,
N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for
understanding DCAM on an informal level.
It would help if we could agree on a way to characterize this distinction
(e.g., "Concrete Encoding Technologies" versus "Model and Abstract Syntax").
Unless I'm missing the point of his argument, I do not think James Snell is
proposing JSON Activity Streams as a generic abstract syntax -- something which
would compete with RDF as a "grammatical" basis for interoperability in Linked
Data. He emphasizes his point that "If you're familiar with Activity Streams
and the linking extensions, then you'll know exactly what to do with this."
That seems consistent with what we want to do with DCAM -- with the added
distinction that if a JSON format is aligned with DCAM, and DCAM is aligned
with RDF, then one would in principle be able to express the contents of a JSON
format using an RDF concrete syntax. Indeed, James's formulation that "what we
need are common conventions for pulling out the bits of information we need
regardless of the specific format used" could almost be used verbatim in a
description of the DCAM we are discussing.
Jon writes:
> as part of a continuing observation that RDF is an implementation
> detail rather than a core requirement...
I am coming around to the idea that DCAM (or at any rate, "DCAM 2") might be
presented informally without emphasizing RDF, and that some people might find
such a DCAM useful as a very high-level way to conceptualize metadata (i.e.,
Statements, composed of Slots for information and grouped into Descriptions and
Description Sets, following common design patterns, etc...) I still do not see
the value of specifying a DCAM that is anything less than perfectly aligned
with the RDF Model and Abstract Syntax. That people may take inspiration from
such an RDF-grounded model, ignoring the RDF basis, is not something we should
worry about. But RDF, such as it is, is the only common _grammatical_ basis
for data that we currently have, and not to ground DCAM in RDF would make it
useless for the purposes of RDF-based interoperability.
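The distinction between RDF "the model and abstract syntax" and its concrete serializations can be made tangible with a small sketch (the URIs and strings below are illustrative only): the same abstract triple written out in two formats, with the triple itself as the invariant.

```python
# One abstract statement (subject, predicate, object), rendered in two
# concrete RDF syntaxes. The triple is the invariant; only the
# serialization differs.
triple = (
    "http://example.org/book/1",
    "http://purl.org/dc/terms/title",
    "Moby-Dick",
)
s, p, o = triple

# N-Triples: one fully spelled-out triple per line.
n_triples = f'<{s}> <{p}> "{o}" .'

# Turtle: the same triple, with a prefix abbreviating the predicate.
turtle = (
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
    f'<{s}> dcterms:title "{o}" .'
)

print(n_triples)
print(turtle)
```

Any format that preserves the (subject, predicate, object) structure, JSON included, can in principle round-trip through either serialization, which is the sense in which DCAM can be "aligned with" the abstract syntax without privileging one format.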
Tom
[1] http://linkeddatabook.com/editions/1.0/
--
Tom Baker <[log in to unmask]>
------------------------------
Date: Wed, 15 Feb 2012 10:58:53 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Re: "Underneath it all you still have to have something that expresses
valid triples, n'est-ce pas?"
Actually, my point here is that there are many data serializations, models,
and use cases for creating, validating, and distributing metadata, and many
of them don't include a notion of triples (e.g., NoSQL), although many of
them do include a notion of domain-specific validity and some form of
distribution. RDF is extremely useful for distributing metadata in an Open
World context, but it's hardly the only data model and hardly the only
method of distributing useful metadata.
We need to provide, or at least try to provide, a specification that makes
it possible for an organization to describe how they expect the 'things'
they know about to be described: which properties are valid or not, what
constitutes valid data, and what each property means. In the old days,
this model used to be called a 'data dictionary', and it's an incredibly
useful concept in a world of distributed heterogeneous data. Providing a
way for someone to create a single 'data dictionary' that can be used
(preferably by a machine) to create validations for domain-specific data
and that can be used by anyone (preferably a machine) in the organization,
or alternatively in the world, to understand the meaning of that data
across departmental, organizational, or national boundaries would be
incredibly and fundamentally useful.
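A minimal sketch of such a machine-usable data dictionary (the property URIs are real DC terms, but the rule format and the validate function are invented for illustration):

```python
# A toy "data dictionary" in the sense described above: an organization
# declares, per property URI, what the property means and what counts as
# valid. The rule vocabulary here is hypothetical.
DATA_DICTIONARY = {
    "http://purl.org/dc/terms/title": {
        "definition": "A name given to the resource.",
        "mandatory": True,
        "repeatable": False,
    },
    "http://purl.org/dc/terms/subject": {
        "definition": "A topic of the resource.",
        "mandatory": False,
        "repeatable": True,
    },
}

def validate(record: dict) -> list:
    """Return a list of human-readable violations; empty means valid."""
    errors = []
    for prop, rules in DATA_DICTIONARY.items():
        values = record.get(prop, [])
        if rules["mandatory"] and not values:
            errors.append(f"missing mandatory property {prop}")
        if not rules["repeatable"] and len(values) > 1:
            errors.append(f"property {prop} is not repeatable")
    for prop in record:
        if prop not in DATA_DICTIONARY:
            errors.append(f"unknown property {prop}")
    return errors

good = {"http://purl.org/dc/terms/title": ["Moby-Dick"]}
bad = {"http://purl.org/dc/terms/subject": ["whales", "obsession"]}
print(validate(good))  # []
print(validate(bad))   # title is missing; subject is repeatable, so allowed
```

The same dictionary serves both uses named above: a machine can run it as a validator, and a human (or machine) can read the definitions to understand what the data means across organizational boundaries.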
If we say that RDF is the ONLY useful way to do this, then we might as well
go back to "DCAM is just RDF".
Jon
I check email just a couple of times daily; to reach me sooner, click here:
http://awayfind.com/jonphipps
------------------------------
Date: Wed, 15 Feb 2012 11:07:49 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: Just some food for thought...
On Sun, Feb 5, 2012 at 3:25 PM, Karen Coyle <[log in to unmask]> wrote:
> I actually believe that the utility of DCAM must be and can be expressed
> in terms of things people know and need to accomplish in their own
> environments. Examples and use cases will be a big help. That may even been
> a good place to start on this "round 2" effort: looking at what DCAM gives
> us as practitioners could reveal what else is needed, if anything, from
> such a model.
+1 :-)
------------------------------
Date: Wed, 15 Feb 2012 08:26:12 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Hmm. In terms of analogies, I would equate DCAP with a data dictionary,
not DCAM. To me a data dictionary lists the actual metadata elements you
will use, not an abstract definition of the possible structures. DCAM
seems to be closer to the idea of "design patterns."
I don't see how something can be linked data if it doesn't have certain
characteristics:
- HTTP URIs
- subjects, predicates, and objects (whether or not they are serialized
as triples -- RDF/XML and Turtle are examples of "not")
- subjects and predicates constrained to URIs; objects constrained
differently (which DCAM would address)
It's possible that the JSON examples in that blog post met these
criteria (I didn't perceive URIs for the predicates, but maybe I don't
read JSON well).
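Karen's criteria can be given a rough executable reading. A minimal Python sketch, assuming statements are modeled as (subject, predicate, object) tuples; the URIs are invented and this is not actual DCAM machinery:

```python
# Check Karen's linked-data criteria against a set of
# (subject, predicate, object) statements.

def is_http_uri(term):
    # Criterion 1: the term is an HTTP URI.
    return isinstance(term, str) and term.startswith(("http://", "https://"))

def meets_ld_criteria(statements):
    """Subjects and predicates must be HTTP URIs; objects may be URIs
    or literals (the part a model like DCAM would constrain further)."""
    return all(is_http_uri(s) and is_http_uri(p) for (s, p, o) in statements)

linked = [("http://example.org/book/1",
           "http://purl.org/dc/terms/title",
           "Moby Dick")]
plain_json_style = [("book1", "title", "Moby Dick")]  # bare keys, no URIs
```

The second example is the shape of a typical JSON record: same information, but without URI predicates there is no shared way to know what "title" means.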
kc
On 2/15/12 7:58 AM, Jon Phipps wrote:
> Re: "Underneath it all you still have to have something that expresses
> valid triples, n'est-ce pas?"
>
> Actually, my point here is that there are many data serializations, models
> and use cases for creating, validating, and distributing metadata and many
> of them don't include a notion of triples (e.g., NoSQL), although many of
> them do include a notion of domain-specific validity and some form of
> distribution. RDF is extremely useful for distributing metadata in an Open
> World context, but it's hardly the only data model and hardly the only
> method of distributing useful metadata.
>
> We need to provide, or at least try to provide, a specification that makes
> it possible for an organization to describe how they expect the 'things'
> they know about to be described: which properties are valid or not, what
> constitutes valid data, and what does each property mean. In the old days,
> this model used to be called a 'data dictionary' and it's an incredibly
> useful concept in a world of distributed heterogeneous data. Providing a
> way for someone to create a single 'data dictionary' that can be used
> (preferably by a machine) to create validations for domain-specific data
> and that can be used by anyone (preferably a machine) in the organization,
> or alternatively in the world, to understand the meaning of that data
> across departmental, organizational, or national boundaries would be
> incredibly and fundamentally useful.
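The machine-usable "data dictionary" Jon describes can be sketched in a few lines. This is a toy illustration under invented assumptions (the property names, obligation flags, and datatype rules are all made up), not a proposal for DCAM syntax:

```python
# A toy machine-readable 'data dictionary': it names the valid
# properties, says what each means, and defines what counts as valid
# data, so a machine can validate records against it.

DATA_DICTIONARY = {
    "title":  {"meaning": "name given to the resource", "required": True,  "type": str},
    "issued": {"meaning": "year of publication",        "required": False, "type": int},
}

def validate(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    # Obligation: every required property must be present.
    for name, rule in dictionary.items():
        if rule["required"] and name not in record:
            problems.append("missing required property: " + name)
    # Validity: every supplied property must be known and well-typed.
    for name, value in record.items():
        if name not in dictionary:
            problems.append("unknown property: " + name)
        elif not isinstance(value, dictionary[name]["type"]):
            problems.append("bad value for " + name)
    return problems
```

Because the dictionary itself is data, the same artifact serves both purposes Jon names: a machine can generate validations from it, and a human (or machine) can read the "meaning" entries to understand records that cross organizational boundaries.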
>
> If we say that RDF is the ONLY useful way to do this, then we might as well
> go back to "DCAM is just RDF".
>
> Jon
>
> I check email just a couple of times daily; to reach me sooner, click here:
> http://awayfind.com/jonphipps
>
>
> On Wed, Feb 15, 2012 at 9:47 AM, Karen Coyle<[log in to unmask]> wrote:
>
>> What *does* seem to be core in this blog post is the use of http URIs for
>> values. I'd add to that: properties defined with http URIs, so you know
>> what you are describing. Although you can serialize all of this in JSON if
>> you wish, it means that you have started with LD concepts, not the usual
>> JSON application. Underneath it all you still have to have something that
>> expresses valid triples, n'est-ce pas?
>>
>> kc
>>
>>
>> On 2/15/12 5:49 AM, Jon Phipps wrote:
>>
>>> I've been doing some wandering around in JSON land for the last few days
>>> and, as part of a continuing observation that RDF is an implementation
>>> detail rather than a core requirement, I'd like to point to this post from
>>> James Snell
>>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
>>> And the JSON Schema spec: http://json-schema.org/
>>>
>>> Jon,
>>> who may someday get his act together and pay attention to these meetings
>>> more than a couple of hours before the meeting.
>>>
>>> On Tuesday, February 14, 2012, Thomas Baker<[log in to unmask]> wrote:
>>> [quoted text trimmed -- Tom's message appears in full earlier in this digest]
>
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 18:30:09 +0100
From: Kai Eckert <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Hi all,
as a strong promoter of the RDF basis for DCAM, I would like to
emphasize, too, that RDF is only the formal model and should not be
seen as a concrete syntax. JSON is a syntax; it has no semantics. I
like it very much, and I like simple, pragmatic implementations, but
that's not what we need in our current context.
In the W3C provenance WG, we have just had the experience that it is
much easier to discuss a model defined in a formal language than one
defined in plain English, which led to endless discussions before. We
now focus on the formal PROV ontology, written in OWL, to reach a
consensus about the model. Additionally, we of course create documents
in plain English (at least) that hopefully explain and demonstrate what
can be done with the model. But those drafts cannot be used to define
the model in the first place.
I think the only formal language that we all speak is {RDF, RDFS, OWL};
that's why I want to focus on defining everything that we are talking
about in DCAM in this language. In that respect, it is more a side
effect that the result would actually be RDF. If we face limitations in
this formal language that we cannot accept, then of course we should
not restrict ourselves to RDF. But only then.
Cheers,
Kai
Am 15.02.2012 16:55, schrieb Thomas Baker:
> On Wed, Feb 15, 2012 at 08:49:32AM -0500, Jon Phipps wrote:
>> I've been doing some wandering around in JSON land for the last few days
>> and, as part of a continuing observation that RDF is an implementation
>> detail rather than a core requirement, I'd like to point to this post from
>> James Snell
>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
>> And the JSON Schema spec: http://json-schema.org/
>
> It looks to me like he considers RDF to be a "format" and, as such,
> comparable to JSON. Commenting on [1], he writes:
>
> Reading on a little further, the document goes on to expand on that third
> point, "In order to enable a wide range of different applications to
> process Web content, it is important to agree on standardized content
> formats. The agreement on HTML as a dominant document format was an
> important factor that made the Web scale. The third Linked Data principle
> therefore advocates use of a single data model for publishing structured
> data on the Web – the Resource Description Framework (RDF), a simple
> graph-based data model that has been designed for use in the context of the
> Web [70]. The RDF data model is explained in more detail later in this
> chapter."
>
> I can absolutely agree with the first part -- that standardized content
> formats are critical. But the "single data model" bit makes me twitch. We
> don't need a single data model... what we need are common conventions for
> pulling out the bits of information we need regardless of the specific
> format used.
>
> ...i.e., in my reading, he is equating "data model" with a "specific format".
> As I proposed yesterday, I think it is important to distinguish between RDF
> "the model and abstract syntax" and RDF/XML "the concrete serialization syntax,
> or format" -- not to mention other concrete RDF syntaxes such as N-Triples and
> Turtle -- in DCAM's general message:
>
> The Dublin Core Abstract Model (DCAM) provides a language for representing
> the structure of specific Metadata Records -- put more abstractly, to
> specify a Description Set Profile -- in a form that is independent of
> particular Concrete Encoding Technologies such as XML Schema, RDF/XML,
> RelaxNG, relational databases, Schematron, or JSON.
>
> In order to provide compatibility with Semantic Web and Linked Data
> applications, however, DCAM is fully aligned with the Model and Abstract
> Syntax of RDF. (Note that the RDF abstract model is the basis for -- thus
> distinct from -- concrete RDF encoding technologies such as RDF/XML,
> N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for
> understanding DCAM on an informal level.
>
> It would help if we could agree on a way to characterize this distinction
> (e.g., "Concrete Encoding Technologies" versus "Model and Abstract Syntax").
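Tom's distinction -- one abstract model, many concrete encodings -- can be sketched in Python. A rough illustration only: the triples are modeled as plain tuples, the URIs are invented, and the serializers below stand in for real format libraries:

```python
import json

# One abstract description: a set of (subject, predicate, object)
# triples. The triples are the 'model and abstract syntax'; the
# N-Triples-style text and the JSON below are merely concrete formats.

triples = [
    ("http://example.org/book/1",
     "http://purl.org/dc/terms/title",
     "Moby Dick"),
]

def to_ntriples(triples):
    # Literal objects only, for brevity; real N-Triples also allows
    # URI and blank-node objects.
    return "\n".join('<%s> <%s> "%s" .' % t for t in triples)

def to_json(triples):
    # Group predicate/object pairs under each subject.
    out = {}
    for s, p, o in triples:
        out.setdefault(s, {})[p] = o
    return json.dumps(out)
```

Both outputs carry the same information; a model in the DCAM sense is what licenses reading either one back to the same abstract triples, independent of the "Concrete Encoding Technology" chosen.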
>
> Unless I'm missing the point of his argument, I do not think James Snell is
> proposing JSON Activity Streams as a generic abstract syntax -- something which
> would compete with RDF as a "grammatical" basis for interoperability in Linked
> Data. He emphasizes his point that "If you're familiar with Activity Streams
> and the linking extensions, then you'll know exactly what to do with this."
> That seems consistent with what we want to do with DCAM -- with the added
> distinction that if a JSON format is aligned with DCAM, and DCAM is aligned
> with RDF, then one would in principle be able to express the contents of a JSON
> format using an RDF concrete syntax. Indeed, James's formulation that "what we
> need are common conventions for pulling out the bits of information we need
> regardless of the specific format used" could almost be used verbatim in a
> description of the DCAM we are discussing.
>
> Jon writes:
>> as part of a continuing observation that RDF is an implementation
>> detail rather than a core requirement...
>
> I am coming around to the idea that DCAM (or at any rate, "DCAM 2") might be
> presented informally without emphasizing RDF, and that some people might find
> such a DCAM useful as a very high-level way to conceptualize metadata (i.e.,
> Statements, composed of Slots for information and grouped into Descriptions and
> Description Sets, following common design patterns, etc...) I still do not see
> the value of specifying a DCAM that is anything less than perfectly aligned
> with the RDF Model and Abstract Syntax. That people may take inspiration from
> such an RDF-grounded model, ignoring the RDF basis, is not something we should
> worry about. But RDF, such as it is, is the only common _grammatical_ basis
> for data that we currently have, and not to ground DCAM in RDF would make it
> useless for the purposes of RDF-based interoperability.
>
> Tom
>
> [1] http://linkeddatabook.com/editions/1.0/
>
>
--
Kai Eckert
Universitätsbibliothek Mannheim
Deputy Head, Digital Library Services Department
Schloss Schneckenhof West / 68131 Mannheim
Tel. 0621/181-2946 Fax 0621/181-2918
------------------------------
Date: Wed, 15 Feb 2012 15:30:35 -0500
From: Jon Phipps <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Are we only talking about Linked Data or are we talking about information
modeling? DCAP is a documentation model for describing an information
ecosystem and DCAM is its formal abstract 'domain' model, or should be.
Whether or not that model results in Linked Data is beside the point, isn't
it?
Jon,
who just found this and had to paste it here:
Endless invention, endless experiment,
Brings knowledge of motion, but not of stillness;
Knowledge of speech, but not of silence;
Knowledge of words, and ignorance of the Word...
Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?
-- T. S. Eliot, Choruses from 'The Rock'
On Wed, Feb 15, 2012 at 11:26 AM, Karen Coyle <[log in to unmask]> wrote:
> [quoted text trimmed -- Karen's message of 11:26 appears in full earlier in this digest]
------------------------------
Date: Wed, 15 Feb 2012 14:03:20 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
On 2/15/12 12:30 PM, Jon Phipps wrote:
> Are we only talking about Linked Data or are we talking about information
> modeling?
I doubt if it makes sense to take on the entirety of information
modeling within the DCAM. My impression was that the goal was
information modeling for the Semantic Web/linked data environment. If
it's broader than that, then we move from the abstract to something so
far out that it may never be finished. I'd say we should stick with an
abstraction of something useful, not a pure abstraction.
> DCAP is a documentation model for describing an information
> ecosystem and DCAM is its formal abstract 'domain' model, or should be.
DCAP to me is narrower than that. It describes a coherent set of
statements for a particular metadata activity. It verges on being a
record format, although it is a "record format" in a data environment
that is more flexible than, say, a relational database with a set data
format. I'd equate the Singapore framework's domain model with an
"information ecosystem." That to me is the general model before you
start adding constraints, and perhaps even before you define your set of
properties.
And, as I said before, DCAM to me defines the design patterns that are
available to you. It is plausible to me that DCAM's patterns have some
universality, but I wouldn't want to embark on a task of making sure
that DCAM covers every single metadata possibility, known today or to be
discovered in the future. That would probably prevent DCAM from having
such specifics as "property URIs" or "literal values." It's going to be
hard enough to come up with a model that functions well within the
semantic web universe.
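One of the "specifics" Karen names -- the split between property/value URIs and literal values -- is easy to make concrete. A minimal sketch, with invented URIs and values, of the kind of design-pattern distinction DCAM draws for the object of a statement:

```python
# Classify the object of a statement as a value URI or a literal --
# the DCAM-flavored distinction Karen refers to. Values illustrative.

def classify_object(obj):
    if isinstance(obj, str) and obj.startswith(("http://", "https://")):
        return "value URI"   # points at a described resource
    return "literal"         # a string (or number) used directly

creator_by_uri = classify_object("http://example.org/person/melville")
creator_by_name = classify_object("Melville, Herman")
```

A model that lacks even this distinction would indeed be too abstract to be useful, which is Karen's point about grounding DCAM in the semantic web environment.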
kc
> [remainder of Jon's message and earlier quoted text trimmed -- see above]
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
Date: Wed, 15 Feb 2012 14:06:13 -0800
From: Karen Coyle <[log in to unmask]>
Subject: Re: DCAM - collecting requirements and examples
Oh, I meant to follow Jon's quoted poem with some words from Frank Zappa:
Information is not knowledge.
Knowledge is not wisdom.
Wisdom is not truth.
Truth is not beauty.
Beauty is not love.
Love is not music.
Music is the best.
On 2/15/12 12:30 PM, Jon Phipps wrote:
> Are we only talking about Linked Data or are we talking about information
> modeling? DCAP is a documentation model for describing an information
> ecosystem and DCAM is its formal abstract 'domain' model, or should be.
> Whether or not that model results in Linked Data is beside the point, isn't
> it?
>
> Jon,
> who just found this and had to paste it here:
>
> Endless invention, endless experiment,
> Brings knowledge of motion, but not of stillness;
> Knowledge of speech, but not of silence;
> Knowledge of words, and ignorance of the Word...
>
> Where is the Life we have lost in living?
> Where is the wisdom we have lost in knowledge?
> Where is the knowledge we have lost in information?
> -- T. S. Eliot, Choruses from 'The Rock'
>
>
> On Wed, Feb 15, 2012 at 11:26 AM, Karen Coyle<[log in to unmask]> wrote:
>
>> Hmm. In terms of analogies, I would equate DCAP with a data dictionary,
>> not DCAM. To me a data dictionary is the actual metadata elements you will
>> use, not an abstract definition of the possible structures. DCAM seems to
>> be closer to the idea of "design patterns."
>>
>> I don't see how something can be linked data if it doesn't have certain
>> characteristics:
>> - http uris
>> - subjects, predicates, objects (whether serialized as triples or not, and
>> RDF/XML and turtle are examples of not)
>> - subjects and predicates constrained as URIs; objects constrained
>> differently (which DCAM would address)
>>
>> It's possible that the JSON examples in that blog post met these criteria
>> (I didn't perceive URIs for the predicates, but maybe I don't read JSON
>> well).
>>
>> kc
>>
>>
>> On 2/15/12 7:58 AM, Jon Phipps wrote:
>>
>>> Re: "Underneath it all you still have to have something that expresses
>>> valid triples, n'est pas?"
>>>
>>> Actually, my point here is that there are many data serializations, models
>>> and use cases for creating, validating, and distributing metadata and many
>>> of them don't include a notion of triples, (e.g. nosql) although many of
>>> them do include a notion of domain-specific validity and some form of
>>> distribution. RDF is extremely useful for distributing metadata in an Open
>>> World context, but it's hardly the only data model and hardly the only
>>> method of distributing useful metadata.
>>>
>>> We need to provide, or at least try to provide, a specification that makes
>>> it possible for an organization to describe how they expect the 'things'
>>> they know about to be described: which properties are valid or not, what
>>> constitutes valid data, and what does each property mean. In the old days,
>>> this model used to be called a 'data dictionary' and it's an incredibly
>>> useful concept in a world of distributed heterogeneous data. Providing a
>>> way for someone to create a single 'data dictionary' that can be used
>>> (preferably by a machine) to create validations for domain-specific data
>>> and that can be used by anyone (preferably a machine) in the organization,
>>> or alternatively in the world, to understand the meaning of that data
>>> across departmental, organizational, or national boundaries would be
>>> incredibly and fundamentally useful.
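A minimal sketch of such a 'data dictionary', assuming a simple rule set of required/repeatable flags plus a human-readable meaning (all property names and rules here are invented for illustration):

```python
# A toy 'data dictionary': each property records whether it is required,
# whether it may repeat, and what it means. Invented for illustration.
DATA_DICTIONARY = {
    "title":   {"required": True,  "repeatable": False,
                "meaning": "Name given to the resource"},
    "creator": {"required": False, "repeatable": True,
                "meaning": "Agent responsible for the resource"},
}

def validate(record):
    """Return a list of violations of the dictionary's rules.

    A record is a dict mapping property names to lists of values."""
    errors = []
    for prop, rule in DATA_DICTIONARY.items():
        values = record.get(prop, [])
        if rule["required"] and not values:
            errors.append(f"missing required property: {prop}")
        if not rule["repeatable"] and len(values) > 1:
            errors.append(f"property not repeatable: {prop}")
    for prop in record:
        if prop not in DATA_DICTIONARY:
            errors.append(f"unknown property: {prop}")
    return errors

print(validate({"title": ["Moby Dick"],
                "creator": ["Melville", "Hawthorne"]}))   # → []
print(validate({"creator": ["Melville"]}))
# → ['missing required property: title']
```

The same dictionary can drive machine validation locally and, published with its "meaning" fields, document the data for anyone else who receives it.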
>>>
>>> If we say that RDF is the ONLY useful way to do this, then we might as
>>> well
>>> go back to "DCAM is just RDF".
>>>
>>> Jon
>>>
>>> I check email just a couple of times daily; to reach me sooner, click
>>> here:
>>> http://awayfind.com/jonphipps
>>>
>>>
>>> On Wed, Feb 15, 2012 at 9:47 AM, Karen Coyle<[log in to unmask]> wrote:
>>>
>>> What *does* seem to be core in this blog post is the use of http URIs for
>>>> values. I'd add to that: properties defined with http URIs, so you know
>>>> what you are describing. Although you can serialize all of this in JSON
>>>> if
>>>> you wish, it means that you have started with LD concepts, not the usual
>>>> JSON application. Underneath it all you still have to have something that
>>>> expresses valid triples, n'est-ce pas?
>>>>
>>>> kc
>>>>
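The idea of "serializing LD concepts in JSON" can be sketched like this, loosely following the JSON-LD `@id`/`@value` convention (the record itself is invented; the two property URIs are real DCMI terms):

```python
import json

# A record serialized as JSON but built on LD concepts: property names
# are http URIs, and URI-valued objects are kept distinct from literals.
record_json = """{
  "@id": "http://example.org/book/1",
  "http://purl.org/dc/terms/title": {"@value": "Moby Dick"},
  "http://purl.org/dc/terms/creator": {"@id": "http://example.org/person/melville"}
}"""

def to_triples(doc):
    """Recover (subject, predicate, object) triples from the JSON form."""
    data = json.loads(doc)
    subject = data.pop("@id")
    triples = []
    for predicate, obj in data.items():
        value = obj.get("@id") or obj.get("@value")
        triples.append((subject, predicate, value))
    return triples

for t in to_triples(record_json):
    print(t)
```

Because the keys are already URIs, the JSON is a surface syntax: the triples fall straight out of it, which is Karen's point about starting from LD concepts rather than the usual JSON application.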
>>>>
>>>> On 2/15/12 5:49 AM, Jon Phipps wrote:
>>>>
>>>> I've been doing some wandering around in JSON land for the last few days
>>>>> and, as part of a continuing observation that RDF is an implementation
>>>>> detail rather than a core requirement, I'd like to point to this post
>>>>> from
>>>>> James Snell
>>>>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html
>>>>>
>>>>> And the JSON Schema spec: http://json-schema.org/
>>>>>
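To make the JSON Schema idea concrete without pulling in a full implementation, here is a toy validator for a tiny subset of the spec (just `required` and `type: string`); real use would rely on a complete implementation of the json-schema.org drafts:

```python
# A tiny subset of JSON Schema validation, for illustration only:
# handles "required" and "type": "string" and nothing else.
schema = {
    "type": "object",
    "required": ["title"],
    "properties": {
        "title":   {"type": "string"},
        "creator": {"type": "string"},
    },
}

def check(instance, schema):
    """Return a list of violations of the (toy) schema."""
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in instance and rule.get("type") == "string" \
                and not isinstance(instance[key], str):
            errors.append(f"not a string: {key}")
    return errors

print(check({"title": "Moby Dick"}, schema))   # → []
print(check({"creator": 42}, schema))
# → ['missing: title', 'not a string: creator']
```

The interesting parallel with DCAM is that the schema is itself a JSON document: the validation rules travel in the same format as the data they constrain.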
>>>>> Jon,
>>>>> who may someday get his act together and pay attention to these meetings
>>>>> more than a couple of hours before the meeting.
>>>>>
>>>>> On Tuesday, February 14, 2012, Thomas Baker<[log in to unmask]> wrote:
>>>>>
>>>>> On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:
>>>>>>
>>>>>> -- that DCAM should be developed using a test-driven approach, with
>>>>>>> effective examples and test cases that can be expressed in various
>>>>>>> concrete syntaxes.
>>>>>>>
>>>>>>>
>>>>>> Jon suggested that we take Gordon's requirements for metadata record
>>>>>> constructs [1] as a starting point. As I understand them, these are:
>>>>>>
>>>>>> -- the ability to encode multicomponent things (which in the cataloging
>>>>>>    world happen to be called "statements", as in "publication statement"
>>>>>>    and "classification statement") either:
>>>>>>
>>>>>>    -- as unstructured strings, or
>>>>>>    -- as strings structured according to a named Syntax Encoding
>>>>>>       Scheme, or
>>>>>>    -- as Named Graphs with individual component triples
>>>>>>
>>>>>> -- the ability to express the repeatability of components in such
>>>>>>    "statements"
>>>>>>
>>>>>> -- the ability to designate properties as "mandatory", or "mandatory if
>>>>>>    applicable", and the like
>>>>>>
>>>>>> -- the ability to constrain the cardinality of "subsets of properties"
>>>>>>    within a particular context, such as the FRBR model
>>>>>>
>>>>>> -- the ability to express mappings between properties in different
>>>>>>    namespaces.
>>>>>
>>>>>> It has also been suggested that we find examples of real metadata
>>>>>> instance records from different communities and contexts -- e.g.,
>>>>>> libraries, government, industry, and biomed -- for both testing and
>>>>>> illustrating DCAM constructs.
>>>>>> Tom
>>>>>>
>>>>>> [1]
>>>>>>
>>>>>> https://www.jiscmail.ac.uk/****cgi-bin/webadmin?A2=ind1202&L=****<https://www.jiscmail.ac.uk/**cgi-bin/webadmin?A2=ind1202&L=**>
>>>>> dc-architecture&P=6405<https:/**/www.jiscmail.ac.uk/cgi-bin/**
>>>>> webadmin?A2=ind1202&L=dc-**architecture&P=6405<https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1202&L=dc-architecture&P=6405>
>>>>>>
>>>>>
>>>>>
>>>>>> --
>>>>>> Tom Baker<[log in to unmask]>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Karen Coyle
>>>> [log in to unmask] http://kcoyle.net
>>>> ph: 1-510-540-7596
>>>> m: 1-510-435-8234
>>>> skype: kcoylenet
>>>>
>>>>
>>>
>> --
>> Karen Coyle
>> [log in to unmask] http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
>>
>
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
------------------------------
End of DC-ARCHITECTURE Digest - 14 Feb 2012 to 15 Feb 2012 (#2012-23)
*********************************************************************