DC-ARCHITECTURE Archives



DC-ARCHITECTURE@JISCMAIL.AC.UK


DC-ARCHITECTURE  March 2012


Subject:

Re: DC-ARCHITECTURE Digest - 14 Feb 2012 to 15 Feb 2012 (#2012-23)

From:

"Bombardier, Kevin C" <[log in to unmask]>

Reply-To:

DCMI Architecture Forum <[log in to unmask]>

Date:

Tue, 13 Mar 2012 08:13:51 -0500

Content-Type:

text/plain


Please update my email address to be [log in to unmask]



Thanks



-----Original Message-----

From: DCMI Architecture Forum [mailto:[log in to unmask]] On Behalf Of DC-ARCHITECTURE automatic digest system

Sent: Wednesday, February 15, 2012 7:03 PM

To: [log in to unmask]

Subject: DC-ARCHITECTURE Digest - 14 Feb 2012 to 15 Feb 2012 (#2012-23)



There are 14 messages totaling 2954 lines in this issue.



Topics of the day:



  1. DCAM - where we stand (2)

  2. DCAM telecon - additional links to Gap Analysis and Son of Dublin Core

  3. DCAM - collecting requirements and examples (9)

  4. Just some food for thought... (2)



----------------------------------------------------------------------



Date:    Wed, 15 Feb 2012 10:12:45 +0000

From:    "Greenberg, Jane" <[log in to unmask]>

Subject: Re: DCAM - where we stand



Tom, all ...







I really like this initial stab at a general message, and my sense is that the use of the word 'slots' will resonate with folks.  At least this is what I think at the moment, and it made the description easy to understand for me.  This is what is needed in the user-facing documentation: it can reach beyond those immersed in DCAM and not scare those who are new to this information.







Two brief comments --



~ In sentence one, I wanted to say "connected" slots, but I'm not sure it's necessary.  The indication of a 'defined structure' at the end can stand for this.  I'm just thinking that it is the linking (connecting) of slots that is key.







~  Perhaps a given, but I think it would be good to list a few examples beyond book.  My bias perhaps, but could we list things like a dataset and an image, and I would like to list person and event?  (Do 'Things' like person or event cause a problem?  They may be seen as unorthodox in DCAM... I'm not sure, or in conflict w/ FRBR?)







Best wishes, jane







-----Original Message-----

From: DCMI Architecture Forum [mailto:[log in to unmask]] On Behalf Of Thomas Baker

Sent: Wednesday, February 15, 2012 12:05 AM

To: [log in to unmask]

Subject: Re: DCAM - where we stand







On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:



> And there was some support for the idea that a clear "punchline" for



> DCAM, along the lines of the one-liner summarizing SKOS in plain



> language, would be helpful.







A bit longer than the one-liner for SKOS, but here's my stab at a "general message" for DCAM:







    Seen as data artifacts, Metadata Records consist of slots holding



    information items in a defined structure.  A Metadata Record may describe a



    single Thing of interest (such as a Book) or a cluster of closely related



    Things (such as a Book and its Author).  More abstractly, a Metadata Record



    may be seen as a Description Set encompassing just one Description (i.e.,



    about the Book) or multiple Descriptions (about both the Book and the



    Author).







    A Description consists of one or more Statements about the Thing Described



    (e.g., stating the Name and Birthdate of an Author).  The Thing Described



    by a Description may be identified using a URI.  A Statement about the



    Thing Described has one slot for an Attribute (Property) and one slot for a



    Value.  Attribute slots are filled with names of attributes (properties);



    in DCAM, attributes are "named" using URIs.  Value slots are filled with



    Value Strings, URIs, or blank Value Placeholders.  A Value String may be



    stated as belonging to a named set of strings (known as a Syntax Encoding



    Scheme).  A Value URI may be stated as belonging to a named set of URIs



    (known as a Vocabulary Encoding Scheme).  In practice, Statements may be



    viewed in the context of Statement Sets.  Statement Sets may follow common



    patterns.







    The Dublin Core Abstract Model (DCAM) provides a language for representing



    the structure of specific Metadata Records -- put more abstractly, to



    specify a Description Set Profile -- in a form that is independent of



    particular Concrete Encoding Technologies such as XML Schema, RDF/XML,



    RelaxNG, relational databases, Schematron, or JSON.







    In order to provide compatibility with Semantic Web and Linked Data



    applications, however, DCAM is fully aligned with the Model and Abstract



    Syntax of RDF.  (Note that the RDF abstract model is the basis for -- thus



    distinct from -- concrete RDF encoding technologies such as RDF/XML,



    N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for



    understanding DCAM on an informal level.







    DCAM provides a language for expressing common patterns of Statements --



    patterns that may be partially or fully encoded using specific Concrete



    Encoding Technologies.  Indeed, some readers may find the example patterns



    used in designing DCAM more accessible and useful, as models and templates



    for implementation, than the formal specification of DCAM itself.
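
As one way to picture the draft above, the slot vocabulary can be sketched as a few Python data classes. This is only an illustration under assumptions: the class and field names (`Statement`, `Description`, `DescriptionSet`, `ValueString`, `ValueURI`, `ValuePlaceholder`, `placeholder_label`) and the example.org URI are invented here and are not part of any DCAM specification. The dcterms and FOAF property URIs are real.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(frozen=True)
class ValueString:
    """A literal value, optionally typed by a Syntax Encoding Scheme URI."""
    text: str
    syntax_encoding_scheme: Optional[str] = None


@dataclass(frozen=True)
class ValueURI:
    """A URI value, optionally stated as belonging to a Vocabulary Encoding Scheme."""
    uri: str
    vocabulary_encoding_scheme: Optional[str] = None


@dataclass(frozen=True)
class ValuePlaceholder:
    """A blank placeholder standing for a value that has no URI."""
    label: str


Value = Union[ValueString, ValueURI, ValuePlaceholder]


@dataclass(frozen=True)
class Statement:
    """One attribute slot (a property URI) and one value slot."""
    property_uri: str
    value: Value


@dataclass
class Description:
    """Statements about one Thing Described; identifying it by URI is optional."""
    statements: List[Statement]
    described_uri: Optional[str] = None
    placeholder_label: Optional[str] = None  # used when the Thing has no URI


@dataclass
class DescriptionSet:
    """A Metadata Record seen abstractly: one or more Descriptions."""
    descriptions: List[Description] = field(default_factory=list)


DCT = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"

# A Book and its Author, as in the draft's running example.
book = Description(
    described_uri="http://example.org/book/1",
    statements=[
        Statement(DCT + "title", ValueString("An Example Book")),
        Statement(DCT + "issued", ValueString("2012-02-15", DCT + "W3CDTF")),
        Statement(DCT + "creator", ValuePlaceholder("author1")),
    ],
)
author = Description(
    placeholder_label="author1",
    statements=[Statement(FOAF + "name", ValueString("A. N. Author"))],
)
record = DescriptionSet([book, author])
```

Nothing in the sketch depends on RDF/XML, JSON, or any other Concrete Encoding Technology; a serializer for each would read the same three classes.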







Details aside, this text illustrates the sort of high-level description I think we would need to have as an explanation both to our intended audience -- and to guide ourselves in the design phase.  I'm not sure whether the mixing of references to syntax ("slots", "Value URI") and semantics ("Thing Described") in this draft is a bug -- or a feature.  I also wonder how DCAM can close the gap to expressing the constraints of real application profiles without introducing DC-DSP-like notions such as "templates" and "constraints" [1].







For discussion...







Tom







[1] http://dublincore.org/documents/dc-dsp/







--



Tom Baker <[log in to unmask]>



------------------------------



Date:    Wed, 15 Feb 2012 08:34:54 -0500

From:    Jon Phipps <[log in to unmask]>

Subject: Re: DCAM telecon - additional links to Gap Analysis and Son of Dublin Core



I was always very impressed by SoDC. It's incomplete, but the notion of a

concrete syntax that can be used to express semantics in one branch and

constraints in another with an exemplar processing facility has always made

enormous sense to me and would seem to be a good use case for a DCAP/DCAM

spec.



Jon



On Tuesday, February 14, 2012, Thomas Baker <[log in to unmask]> wrote:

> On Tue, Feb 14, 2012 at 01:32:58PM -0500, Tom Baker wrote:

>> Date:       2012-02-15 Wednesday, 1100 EST

>> Expected:   Tom Baker (chair), Mark Matienzo, Antoine Isaac, Stuart Sutton,

>>             Aaron Rubinstein, Phipps, Gordon Dunsire, Kai Eckert

>> Regrets:    Corey Harper, Richard Urban

>

> On tomorrow's call I'd like to continue the discussion, even though

> we will have some important absences.

>

> I noticed just after posting that I hadn't added the link to the (beginnings

> of a) gap analysis [1].

>

> At Corey's request, I included a link to Alistair Miles's Son of Dublin Core

> (below).

>

> Tom

>

> [1] http://wiki.dublincore.org/index.php/DCAM_Revision_Gap_Analysis

>

>> -- Alistair Miles's Son of Dublin Core (SoDC)

>>    http://aliman.googlecode.com/svn/trunk/sodc/SoDC-0.2/index.html

>>    http://aliman.googlecode.com/svn/trunk/sodc/SoDC-0.2/release/SoDC-0_2.zip - everything, zipped

>

> --

> Tom Baker <[log in to unmask]>

>



--

Jon



I check email just a couple of times daily; to reach me sooner, click here:

http://awayfind.com/jonphipps



------------------------------



Date:    Wed, 15 Feb 2012 08:49:32 -0500

From:    Jon Phipps <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



I've been doing some wandering around in JSON land for the last few days

and, as part of a continuing observation that RDF is an implementation

detail rather than a core requirement, I'd like to point to this post from

James Snell

http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html

And the JSON Schema spec: http://json-schema.org/



Jon,

who may someday get his act together and pay attention to these meetings

more than a couple of hours before the meeting.



On Tuesday, February 14, 2012, Thomas Baker <[log in to unmask]> wrote:

> On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:

>> --  that DCAM should be developed using a test-driven approach, with

>>     effective examples and test cases that can be expressed in various

>>     concrete syntaxes.

>

> Jon suggested that we take Gordon's requirements for metadata record

> constructs [1] as a starting point.  As I understand them, these are:

>

> --  the ability to encode multicomponent things (which in the cataloging

>    world happen to be called "statements", as in "publication statement"

>    and "classification statement") either:

>

>    -- as unstructured strings, or

>    -- as strings structured according to a named Syntax Encoding Scheme, or

>    -- as Named Graphs with individual component triples

>

> --  the ability to express the repeatability of components in such

>    "statements"

>

> --  the ability to designate properties as "mandatory", or "mandatory if

>    applicable", and the like

>

> --  the ability to constrain the cardinality of "subsets of properties"

>    within a particular context, such as the FRBR model

>

> --  the ability to express mappings between properties in different

>    namespaces.

>

> It has also been suggested that we find examples of real metadata instance

> records from different communities and contexts -- e.g., libraries, government,

> industry, and biomed -- for both testing and illustrating DCAM constructs.

>

> Tom

>

> [1] https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1202&L=dc-architecture&P=6405

>

> --

> Tom Baker <[log in to unmask]>

>
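
As a possible starting point for the test-driven approach quoted above, here is a minimal sketch of how "mandatory" and "repeatable" could be checked against a record. Everything named here (`PropertyRule`, `check_record`, the rule fields, and the sample profile) is an assumption for illustration; it is not DC-DSP or DCAM syntax, and it does not attempt "mandatory if applicable", Named Graphs, or property mappings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PropertyRule:
    """Cardinality rule for one property (hypothetical structure)."""
    min_occurs: int = 0                # 1 or more means "mandatory"
    max_occurs: Optional[int] = None   # None means freely repeatable


def check_record(record: Dict[str, List[str]],
                 profile: Dict[str, PropertyRule]) -> List[str]:
    """Return human-readable violations; an empty list means the record passes."""
    problems = []
    for prop, rule in profile.items():
        count = len(record.get(prop, []))
        if count < rule.min_occurs:
            problems.append("%s: expected at least %d, found %d"
                            % (prop, rule.min_occurs, count))
        if rule.max_occurs is not None and count > rule.max_occurs:
            problems.append("%s: expected at most %d, found %d"
                            % (prop, rule.max_occurs, count))
    # Properties the profile does not mention are reported as well.
    for prop in record:
        if prop not in profile:
            problems.append("%s: not allowed by this profile" % prop)
    return problems


DCT = "http://purl.org/dc/terms/"
profile = {
    DCT + "title": PropertyRule(min_occurs=1, max_occurs=1),  # mandatory, not repeatable
    DCT + "subject": PropertyRule(),                          # optional, repeatable
}

good = {DCT + "title": ["An Example Book"], DCT + "subject": ["Metadata", "RDF"]}
bad = {DCT + "subject": ["Metadata"], DCT + "publisher": ["Example Press"]}

print(check_record(good, profile))
print(check_record(bad, profile))
```

The record is kept as a plain mapping from property URI to a list of values so that the same check could run over data parsed from XML, JSON, or RDF.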



--

Jon






------------------------------



Date:    Wed, 15 Feb 2012 06:47:02 -0800

From:    Karen Coyle <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



What *does* seem to be core in this blog post is the use of http URIs

for values. I'd add to that: properties defined with http URIs, so you

know what you are describing. Although you can serialize all of this in

JSON if you wish, it means that you have started with LD concepts, not

the usual JSON application. Underneath it all you still have to have

something that expresses valid triples, n'est pas?



kc






--

Karen Coyle

[log in to unmask] http://kcoyle.net

ph: 1-510-540-7596

m: 1-510-435-8234

skype: kcoylenet



------------------------------



Date:    Wed, 15 Feb 2012 09:38:37 -0500

From:    Richard Urban <[log in to unmask]>

Subject: Re: DCAM - where we stand



Hi everyone,



Sorry I won't be able to join the call today, so I thought I'd send a few comments to the list.



Tom,  I believe "slots" are a new introduction to the DCAM (at least from my reading),  but it does seem to introduce the possibility of confusion with how "slots" are used in frame-based languages (http://en.wikipedia.org/wiki/Frame_language).   Is the concept of slots we are introducing here equivalent to those kinds of slots or are we introducing a DCAM specific concept?  If the latter,  it may help to spell out what the features of these slots are (beyond properties/values).



> I'm not sure whether the mixing of

> references to syntax ("slots", "Value URI") and semantics ("Thing Described")

> in this draft is a bug -- or a feature.



So "slots" are a syntactical concept?  If we are basing DCAM on the RDF model, it seems that "semantics" also incorporates the abstract construction of grammatical features like "statements."   I would therefore expect "slots" to be at that abstract level.



Following that thread, are DCAM Statements still just property/value slots, or are they now more like a triple, with a (for lack of a better term) slot that holds a URI that refers to the Thing Described?   An important part of preserving the intuitive sense of colloquial records is that such URIs are optional.  I think this is equivalent to RDF's concept of blank nodes, but I don't think that connection has been explicitly drawn in DCAM. (Kai?)  If DCAM is standing slightly apart from RDF to accommodate colloquial XML records[1], does our sense of statements/described resource URIs align with this concept?  The Linked Data community is discouraging blank nodes, so we could imagine a DCAM that also requires URIs for all Things.  (I suspect that this is not what we want to do, but putting it out there for discussion.)  (See "In order to provide compatibility with Semantic Web and Linked Data applications, however, DCAM is fully aligned with the Model and Abstract Syntax of RDF".)   It also seems that Linked Data's requirement to include Thing URIs is the kind of constraint that we might be modeling at the level of DCAM and might afford some useful real-world examples of how it's done.  (How is this similar/different to the kinds of constraints we need to express at the DSP level?)



Lastly,  since we are also trying to work backwards from XML,  we frequently use the term "records" in discussing DCAM, e.g.



>  The Dublin Core Abstract Model (DCAM) provides a language for representing

>    the structure of specific Metadata Records -- put more abstractly, to

>    specify a Description Set Profile -- in a form that is independent of

>    particular Concrete Encoding Technologies such as XML Schema, RDF/XML,

>    RelaxNG, relational databases, Schematron, or JSON.





I don't think DCAM is after Description Set Profiles directly,  rather it models our intuitive sense of "records" as an abstract "Description Set." (which then enables us to specify a DSP).   I think it does pretty well at this, but I'm curious if there are objections to "Description Sets" that should go into our gap analysis.  Jon et al. are there things about the kinds of records you work with that don't fit the current DCAM model?



Richard J. Urban, Visiting Professor

School of Library and Information Studies

College of Communication and Information

Florida State University

[log in to unmask]

@musebrarian





[1] Sperberg-McQueen,  C.M. and Miller, E. On mapping from colloquial XML to RDF using XSLT.  Proceedings of Extreme Markup Languages 2004. http://conferences.idealliance.org/extreme/html/2004/Sperberg-McQueen01/EML2004Sperberg-McQueen01.html






------------------------------



Date:    Wed, 15 Feb 2012 09:50:03 -0500

From:    Richard Urban <[log in to unmask]>

Subject: Re: Just some food for thought...



Corey/Karen,



Are there any good summaries of the conversations in Seattle (relevant to this discussion) for those of us who didn't make it to #c4lib?



Thanks,

Richard



On Feb 5, 2012, at 3:25 PM, Karen Coyle wrote:



> On 2/2/12 10:14 AM, Corey A Harper wrote:

>

>>

>> I'm open to other suggestions about where we can reach out to for some

>> additional perspective.

>

> It's not only a matter of "where" it's a matter of "how." We've all been in on the lengthy conversations about terminology (and some of us went through that again at length at a meeting in Seattle last week). You can't expect much when you invite Russian speakers to a discussion taking place only in Latin. The DCAM terminology is a barrier. You can claim that

> 1) that terminology is necessary

> 2) people need to make the effort to learn it

>

> but that approach may not lead to success, as I believe is the case with the current version of DCAM. Reaching out should mean at least meeting people half way and doing all that is possible to bring them along. "Sink or swim" isn't an invitation.

>

> I actually believe that the utility of DCAM must be and can be expressed in terms of things people know and need to accomplish in their own environments. Examples and use cases will be a big help. That may even be a good place to start on this "round 2" effort: looking at what DCAM gives us as practitioners could reveal what else is needed, if anything, from such a model.

>

> kc

>

>>

>> On Thu, Feb 2, 2012 at 8:17 AM, Bruce D'Arcus<[log in to unmask]>  wrote:

>>> On Thu, Feb 2, 2012 at 10:14 AM, Jon Phipps<[log in to unmask]>  wrote:

>>>> This post represents an interesting perspective from the scientific data

>>>> community on some of the challenges to implementing semantic web solutions

>>>> and integrating them into existing system architectures and programming

>>>> models. This certainly looks to me like a place where the DCAP/DCAM

>>>> architecture coupled with some concrete implementation examples could be of

>>>> benefit...

>>>

>>> FWIW, I think the issue is much less about models than it is about the

>>> other stuff.

>>>

>>> Bruce

>>

>>

>>

>

> --

> Karen Coyle

> [log in to unmask] http://kcoyle.net

> ph: 1-510-540-7596

> m: 1-510-435-8234

> skype: kcoylenet



------------------------------



Date:    Wed, 15 Feb 2012 10:55:01 -0500

From:    Thomas Baker <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



On Wed, Feb 15, 2012 at 08:49:32AM -0500, Jon Phipps wrote:

> I've been doing some wandering around in JSON land for the last few days

> and, as part of a continuing observation that RDF is an implementation

> detail rather than a core requirement, I'd like to point to this post from

> James Snell

> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html

> And the JSON Schema spec: http://json-schema.org/



It looks to me like he considers RDF to be a "format" and, as such,

comparable to JSON.  Commenting on [1], he writes:



    Reading on a little further, the document goes on to expand on that third

    point, "In order to enable a wide range of different applications to

    process Web content, it is important to agree on standardized content

    formats. The agreement on HTML as a dominant document format was an

    important factor that made the Web scale. The third Linked Data principle

    therefore advocates use of a single data model for publishing structured

    data on the Web – the Resource Description Framework (RDF), a simple

    graph-based data model that has been designed for use in the context of the

    Web [70]. The RDF data model is explained in more detail later in this

    chapter."



    I can absolutely agree with the first part -- that standardized content

    formats are critical. But the "single data model" bit makes me twitch. We

    don't need a single data model.. what we need are common conventions for

    pulling out the bits of information we need regardless of the specific

    format used.



...i.e., in my reading, he is equating "data model" with a "specific format".

As I proposed yesterday, I think it is important to distinguish between RDF

"the model and abstract syntax" and RDF/XML "the concrete serialization syntax,

or format" -- not to mention other concrete RDF syntaxes such as N-Triples and

Turtle -- in DCAM's general message:



    The Dublin Core Abstract Model (DCAM) provides a language for representing

    the structure of specific Metadata Records -- put more abstractly, to

    specify a Description Set Profile -- in a form that is independent of

    particular Concrete Encoding Technologies such as XML Schema, RDF/XML,

    RelaxNG, relational databases, Schematron, or JSON.



    In order to provide compatibility with Semantic Web and Linked Data

    applications, however, DCAM is fully aligned with the Model and Abstract

    Syntax of RDF.  (Note that the RDF abstract model is the basis for -- thus

    distinct from -- concrete RDF encoding technologies such as RDF/XML,

    N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for

    understanding DCAM on an informal level.



It would help if we could agree on a way to characterize this distinction

(e.g., "Concrete Encoding Technologies" versus "Model and Abstract Syntax").



Unless I'm missing the point of his argument, I do not think James Snell is

proposing JSON Activity Streams as a generic abstract syntax -- something which

would compete with RDF as a "grammatical" basis for interoperability in Linked

Data.  He emphasizes his point that "If you're familiar with Activity Streams

and the linking extensions, then you'll know exactly what to do with this."

That seems consistent with what we want to do with DCAM -- with the added

distinction that if a JSON format is aligned with DCAM, and DCAM is aligned

with RDF, then one would in principle be able to express the contents of a JSON

format using an RDF concrete syntax.  Indeed, James's formulation that "what we

need are common conventions for pulling out the bits of information we need

regardless of the specific format used" could almost be used verbatim in a

description of the DCAM we are discussing.
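
The claim that a JSON format aligned with DCAM could in principle be re-expressed in an RDF concrete syntax can be sketched roughly as follows. The JSON shape (an optional `id` key plus property-URI keys), the `to_ntriples` and `escape_literal` helpers, and the example.org URIs are all hypothetical; only the N-Triples line form (subject, predicate, object, full stop) comes from RDF itself. The sketch is not related to Activity Streams.

```python
import json


def escape_literal(text: str) -> str:
    """Escape the characters that must be escaped inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(record: dict) -> list:
    """Turn one flat JSON record into a list of N-Triples lines.

    Hypothetical convention: "id" holds the URI of the Thing Described (a
    blank node is used when it is absent), every other key is a property
    URI, and a value of the form {"uri": ...} is a Value URI. Any other
    value is treated as a plain Value String.
    """
    subject = "<%s>" % record["id"] if "id" in record else "_:thing"
    lines = []
    for prop, value in record.items():
        if prop == "id":
            continue
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict) and "uri" in v:
                obj = "<%s>" % v["uri"]
            else:
                obj = '"%s"' % escape_literal(str(v))
            lines.append("%s <%s> %s ." % (subject, prop, obj))
    return lines


doc = json.loads("""
{
  "id": "http://example.org/book/1",
  "http://purl.org/dc/terms/title": "An Example Book",
  "http://purl.org/dc/terms/creator": {"uri": "http://example.org/person/7"}
}
""")

for line in to_ntriples(doc):
    print(line)
```

The mapping is only this direct because the JSON keys are already property URIs; a format using short local names would first need a key-to-URI table, which is the kind of convention a DCAM-aligned profile would have to state.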



Jon writes:

> as part of a continuing observation that RDF is an implementation

> detail rather than a core requirement...



I am coming around to the idea that DCAM (or at any rate, "DCAM 2") might be

presented informally without emphasizing RDF, and that some people might find

such a DCAM useful as a very high-level way to conceptualize metadata (i.e.,

Statements, composed of Slots for information and grouped into Descriptions and

Description Sets, following common design patterns, etc...).  I still do not see

the value of specifying a DCAM that is anything less than perfectly aligned

with the RDF Model and Abstract Syntax.  That people may take inspiration from

such an RDF-grounded model, ignoring the RDF basis, is not something we should

worry about.  But RDF, such as it is, is the only common _grammatical_ basis

for data that we currently have, and not to ground DCAM in RDF would make it

useless for the purposes of RDF-based interoperability.



Tom



[1] http://linkeddatabook.com/editions/1.0/





--

Tom Baker <[log in to unmask]>



------------------------------



Date:    Wed, 15 Feb 2012 10:58:53 -0500

From:    Jon Phipps <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



Re: "Underneath it all you still have to have something that expresses

valid triples, n'est pas?"



Actually, my point here is that there are many data serializations, models

and use cases for creating, validating, and distributing metadata and many

of them don't include a notion of triples (e.g., NoSQL), although many of

them do include a notion of domain-specific validity and some form of

distribution. RDF is extremely useful for distributing metadata in an Open

World context, but it's hardly the only data model and hardly the only

method of distributing useful metadata.



We need to provide, or at least try to provide, a specification that makes

it possible for an organization to describe how they expect the 'things'

they know about to be described: which properties are valid or not, what

constitutes valid data, and what each property means. In the old days,

this model used to be called a 'data dictionary' and it's an incredibly

useful concept in a world of distributed heterogeneous data. Providing a

way for someone to create a single 'data dictionary' that can be used

(preferably by a machine) to create validations for domain-specific data

and that can be used by anyone (preferably a machine) in the organization,

or alternatively in the world, to understand the meaning of that data

across departmental, organizational, or national boundaries would be

incredibly and fundamentally useful.
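
To illustrate the "data dictionary" idea, here is a minimal sketch in which one structure serves both purposes described above: a machine checks values against it, and a reader looks up what a property means. The dictionary layout, the helper names `explain` and `invalid_fields`, and the regular expressions are assumptions made for this example, and the "meaning" texts are paraphrases rather than official definitions.

```python
import re

# Hypothetical data dictionary: one entry per property, carrying both a
# human-readable meaning and a machine-checkable pattern for valid values.
DATA_DICTIONARY = {
    "http://purl.org/dc/terms/title": {
        "meaning": "A name given to the resource.",
        "pattern": r".+",
    },
    "http://purl.org/dc/terms/issued": {
        "meaning": "Date on which the resource was formally issued.",
        "pattern": r"\d{4}(-\d{2}(-\d{2})?)?",  # YYYY, YYYY-MM, or YYYY-MM-DD
    },
}


def explain(prop):
    """Look up what a property means, for readers outside the organization."""
    entry = DATA_DICTIONARY.get(prop)
    return entry["meaning"] if entry else "unknown property"


def invalid_fields(record):
    """Return the properties whose values the dictionary does not accept."""
    rejected = []
    for prop, value in record.items():
        entry = DATA_DICTIONARY.get(prop)
        # Unknown properties and values that fail the pattern are both rejected.
        if entry is None or re.fullmatch(entry["pattern"], value) is None:
            rejected.append(prop)
    return rejected


record = {
    "http://purl.org/dc/terms/title": "An Example Book",
    "http://purl.org/dc/terms/issued": "Feb 2012",
}
print(invalid_fields(record))
print(explain("http://purl.org/dc/terms/issued"))
```

Nothing here requires triples, which is the point being argued; keying the dictionary by property URI simply keeps the door open to expressing the same entries in RDF later.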



If we say that RDF is the ONLY useful way to do this, then we might as well

go back to "DCAM is just RDF".



Jon








On Wed, Feb 15, 2012 at 9:47 AM, Karen Coyle <[log in to unmask]> wrote:



> What *does* seem to be core in this blog post is the use of http URIs for

> values. I'd add to that: properties defined with http URIs, so you know

> what you are describing. Although you can serialize all of this in JSON if

> you wish, it means that you have started with LD concepts, not the usual

> JSON application. Underneath it all you still have to have something that

> expresses valid triples, n'est pas?

>

> kc

>

>

> On 2/15/12 5:49 AM, Jon Phipps wrote:

>

>> I've been doing some wandering around in JSON land for the last few days

>> and, as part of a continuing observation that RDF is an implementation

>> detail rather than a core requirement, I'd like to point to this post from

>> James Snell

>> http://chmod777self.blogspot.**com/2012/02/mostly-linked-**data.html<http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html>

>> And the JSON Scema spec: http://json-schema.org/

>>

>> Jon,

>> who may someday get his act together and pay attention to these meetings

>> more than a couple of hours before the meeting.

>>

>> On Tuesday, February 14, 2012, Thomas Baker<[log in to unmask]>  wrote:

>>

>>> On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:

>>>

>>>> --  that DCAM should be developed using a test-driven approach, with

>>>>     effective examples and test cases that can be expressed in various

>>>>     concrete syntaxes.

>>>>

>>>

>>> Jon suggested that we take Gordon's requirements for metadata record

>>> constructs [1] as a starting point.  As I understand them, these are:

>>>

>>> --  the ability to encode multicomponent things (which in the cataloging

>>>    world happen to be called "statements", as in "publication statement"

>>>    and "classification statement") either:

>>>

>>>    -- as unstructured strings, or

>>>    -- as strings structured according to a named Syntax Encoding Scheme, or

>>>    -- as Named Graphs with individual component triples

>>>

>>> --  the ability to express the repeatability of components in such

>>>    "statements"

>>>

>>> --  the ability to designate properties as "mandatory", or "mandatory if

>>>    applicable", and the like

>>>

>>> --  the ability to constrain the cardinality of "subsets of properties"

>>>    within a particular context, such as the FRBR model

>>>

>>> --  the ability to express mappings between properties in different

>>>    namespaces.

>>>

>>> It has also been suggested that we find examples of real metadata instance

>>> records from different communities and contexts -- e.g., libraries,

>>> government, industry, and biomed -- for both testing and illustrating DCAM

>>> constructs.

>>>

>>> Tom

>>>

>>> [1] https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1202&L=dc-architecture&P=6405

>>

>>>

>>> --

>>> Tom Baker<[log in to unmask]>

>>>

>>>

>>

> --

> Karen Coyle

> [log in to unmask] http://kcoyle.net

> ph: 1-510-540-7596

> m: 1-510-435-8234

> skype: kcoylenet

>



------------------------------



Date:    Wed, 15 Feb 2012 11:07:49 -0500

From:    Jon Phipps <[log in to unmask]>

Subject: Re: Just some food for thought...



On Sun, Feb 5, 2012 at 3:25 PM, Karen Coyle <[log in to unmask]> wrote:



> I actually believe that the utility of DCAM must be and can be expressed

> in terms of things people know and need to accomplish in their own

> environments. Examples and use cases will be a big help. That may even be

> a good place to start on this "round 2" effort: looking at what DCAM gives

> us as practitioners could reveal what else is needed, if anything, from

> such a model.





+1  :-)



------------------------------



Date:    Wed, 15 Feb 2012 08:26:12 -0800

From:    Karen Coyle <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



Hmm. In terms of analogies, I would equate DCAP with a data dictionary,

not DCAM. To me a data dictionary is the actual metadata elements you

will use, not an abstract definition of the possible structures. DCAM

seems to be closer to the idea of "design patterns."



I don't see how something can be linked data if it doesn't have certain

characteristics:

- http uris

- subjects, predicates, objects (whether serialized as triples or not,

and RDF/XML and turtle are examples of not)

- subjects and predicates constrained as URIs; objects constrained

differently (which DCAM would address)



It's possible that the JSON examples in that blog post met these

criteria (I didn't perceive URIs for the predicates, but maybe I don't

read JSON well).
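
As a rough sketch, the characteristics listed above can be checked mechanically for a single statement once it is read as a (subject, predicate, object) tuple. The helper names are hypothetical, the example.org URIs are placeholders, and accepting https alongside http is an assumption made here rather than something stated in the list.

```python
from urllib.parse import urlparse


def is_http_uri(value):
    """True if value is a string that parses as an http(s) URI with a host."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    # Assumption: https counts as an "http URI" for this check.
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def meets_criteria(statement):
    """Check one (subject, predicate, object) tuple against the list above.

    Subject and predicate must be http URIs; the object is left
    unconstrained here, since constraining it is the part DCAM would address.
    """
    if len(statement) != 3:
        return False
    subject, predicate, _obj = statement
    return is_http_uri(subject) and is_http_uri(predicate)


# Predicate is a real DCMI Terms URI; the subject is a placeholder.
WITH_URIS = ("http://example.org/book/1",
             "http://purl.org/dc/terms/title",
             "Moby Dick")
# A typical plain-JSON key used as the predicate: not a URI.
PLAIN_KEY = ("http://example.org/book/1", "title", "Moby Dick")
```

Here `meets_criteria(WITH_URIS)` holds while `meets_criteria(PLAIN_KEY)` does not, which is the kind of gap a bare JSON key such as "title" would leave.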



kc



On 2/15/12 7:58 AM, Jon Phipps wrote:

> Re: "Underneath it all you still have to have something that expresses

> valid triples, n'est-ce pas?"

>

> Actually, my point here is that there are many data serializations, models

> and use cases for creating, validating, and distributing metadata and many

> of them don't include a notion of triples (e.g. NoSQL), although many of

> them do include a notion of domain-specific validity and some form of

> distribution. RDF is extremely useful for distributing metadata in an Open

> World context, but it's hardly the only data model and hardly the only

> method of distributing useful metadata.

>

> We need to provide, or at least try to provide, a specification that makes

> it possible for an organization to describe how they expect the 'things'

> they know about to be described: which properties are valid or not, what

> constitutes valid data, and what each property means. In the old days,

> this model used to be called a 'data dictionary' and it's an incredibly

> useful concept in a world of distributed heterogeneous data. Providing a

> way for someone to create a single 'data dictionary' that can be used

> (preferably by a machine) to create validations for domain-specific data

> and that can be used by anyone (preferably a machine) in the organization,

> or alternatively in the world, to understand the meaning of that data

> across departmental, organizational, or national boundaries would be

> incredibly and fundamentally useful.

>

> If we say that RDF is the ONLY useful way to do this, then we might as well

> go back to "DCAM is just RDF".

>

> Jon

>

> I check email just a couple of times daily; to reach me sooner, click here:

> http://awayfind.com/jonphipps

>



--

Karen Coyle

[log in to unmask] http://kcoyle.net

ph: 1-510-540-7596

m: 1-510-435-8234

skype: kcoylenet



------------------------------



Date:    Wed, 15 Feb 2012 18:30:09 +0100

From:    Kai Eckert <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



Hi all,



As a strong promoter of the RDF basis for DCAM, I would like to

emphasize, too, that RDF is only the formal model and should not be seen

as a concrete syntax. JSON is a syntax; it has no semantics. I like it

very much, and I like simple, pragmatic implementations, but that's not

what we need in our current context.
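
To illustrate the distinction between model and syntax, here is a small sketch in which one set of triples is written out in two concrete syntaxes. The JSON layout (keys "s", "p", "o", "kind") is a hypothetical convention invented for this example, the N-Triples writer is simplified (no blank nodes, datatypes, or language tags), and the example.org URIs are placeholders.

```python
import json

# The abstract model: a set of (subject, predicate, object, object_kind).
TRIPLES = {
    ("http://example.org/book/1",
     "http://purl.org/dc/terms/title",
     "Moby Dick", "literal"),
    ("http://example.org/book/1",
     "http://purl.org/dc/terms/creator",
     "http://example.org/person/melville", "uri"),
}


def to_ntriples(triples):
    """Serialize as simplified N-Triples, one statement per line."""
    lines = []
    for s, p, o, kind in sorted(triples):
        if kind == "uri":
            obj = f"<{o}>"
        else:
            # Escape the characters that would break a quoted literal.
            escaped = (o.replace("\\", "\\\\").replace('"', '\\"')
                        .replace("\n", "\\n").replace("\r", "\\r"))
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def to_json(triples):
    """Serialize the same statements using a hypothetical JSON layout."""
    rows = [{"s": s, "p": p, "o": o, "kind": kind}
            for s, p, o, kind in sorted(triples)]
    return json.dumps(rows)


def from_json(text):
    """Recover the abstract set of triples from the JSON layout."""
    return {(r["s"], r["p"], r["o"], r["kind"]) for r in json.loads(text)}
```

The JSON text only carries the model because the reader already knows the convention behind its keys; the triples themselves are the same in either output.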



In the W3C Provenance WG, we recently found that it is much

easier to discuss a model that is defined in a formal language than

one described in plain English, which had led to endless discussions. We

now focus on the formal PROV ontology, written in OWL, to reach a

consensus about the model. Additionally, we of course create documents

in plain English (at least) that hopefully explain and demonstrate what

can be done with the model. But these drafts cannot be used to define

the model in the first place.



I think the only formal language that we all speak is {RDF, RDFS, OWL},

that's why I want to focus on defining everything that we are

talking about in DCAM in this language. In that respect, it is more of a

side effect that this would end up actually being RDF. If we face

limitations in this formal language that we cannot accept, then of

course we should not restrict ourselves to RDF. But only then.



Cheers,



Kai





On 15.02.2012 16:55, Thomas Baker wrote:

> On Wed, Feb 15, 2012 at 08:49:32AM -0500, Jon Phipps wrote:

>> I've been doing some wandering around in JSON land for the last few days

>> and, as part of a continuing observation that RDF is an implementation

>> detail rather than a core requirement, I'd like to point to this post from

>> James Snell

>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html

>> And the JSON Schema spec: http://json-schema.org/

>

> It looks to me like he considers RDF to be a "format" and, as such,

> comparable to JSON.  Commenting on [1], he writes:

>

>      Reading on a little further, the document goes on to expand on that third

>      point, "In order to enable a wide range of different applications to

>      process Web content, it is important to agree on standardized content

>      formats. The agreement on HTML as a dominant document format was an

>      important factor that made the Web scale. The third Linked Data principle

>      therefore advocates use of a single data model for publishing structured

>      data on the Web – the Resource Description Framework (RDF), a simple

>      graph-based data model that has been designed for use in the context of the

>      Web [70]. The RDF data model is explained in more detail later in this

>      chapter."

>

>      I can absolutely agree with the first part -- that standardized content

>      formats are critical. But the "single data model" bit makes me twitch. We

>      don't need a single data model.. what we need are common conventions for

>      pulling out the bits of information we need regardless of the specific

>      format used.

>

> ...i.e., in my reading, he is equating "data model" with a "specific format".

> As I proposed yesterday, I think it is important to distinguish between RDF

> "the model and abstract syntax" and RDF/XML "the concrete serialization syntax,

> or format" -- not to mention other concrete RDF syntaxes such as N-Triples and

> Turtle -- in DCAM's general message:

>

>      The Dublin Core Abstract Model (DCAM) provides a language for representing

>      the structure of specific Metadata Records -- put more abstractly, to

>      specify a Description Set Profile -- in a form that is independent of

>      particular Concrete Encoding Technologies such as XML Schema, RDF/XML,

>      RelaxNG, relational databases, Schematron, or JSON.

>

>      In order to provide compatibility with Semantic Web and Linked Data

>      applications, however, DCAM is fully aligned with the Model and Abstract

>      Syntax of RDF.  (Note that the RDF abstract model is the basis for -- thus

>      distinct from -- concrete RDF encoding technologies such as RDF/XML,

>      N-Triples, and Turtle.) Knowledge of RDF is not a prerequisite for

>      understanding DCAM on an informal level.

>

> It would help if we could agree on a way to characterize this distinction

> (e.g., "Concrete Encoding Technologies" versus "Model and Abstract Syntax").

>

> Unless I'm missing the point of his argument, I do not think James Snell is

> proposing JSON Activity Streams as a generic abstract syntax -- something which

> would compete with RDF as a "grammatical" basis for interoperability in Linked

> Data.  He emphasizes his point that "If you're familiar with Activity Streams

> and the linking extensions, then you'll know exactly what to do with this."

> That seems consistent with what we want to do with DCAM -- with the added

> distinction that if a JSON format is aligned with DCAM, and DCAM is aligned

> with RDF, then one would in principle be able to express the contents of a JSON

> format using an RDF concrete syntax.  Indeed, James's formulation that "what we

> need are common conventions for pulling out the bits of information we need

> regardless of the specific format used" could almost be used verbatim in a

> description of the DCAM we are discussing.

>

> Jon writes:

>> as part of a continuing observation that RDF is an implementation

>> detail rather than a core requirement...

>

> I am coming around to the idea that DCAM (or at any rate, "DCAM 2") might be

> presented informally without emphasizing RDF, and that some people might find

> such a DCAM useful as a very high-level way to conceptualize metadata (i.e.,

> Statements, composed of Slots for information and grouped into Descriptions and

> Description Sets, following common design patterns, etc...) I still do not see

> the value of specifying a DCAM that is anything less than perfectly aligned

> with the RDF Model and Abstract Syntax.  That people may take inspiration from

> such an RDF-grounded model, ignoring the RDF basis, is not something we should

> worry about.  But RDF, such as it is, is the only common _grammatical_ basis

> for data that we currently have, and not to ground DCAM in RDF would make it

> useless for the purposes of RDF-based interoperability.

>

> Tom

>

> [1] http://linkeddatabook.com/editions/1.0/

>

>



--

Kai Eckert

Universitätsbibliothek Mannheim

Stellv. Leiter Abteilung Digitale Bibliotheksdienste

Schloss Schneckhof West / 68131 Mannheim

Tel. 0621/181-2946 Fax 0621/181-2918



------------------------------



Date:    Wed, 15 Feb 2012 15:30:35 -0500

From:    Jon Phipps <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



Are we only talking about Linked Data or are we talking about information

modeling? DCAP is a documentation model for describing an information

ecosystem and DCAM is its formal abstract 'domain' model, or should be.

Whether or not that model results in Linked Data is beside the point, isn't

it?



Jon,

who just found this and had to paste it here:



Endless invention, endless experiment,

Brings knowledge of motion, but not of stillness;

Knowledge of speech, but not of silence;

Knowledge of words, and ignorance of the Word...



Where is the Life we have lost in living?

Where is the wisdom we have lost in knowledge?

Where is the knowledge we have lost in information?

 -- T. S. Eliot, Choruses from 'The Rock'





On Wed, Feb 15, 2012 at 11:26 AM, Karen Coyle <[log in to unmask]> wrote:



> Hmm. In terms of analogies, I would equate DCAP with a data dictionary,

> not DCAM. To me a data dictionary is the actual metadata elements you will

> use, not an abstract definition of the possible structures. DCAM seems to

> be closer to the idea of "design patterns."

>

> I don't see how something can be linked data if it doesn't have certain

> characteristics:

> - http uris

> - subjects, predicates, objects (whether serialized as triples or not, and

> RDF/XML and turtle are examples of not)

> - subjects and predicates constrained as URIs; objects constrained

> differently (which DCAM would address)

>

> It's possible that the JSON examples in that blog post met these criteria

> (I didn't perceive URIs for the predicates, but maybe I don't read JSON

> well).

>

> kc

>

> --

> Karen Coyle

> [log in to unmask] http://kcoyle.net

> ph: 1-510-540-7596

> m: 1-510-435-8234

> skype: kcoylenet

>



------------------------------



Date:    Wed, 15 Feb 2012 14:03:20 -0800

From:    Karen Coyle <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



On 2/15/12 12:30 PM, Jon Phipps wrote:

> Are we only talking about Linked Data or are we talking about information

> modeling?



I doubt if it makes sense to take on the entirety of information

modeling within the DCAM. My impression was that the goal was

information modeling for the Semantic Web/linked data environment. If

it's broader than that, then we move up from abstract to something so

far out it may never be finished. I'd say that we should stick with an

abstraction that is an abstraction of something useful, not a pure

abstraction.



> DCAP is a documentation model for describing an information

> ecosystem and DCAM is its formal abstract 'domain' model, or should be.



DCAP to me is narrower than that. It describes a coherent set of

statements for a particular metadata activity. It verges on being a

record format, although it is a "record format" in a data environment

that is more flexible than, say, a relational database with a set data

format. I'd equate the Singapore Framework's domain model with an

"information ecosystem." That to me is the general model before you

start adding constraints, and perhaps even before you define your set of

properties.



And, as I said before, DCAM to me defines the design patterns that are

available to you. It is plausible to me that DCAM's patterns have some

universality, but I wouldn't want to embark on a task of making sure

that DCAM covers every single metadata possibility, known today or to be

discovered in the future. That would probably prevent DCAM from having

such specifics as "property URIs" or "literal values." It's going to be

hard enough to come up with a model that functions well within the

semantic web universe.



kc



> Whether or not that model results in Linked Data is beside the point, isn't

> it?







>

> Jon,

> who just found this and had to paste it here:

>

> Endless invention, endless experiment,

> Brings knowledge of motion, but not of stillness;

> Knowledge of speech, but not of silence;

> Knowledge of words, and ignorance of the Word...

>

> Where is the Life we have lost in living?

> Where is the wisdom we have lost in knowledge?

> Where is the knowledge we have lost in information?

>   -- T. S. Eliot, Choruses from 'The Rock'

>



--

Karen Coyle

[log in to unmask] http://kcoyle.net

ph: 1-510-540-7596

m: 1-510-435-8234

skype: kcoylenet



------------------------------



Date:    Wed, 15 Feb 2012 14:06:13 -0800

From:    Karen Coyle <[log in to unmask]>

Subject: Re: DCAM - collecting requirements and examples



Oh, I meant to follow Jon's quoted poem with some words from Frank Zappa:



Information is not knowledge.

Knowledge is not wisdom.

Wisdom is not truth.

Truth is not beauty.

Beauty is not love.

Love is not music.

Music is the best.



On 2/15/12 12:30 PM, Jon Phipps wrote:

> Are we only talking about Linked Data or are we talking about information

> modeling? DCAP is a documentation model for describing an information

> ecosystem and DCAM is its formal abstract 'domain' model, or should be.

> Whether or not that model results in Linked Data is beside the point, isn't

> it?

>

> Jon,

> who just found this and had to paste it here:

>

> Endless invention, endless experiment,

> Brings knowledge of motion, but not of stillness;

> Knowledge of speech, but not of silence;

> Knowledge of words, and ignorance of the Word...

>

> Where is the Life we have lost in living?

> Where is the wisdom we have lost in knowledge?

> Where is the knowledge we have lost in information?

>   -- T. S. Eliot, Choruses from 'The Rock'

>

>

> On Wed, Feb 15, 2012 at 11:26 AM, Karen Coyle<[log in to unmask]>  wrote:

>

>> Hmm. In terms of analogies, I would equate DCAP with a data dictionary,

>> not DCAM. To me a data dictionary is the actual metadata elements you will

>> use, not an abstract definition of the possible structures. DCAM seems to

>> be closer to the idea of "design patterns."

>>

>> I don't see how something can be linked data if it doesn't have certain

>> characteristics:

>> - http uris

>> - subjects, predicates, objects (whether serialized as triples or not, and

>> RDF/XML and turtle are examples of not)

>> - subjects and predicates constrained as URIs; objects constrained

>> differently (which DCAM would address)

>>

>> It's possible that the JSON examples in that blog post met these criteria

>> (I didn't perceive URIs for the predicates, but maybe I don't read JSON

>> well).

>>

>> kc

>>

>>

>> On 2/15/12 7:58 AM, Jon Phipps wrote:

>>

>>> Re: "Underneath it all you still have to have something that expresses

>>> valid triples, n'est pas?"

>>>

>>> Actually, my point here is that there are many data serializations, models

>>> and use cases for creating, validating, and distributing metadata and many

>>> of them don't include a notion of triples, (e.g. nosql) although many of

>>> them do include a notion of domain-specific validity and some form of

>>> distribution. RDF is extremely useful for distributing metadata in an Open

>>> World context, but it's hardly the only data model and hardly the only

>>> method of distributing useful metadata.

>>>

>>> We need to provide, or at least try to provide, a specification that makes

>>> it possible for an organization to describe how they expect the 'things'

>>> they know about to be described: which properties are valid or not, what

>>> constitutes valid data, and what does each property mean. In the old days,

>>> this model used to be called a 'data dictionary' and it's an incredibly

>>> useful concept in a world of distributed heterogeneous data. Providing a

>>> way for someone to create a single 'data dictionary' that can be used

>>> (preferably by a machine) to create validations for domain-specific data

>>> and that can be used by anyone (preferably a machine) in the organization,

>>> or alternatively in the world, to understand the meaning of that data

>>> across departmental, organizational, or national boundaries would be

>>> incredibly and fundamentally useful.

>>>

>>> If we say that RDF is the ONLY useful way to do this, then we might as

>>> well

>>> go back to "DCAM is just RDF".

>>>

>>> Jon

>>>

>>> I check email just a couple of times daily; to reach me sooner, click

>>> here:

>>> http://awayfind.com/jonphipps

>>>

>>>

>>> On Wed, Feb 15, 2012 at 9:47 AM, Karen Coyle<[log in to unmask]>   wrote:

>>>

>>>   What *does* seem to be core in this blog post is the use of http URIs for

>>>> values. I'd add to that: properties defined with http URIs, so you know

>>>> what you are describing. Although you can serialize all of this in JSON

>>>> if

>>>> you wish, it means that you have started with LD concepts, not the usual

>>>> JSON application. Underneath it all you still have to have something that

>>>> expresses valid triples, n'est pas?

>>>>

>>>> kc

>>>>

>>>>

>>>> On 2/15/12 5:49 AM, Jon Phipps wrote:

>>>>

>>>>   I've been doing some wandering around in JSON land for the last few days

>>>>> and, as part of a continuing observation that RDF is an implementation

>>>>> detail rather than a core requirement, I'd like to point to this post

>>>>> from

>>>>> James Snell

>>>>> http://chmod777self.blogspot.com/2012/02/mostly-linked-data.html

>>>>>>

>>>>>

>>>>> And the JSON Schema spec: http://json-schema.org/

>>>>>

>>>>> Jon,

>>>>> who may someday get his act together and pay attention to these meetings

>>>>> more than a couple of hours before the meeting.

>>>>>

>>>>> On Tuesday, February 14, 2012, Thomas Baker<[log in to unmask]>    wrote:

>>>>>

>>>>>   On Tue, Feb 14, 2012 at 05:25:17PM -0500, Tom Baker wrote:

>>>>>>

>>>>>>   --  that DCAM should be developed using a test-driven approach, with

>>>>>>>      effective examples and test cases that can be expressed in various

>>>>>>>      concrete syntaxes.

>>>>>>>

>>>>>>>

>>>>>> Jon suggested that we take Gordon's requirements for metadata record

>>>>>> constructs [1] as a starting point.  As I understand them, these are:

>>>>>>

>>>>>> --  the ability to encode multicomponent things (which in the

>>>>>>     cataloging world happen to be called "statements", as in

>>>>>>     "publication statement" and "classification statement") either:

>>>>>>

>>>>>>     -- as unstructured strings, or

>>>>>>     -- as strings structured according to a named Syntax Encoding

>>>>>>        Scheme, or

>>>>>>     -- as Named Graphs with individual component triples

>>>>>>

>>>>>> --  the ability to express the repeatability of components in such

>>>>>>     "statements"

>>>>>>

>>>>>> --  the ability to designate properties as "mandatory", or "mandatory

>>>>>>     if applicable", and the like

>>>>>>

>>>>>> --  the ability to constrain the cardinality of "subsets of properties"

>>>>>>     within a particular context, such as the FRBR model

>>>>>>

>>>>>> --  the ability to express mappings between properties in different

>>>>>>     namespaces.

>>>>>>

>>>>>> It has also been suggested that we find examples of real metadata

>>>>>> instance records from different communities and contexts -- e.g.,

>>>>>> libraries, government, industry, and biomed -- for both testing and

>>>>>> illustrating DCAM constructs.

>>>>>>

>>>>>> Tom

>>>>>>

>>>>>> [1]

>>>>>>

>>>>>>   https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1202&L=dc-architecture&P=6405

>>>>>>

>>>>>

>>>>>

>>>>>> --

>>>>>> Tom Baker<[log in to unmask]>

>>>>>>

>>>>>>

>>>>>>

>>>>>   --

>>>> Karen Coyle

>>>> [log in to unmask] http://kcoyle.net

>>>> ph: 1-510-540-7596

>>>> m: 1-510-435-8234

>>>> skype: kcoylenet

>>>>

>>>>

>>>

>> --

>> Karen Coyle

>> [log in to unmask] http://kcoyle.net

>> ph: 1-510-540-7596

>> m: 1-510-435-8234

>> skype: kcoylenet

>>

>



--

Karen Coyle

[log in to unmask] http://kcoyle.net

ph: 1-510-540-7596

m: 1-510-435-8234

skype: kcoylenet



------------------------------



End of DC-ARCHITECTURE Digest - 14 Feb 2012 to 15 Feb 2012 (#2012-23)

*********************************************************************

