Hi Corey,
No doubt these tensions exist, and in fact it is exactly these tensions which gave rise to the DCAM in the first place (RDF vs. "pragmatic" OAI-DC XML syntaxes). Since these are fundamentally different paradigms, I don't foresee an end to the debate.
The problem, as I see it, is that these tensions have made the DCAM documentation very confusing to the very people we want to help make good, pragmatic decisions about how to implement Dublin Core. Out of one side of our mouth, we say things like "an SES is an rdf:datatype" (and proceed to give a paraphrase of the accepted definition of a datatype), while out of the other, we say "well no, we want to count things like ISBD areas as SESs, even if they aren't *really* datatypes."
If the latter is what we want, for the sake of practicality, then what we say in the DCAM documentation needs to be relaxed. But if we also want to set the bar higher - in order to move things forward - the previous RDF concepts need to be represented as well. Doing both seems to require representing each concept explicitly rather than conflating them into a single concept.
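To make "representing each of the concepts explicitly" a bit more concrete, here is a minimal Python sketch of keeping the two ideas apart. Everything in it is an assumption for illustration only: the class names, the example.org URIs, the four-digit year datatype, and the toy publication-area parser are hypothetical and are not taken from the DCAM, RDF, or ISBD documentation.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Datatype:
    """Strict reading: a lexical space plus a lexical-to-value mapping."""
    uri: str
    in_lexical_space: Callable[[str], bool]
    lexical_to_value: Callable[[str], Any]

    def value_of(self, lexical: str) -> Any:
        # Only strings inside the lexical space have a value at all.
        if not self.in_lexical_space(lexical):
            raise ValueError(f"{lexical!r} is not in the lexical space of {self.uri}")
        return self.lexical_to_value(lexical)


@dataclass(frozen=True)
class StringEncodingRules:
    """Relaxed reading: a named class of strings with optional decoding rules."""
    uri: str
    decode: Optional[Callable[[str], Any]] = None

    def is_actionable(self) -> bool:
        # A scheme that is merely declared, with no mapping rule, is not actionable.
        return self.decode is not None


# Hypothetical strict datatype: four-digit years mapped to integers.
YEAR = Datatype(
    uri="http://example.org/datatype/year",
    in_lexical_space=lambda s: re.fullmatch(r"[0-9]{4}", s) is not None,
    lexical_to_value=int,
)


def decode_publication_area(text: str) -> dict:
    """Toy parser assuming the simple shape 'Place : Publisher, Date'."""
    place, _, rest = text.partition(" : ")
    publisher, _, date = rest.rpartition(", ")
    return {"place": place, "publisher": publisher, "date": date}


# Hypothetical relaxed schemes: one with decoding rules, one only declared.
PUBLICATION_AREA = StringEncodingRules(
    uri="http://example.org/ses/publication-area",
    decode=decode_publication_area,
)
DECLARED_ONLY = StringEncodingRules(uri="http://example.org/ses/declared-only")
```

Under these assumptions, YEAR.value_of("2012") yields an integer and rejects anything outside its lexical space, while PUBLICATION_AREA only promises that some decoding rule exists. DECLARED_ONLY corresponds to a scheme that has been declared without a mapping rule.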
Richard
On Jul 13, 2012, at 4:52 PM, Corey A Harper wrote:
> Dear All,
>
> In preparation for Monday, I've been revisiting the transcription of
> our last call, this thread, and Richard's anti-pattern thread. I want
> to try to summarize where I see some tension here, as well as what our
> next steps might be. First, thank you all for these conversations, and
> especially to Richard & Aaron for getting them started.
>
> I think Aaron is correct in his initial message that Karen's &
> Antoine's comments are at the heart of this issue. I would add to that
> Jon's point that an SES is a class of literals & associated rules that
> describe a mapping between strings and resources. It's those rules
> that are important.
>
> I think the tension is between a group that is really focused on hard
> & fast definitions & principles of the Semantic Web and RDF running up
> against a group of us that are trying to find a slow, pragmatic middle
> path from existing applications toward those idealized cases of well
> structured, fully-URI'd SemWeb data. Both groups are correct, though.
> I completely agree with Antoine, Aaron & others that our current
> approaches are not best practices for good Linked Open Data, but I
> also know that I *want* to move my applications and data in that
> direction iteratively, even if it's a slow process, and I think DCAM /
> DCAP / DCDSP can help with that.
>
> I think that this strings vs. things thread is full of great examples
> of that. Sure, we could just say, "My application profile has one
> property: charper:hasEAD, which has a range of 'EAD file' and a
> corresponding set of decoding instructions of 'Link to ead.xsd.'" Or a
> MARC AP could point to a rubymarc, pyMarc & MARC.pm and say "I'm
> done".
>
> Those would definitely be worst practices, but I think that DCAM's
> notion of an SES should technically allow it. This creates a middle
> ground for doing things like the ISBD aggregated statements (ideally
> with a working reference implementation of an ISBD publication area
> parser that would know how to make sense out of that string and
> produce something useful to an application), which we still have to
> make clear are *not* best practice. They're but a tiny, tiny, tiny
> step.
>
> Personally, I really want a way to have support and standardization
> for practice that sits *somewhere* between the idealized approach of
> RDF and the extreme approach of my straw man argument examples above.
> Because I want DCAM and its associated child-specs to be a system of
> guidelines for folks who manage *horrible*, semantically opaque and
> poorly modeled data to make it *incrementally* better
> step-by-painful-step.
>
> So, when Karen asked Dan:
>>>>> On 6/24/12 9:32 AM, Dan Matei wrote:
>>>>> [URI(Beatles)] [URI(hasAppelation)]
>>>>> [<name><nosort>The</nosort>Beatles</name>]
>>> Sorry, my question was pretty vague. I mean:
>>> how is a consuming program to know if this is a literal or an SES?
>
> The answer is, because the AP makes that explicit, and that AP's URI
> comes with explicit, machine readable instructions & reference code
> for decoding and mapping that SES to *something*. That something could
> be specific to an implementation. It could be idiomatic JSON (with or
> without a corresponding class of object in some programming language),
> or RDF-XML, or a SOLR schema, or...
>
> Sorry for the length of this email. One more coming in reply to the
> Anti-patterns, then will send an agenda for Monday.
>
> -Corey
>
> On Thu, Jul 12, 2012 at 11:21 AM, Richard Urban wrote:
>> Hi all,
>>
>> Looping back around to this…
>>
>>
>> On Jun 26, 2012, at 9:04 AM, Karen Coyle wrote:
>>
>>
>> We'll need to wait for Jon and Gordon to weigh in, but I know that Jon has
>> been at a conference and may be in the midst of lengthy travels. However,
>> they have indeed created a number of SES's that are not "formal" datatypes
>> in the sense you mean, both in RDA in RDF and ISBD in RDF. It's easier to
>> see in the latter because each ISBD area is treated as an SES:
>>
>> http://metadataregistry.org/schemaprop/show/id/2135.html
>>
>> You can see how these appear in the Description Set Profile for ISBD:
>>
>> http://wiki.dublincore.org/index.php/DCAM_Revision_ISBD_DSP
>>
>> While these have been declared as SES's, I don't believe that they are
>> actionable at the moment, in the sense that I don't know of a "mapping rule"
>> for the declared SES's. Nor is it clear to me the relationship of the
>> declared SES and the ISBD RDF properties for that area. So let's hope Jon's
>> travels go well and he arrives refreshed ;-).
>>
>>
>> Here may be where I am confused. According to the current DCAM
>> documentation, a "syntax encoding scheme" **IS** an RDF Datatype
>> (http://www.w3.org/2000/01/rdf-schema#Datatype). And simply declaring that
>> ISBD *IS* an SES (or particular areas are) doesn't necessarily make it so.
>> RDF defers to the XML Schema specifications for defining a preset list of
>> datatypes, but also provides the criteria for extending it to include new
>> datatypes
>> (http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#datatype-components).
>>
>> Again, the one RDF datatype that seems similar is the XMLLiteral, which
>> provides these definitions for XML here:
>> http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
>>
>> Can we provide a similar definition for ISBD, assuming the work that the
>> ISBD AP group does produces a model of ISBD that would be equivalent to the
>> XML model? (This needs some more investigation, IMHO.)
>>
>> The lexical space
>>   is the set of all strings:
>>   - which are self-contained ISBD content (??)
>>     (ISBD does not indicate a character encoding space; I presume this is
>>     deferred to the MARC format for electronic records)
>>   - for which the embedding between an arbitrary ISBD tag (??) yields a
>>     document conforming to the ISBD namespace (???)
>>
>> The value space
>>   is a set of entities, called ISBD values, which is:
>>   - disjoint from the lexical space;
>>   - disjoint from the value space of any XML schema datatype (Does ISBD have
>>     its own sub-data types, e.g. dates, enumerated lists of abbreviations,
>>     etc.??)
>>   - and in 1:1 correspondence with the lexical space
>>
>> The lexical-to-value mapping
>>   is a one-to-one mapping from the lexical space onto the value space…
>>
>> This is just a quick paraphrasing of the XML Literal documentation, I have
>> not thoroughly tested whether this is actually a valid definition of an ISBD
>> Literal datatype.
>>
>> If this is *not* what we want to do to accommodate ISBD here, it may be
>> necessary to relax the DCAM's specification of what an SES is (i.e. remove
>> the requirement that an SES isA RDF Datatype).
>>
>> Richard
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Corey A Harper
> Metadata Services Librarian
> New York University Libraries
> 20 Cooper Square, 3rd Floor
> New York, NY 10003-7112
> 212.998.2479
>