In terms of best practices/design patterns, it sounds like a tiered
approach might make sense. From "best" to "worst":
1. Re-use an existing URI from a data-rich, well networked, and
trustworthy source
2. Mint your own URI
3. Create a clear, human-readable label
4. Shove whatever you have in a string
I think #4 and even #3 are where the use of complex datatypes can come
in handy, though for #3 I would be reluctant to encourage people to
create a label that is so complex it would need decoding instructions.
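
To make the ordering above concrete, here is a rough Python sketch of
the fallback. It is purely illustrative and not part of DCAM: the names
Value and choose_value, the tier numbers, and the example label are all
made up for this message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Value:
    tier: int   # 1 (best) to 4 (worst), matching the list above
    kind: str   # "uri" or "literal"
    text: str


def choose_value(
    existing_uri: Optional[str] = None,
    minted_uri: Optional[str] = None,
    label: Optional[str] = None,
    raw: Optional[str] = None,
) -> Value:
    """Return the best available representation, trying tiers 1 to 4 in order."""
    if existing_uri:
        return Value(1, "uri", existing_uri)
    if minted_uri:
        return Value(2, "uri", minted_uri)
    if label:
        return Value(3, "literal", label)
    if raw:
        return Value(4, "literal", raw)
    raise ValueError("nothing to describe the value with")


# Hypothetical case: only a label is known for a series, so we land on tier 3.
series = choose_value(label="An Example Series")
print(series)
```

The point of recording the tier is that a literal chosen today can be
found and upgraded to a URI later, once one exists.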
Aaron
On Fri, Jun 22, 2012 at 2:20 AM, Antoine Isaac <[log in to unmask]> wrote:
> Hi Karen,
>
> And in your example, you can also treat the label as a simple one. Not as
> some super-complex (and/or mysterious) manifestation of a syntax encoding
> scheme.
> Don't get me wrong, I have nothing against labels per se. I am just
> reluctant to put too much semantics in them.
>
> Antoine
>
>
>
>> You could treat it as a blank node and then assign the label to that the
>> same way you would if an identifier was present.
>>
>> Jeff
>>
>>> -----Original Message-----
>>> From: DCMI Architecture Forum [mailto:[log in to unmask]]
>>> On Behalf Of Karen Coyle
>>> Sent: Thursday, June 21, 2012 8:13 PM
>>> To: [log in to unmask]
>>> Subject: Re: Limiting the use of string in DCAM design patterns
>>>
>>> Aaron, I was just today on another discussion where the "strings v.
>>> things" came up. As is so often the case, I think the answer here will
>>> have to be: "it depends."
>>>
>>> Here's the use case that was discussed today:
>>>
>>> You are creating the metadata for a book. You may find a URI
>>> representing the author, and perhaps the subject headings. But assume
>>> that at the moment you have no idea if there is a URI for some other
>>> element -- in our use case, the element was the name of a series. As a
>>> practical matter, your choices are:
>>>
>>> 1) spend minutes or hours combing the web for a URI
>>> 2) mint a URI for the series, which assumes a) that you have the
>>> technology at hand to create a useful entity for the series and that b)
>>> others will be able to resolve to it to find out if it is the same
>>> series they are encountering in their metadata creation event
>>> 3) give the name of series as a literal, hoping that in the future this
>>> series will be well-defined and identified.
>>>
>>> I actually think that in many instances of metadata creation, #3 is the
>>> only practical route to take, and the only one likely to be taken by
>>> many because it has the least friction. By dis-allowing literals, you
>>> may hinder the creation of metadata. (Undoubtedly this is what
>>> motivated schema.org, and is also what motivated DC at its origins.)
>>>
>>> The dilemma, in my mind, is how to manage this "either/or" -- either
>>> strings or things -- in the DCAM model.
>>>
>>> kc
>>>
>>>
>>>
>>> On 6/21/12 11:33 AM, Aaron Rubinstein wrote:
>>>>
>>>> I was reading over the report from the 6/8 DCAM telecon and there was
>>>> one discussion in particular that struck me and I think might be worth
>>>> a bit more thought.
>>>>
>>>> Surrounding the discussion about the ISBD example and integrating
>>>> SESes into DCAM, Antoine made this statement:
>>>>
>>>> "Antoine: RDF is about encoding as little information in the string as
>>>> possible. That's why datatypes are not used much. I don't think DCAM
>>>> should have a different approach."
>>>>
>>>> Followed a little later with this from Karen:
>>>>
>>>> "Karen: In every case where you have multiple things, but the whole
>>>> can repeat. You can have multiple titles with multiple subtitles. A
>>>> lot of this stuff goes away when we use identifiers for things, but
>>>> not all. A lot of what we have should be replaced with URIs."
>>>>
>>>> I think both of these points are very interesting from the
>>>> perspective of how DCAM might contribute to a best practices for
>>>> metadata in use. We obviously need to consider the manipulation and
>>>> ultimately conversion of legacy data but I think we should also be
>>>> designing the DCAM toward a best practices approach as well.
>>>>
>>>> To that end, I'd be interested in hearing what others think about
>>>> attempting to limit a reliance on strings in our DCAM design patterns
>>>> in favor of URIs or other flavors of identifiers.
>>>>
>>>> FWIW, I am in agreement with Antoine here.
>>>>
>>>> Aaron
>>>>
>>>
>>> --
>>> Karen Coyle
>>> [log in to unmask] http://kcoyle.net
>>> ph: 1-510-540-7596
>>> m: 1-510-435-8234
>>> skype: kcoylenet
>>
>>
>