Tue 2008-03-25 at 18:19 -0400, Jonathan Rochkind wrote:
> For most fields in AACR2, I doubt you could write a nice neat syntax
> pattern like that that would actually cover the full range of legal
> values. There are rules for how to write those statements yes, but
> you're not going to find a BNF, and it may or may not be possible to
> write a syntax pattern that captures the full range of legal values under
> the narrative rules (which often allow quite a bit of flexibility). And
> even if you did, you will find that _many_ existing data elements don't
> match your syntax pattern. Just saying, beware. That's what I've found
> with serial coverage stuff, and I've spent far more time than I would
> like trying unsuccessfully to find a way around it.
I can imagine :-). I'm not saying this is how it must be done, I'm just
laying out the options....
>
> For data going forward rather than legacy data, you could probably do
> that, if the RDA rules supported it. But if you're going to do that, why
> not go whole hog and have actual structured data instead of just a syntax
> pattern for a string?
Data ends up being strings eventually - somewhere it has to say "29".
In what precise syntactic context the "29" figure appears can depend.
Will it be an integer value of the rda:minutes property? Will it be "29
min" according to my datatype, or will we simply use ISO 8601: "PT29M"?
All of those are acceptable as structured data, IMHO.
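
To make the comparison concrete, here is a minimal sketch of how the three forms could be reduced to the same value. The function names are hypothetical and come from no RDA document or RDF library, and the ISO 8601 handling is a deliberately simplified subset (only "PT" with integer hours, minutes and seconds):

```python
import re
from datetime import timedelta

# Hypothetical helpers; none of these names come from RDA or any RDF library.

def from_minutes_property(minutes: int) -> timedelta:
    """Value of an integer-valued rda:minutes property."""
    return timedelta(minutes=minutes)

def from_pattern_literal(lexical: str) -> timedelta:
    """Lexical form such as "29 min", following [0-9]+[ ]?(h|min|sec)."""
    match = re.fullmatch(r"([0-9]+)[ ]?(h|min|sec)", lexical)
    if match is None:
        raise ValueError(f"not a duration literal: {lexical!r}")
    amount, unit = int(match.group(1)), match.group(2)
    keyword = {"h": "hours", "min": "minutes", "sec": "seconds"}[unit]
    return timedelta(**{keyword: amount})

def from_iso8601(lexical: str) -> timedelta:
    """Simplified ISO 8601 subset: PT with integer H, M and S parts only."""
    match = re.fullmatch(r"PT(?:([0-9]+)H)?(?:([0-9]+)M)?(?:([0-9]+)S)?", lexical)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported ISO 8601 duration: {lexical!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

Under these assumptions, from_minutes_property(29), from_pattern_literal("29 min") and from_iso8601("PT29M") all yield the same 29-minute interval; only the syntactic context of the "29" differs.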
/Mikael
>
> Mikael Nilsson wrote:
> > Tue 2008-03-25 at 13:29 +0100, Thomas Baker wrote:
> >
> >>> Property: duration
> >>> data: "27 min."
> >>>
> >> There are two ways to express this in RDF:
> >>
> >> 1. If rda:duration were defined with a literal range:
> >>
> >> R rda:duration "27 min." .
> >>
> >> 2. If rda:duration were defined with a non-literal range:
> >>
> >> R rda:duration _:x .
> >> _:x rdf:value "27 min." .
> >>
> >
> > Insert Rob's typed literal as case 3 here:
> >
> > 3. If rda:duration were defined with a typed literal range:
> >
> > R rda:duration "27 min"^^rda:DurationType
> >
> > where rda:DurationType is an RDF Datatype, and would be, more or less,
> > specified using a syntax pattern, such as
> >
> > [0-9]+[ ]?(h|min|sec)
> >
> > (allowing for integer hours, minutes, or seconds, such as "2h", "29 min"
> > and "3420sec" etc). Each valid literal instance of this pattern needs to
> > be given an interpretation (= "value" in the "value space"), in this
> > case something like "an interval of time, measured in whole seconds"
> >
> > Note that the pattern and the interpretation need to be predefined -
> > there is no room for extensibility in typed literals.
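
As an illustration of the pattern-plus-interpretation pairing, a datatype like this could be sketched as follows. This is only a sketch under assumptions: the names DURATION_PATTERN, SECONDS_PER_UNIT and duration_value are made up for the example, and the value space is taken to be whole seconds as suggested above.

```python
import re

# Lexical space: the pattern from the message, applied to the whole string.
DURATION_PATTERN = re.compile(r"([0-9]+)[ ]?(h|min|sec)")

# Assumed interpretation: each unit is converted to whole seconds.
SECONDS_PER_UNIT = {"h": 3600, "min": 60, "sec": 1}

def duration_value(lexical: str) -> int:
    """Map a lexical form to its value: an interval in whole seconds."""
    match = DURATION_PATTERN.fullmatch(lexical)
    if match is None:
        # Outside the lexical space: the literal has no value at all.
        raise ValueError(f"not a valid duration literal: {lexical!r}")
    amount, unit = match.groups()
    return int(amount) * SECONDS_PER_UNIT[unit]
```

With this mapping, "1h", "60 min" and "3600sec" are distinct lexical forms of the same value, while the legacy form "27 min." (with the trailing period) falls outside the lexical space and is rejected. Accepting it would mean defining a different datatype, which is the lack of extensibility in practice.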
> >
> > A second comment has to do with what the "rules" say. I'd like us to be
> > *extremely* careful to make sure we draw the line between what's in a
> > property definition, and what is part of an application profile.
> >
> > For example, assume we choose pattern 2 above. The property definition
> > would be something along the lines of
> >
> > URI: rda:duration
> > Label: Duration
> > Definition: The duration of a resource
> > Range: rda:Duration
> >
> > where rda:Duration is the class of all durations (compare, for example,
> > http://dublincore.org/documents/dcmi-terms/#terms-accrualPeriodicity).
> >
> > It is up to the class rda:Duration to define how Durations are
> > represented.
> >
> > Now, with this definition, all these patterns are perfectly ok:
> >
> >
> > 1. URI for an instance of rda:Duration:
> >
> > R rda:duration <http://example.org/Durations/234451>
> >
> >
> > 2. Blank node with rdf:value:
> >
> > R rda:duration _:x
> > _:x rdf:value "29 min"
> >
> > 3. Blank node with other properties
> >
> > R rda:duration _:x
> > _:x rda:hours "0"^^xsd:integer
> > _:x rda:minutes "29"^^xsd:integer
> >
> > etc etc.
> >
> > However, the RDA rules may say: "This property is to be used with NO
> > URI, and with a *single* rdf:value property (=value string) containing a
> > string formatted according to.....". This will make only case 2
> > acceptable.
> >
> > This rule is best made part of an application profile. As long as the
> > properties and classes are well-defined, they will still allow for
> > application profiles that say "Use one of the following URIs for
> > durations: ..." or "NEVER use blank nodes" or "Use the following more
> > precise properties of the rda:Duration object", etc etc.
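
To show where such a rule would live, here is a rough sketch of an application-profile check kept separate from the property definition. Everything in it is an assumption made for illustration: triples are plain Python tuples of strings, blank nodes are recognized by a "_:" prefix, and the function name is hypothetical. The formatting rule for the value string is left out, since it is not spelled out above.

```python
from typing import List, Tuple

# Assumed toy representation: (subject, predicate, object), all plain strings.
Triple = Tuple[str, str, str]

def conforms_to_rda_profile(triples: List[Triple], resource: str) -> bool:
    """Check the quoted rule: every rda:duration value of `resource` must be
    a blank node (no URI) carrying exactly one rdf:value property."""
    durations = [o for s, p, o in triples
                 if s == resource and p == "rda:duration"]
    for node in durations:
        if not node.startswith("_:"):
            return False  # a URI was used, which this profile forbids
        value_count = sum(1 for s, p, _ in triples
                          if s == node and p == "rdf:value")
        if value_count != 1:
            return False  # zero or several value strings
    return True

# The three patterns listed above, in the assumed representation.
case1 = [("R", "rda:duration", "<http://example.org/Durations/234451>")]
case2 = [("R", "rda:duration", "_:x"), ("_:x", "rdf:value", "29 min")]
case3 = [("R", "rda:duration", "_:x"),
         ("_:x", "rda:hours", "0"), ("_:x", "rda:minutes", "29")]
```

Under these assumptions only case2 passes. A different profile could swap in a different check (for example, one that requires URIs) without touching the definition of rda:duration itself.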
> >
> > So, the question we need to ask ourselves is "what different kinds of
> > application profiles do we want to enable?". Tom touched on the issue when
> > he mentioned dcterms:date - it was decided that application profiles
> > using this property must limit themselves to literal values, in the
> > interest of increased interoperability. For many other properties, the
> > choice was the reverse. And for one, dcterms:title, the choice was (is?)
> > a very hard one....
> >
> >
> >
> >> The "x" could be one of the following:
> >>
> >> a. a blank node
> >>
> >> b. a deliberately assigned URI, for example a member of
> >> a hypothetical Vocabulary Encoding Scheme for durations
> >> (not that this would necessarily be a good idea!)
> >>
> >> c. a unique URI automatically generated by software in order to
> >> make it a "named node", which is easier to process than a
> >> blank node.
> >>
> >> Of the three options, "a" is controversial, as Jon points out
> >> (citing Ian Davis's blog), option "b" would take extra work
> >> (perhaps unnecessarily), and "c" can straightforwardly be
> >> automated.
> >>
> >
> > Per my reasoning above, these choices must be left to the application
> > profile designer. We should not care about the pros and cons of blank
> > nodes here, but certain applications will care. We should only care
> > about what choices we want to *enable*.
> >
> >
> >> So to summarize, the fact that a duration will be represented
> >> using a literal does not mean that rda:duration needs to have
> >> a literal range.
> >>
> >
> > Exactly. But the above should read: "the fact that a duration will be
> > represented *in the RDA application profile* using a literal does not
> > mean that rda:duration needs to have a literal range". Maybe we want to
> > enable other application profiles that make different choices. Or maybe
> > we don't. The justification must be found elsewhere.
> >
> > We're facing a generalization issue - extracting properties from a
> > pre-existing application profile, and trying to make them as useful as
> > possible to a broader audience. That's where we need to think hard....
> > and where we need the use cases.
> >
> >
> >> And it is important not to confuse the literal/non-literal
> >> issue with the issue of serialization formats. The example
> >> above could in principle be serialized in a very simple XML
> >> format with
> >>
> >> <duration>27 min.</duration>
> >>
> >> and this could still correspond to the following non-literal
> >> representation in RDF:
> >>
> >> R rda:duration _:x .
> >> _:x rdf:value "27 min." .
> >>
> >> as long as the definition of the format were to make clear
> >> that duration is intended to represent a non-literal and the
> >> mapping to a correct RDF triple representation were encoded
> >> in a GRDDL transform (or similar sort of conversion algorithm).
> >>
> >
> > Exactly right. And if we cared only for "RDA applications", the RDF
> > variant would likely be completely uninteresting. The interesting stuff
> > happens when the generated RDF triples meet other metadata. Will it
> > blend? [1]
> >
> > /Mikael
> >
> > [1] http://www.willitblend.com/
> >
> >
>
--
Plus ça change, plus c'est la même chose