DC-RDA@JISCMAIL.AC.UK


Subject: Re: A possible strategy for our literals/non-literals conundrum ...
From: Karen Coyle <[log in to unmask]>
Reply-To: List for discussion on Resource Description and Access (RDA)
Date: Tue, 25 Mar 2008 10:58:40 -0700


Thanks, Rob. When I think of "typed" literals I think of things like 
date or currency. You don't know what the value will be, but you know 
its format and its range. Is it possible to create a type that has, for 
example, an integer plus a term from a controlled vocabulary? Would that 
be any different from having a numeric amount plus a currency type? (And 
then I've still got the question of whether this is semantically the 
same as the RDA value.... but I'm willing to pretend that it is, for the 
sake of argument.)
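
As a rough illustration of the question above: an RDF datatype is defined by a lexical space, a value space, and a mapping between them, so "an integer plus a term from a controlled vocabulary" can at least be sketched as a custom datatype. Everything named here (the datatype URI, the unit list, the function) is a hypothetical assumption for the example and is not something RDA or the RDF specification defines.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical controlled vocabulary of units (an assumption, not from RDA).
UNITS = {"sec", "min", "hr"}

# Hypothetical datatype URI, invented for this example.
AMOUNT_WITH_UNIT = "http://example.org/datatypes/amountWithUnit"


@dataclass(frozen=True)
class TypedLiteral:
    lexical_form: str  # the string as written, e.g. "27 min"
    datatype: str      # URI naming the datatype


def to_value(literal: TypedLiteral) -> Tuple[int, str]:
    """Map a lexical form such as "27 min" to an (integer, unit) value.

    Raises ValueError when the string is outside the lexical space.
    """
    if literal.datatype != AMOUNT_WITH_UNIT:
        raise ValueError("unsupported datatype: %s" % literal.datatype)
    parts = literal.lexical_form.split()
    if len(parts) != 2:
        raise ValueError("expected '<integer> <unit>'")
    amount_text, unit = parts
    if not amount_text.isdigit() or unit not in UNITS:
        raise ValueError("not in the lexical space: %r" % literal.lexical_form)
    return int(amount_text), unit
```

Under these assumptions the amount-plus-unit case looks structurally the same as an amount-plus-currency case; whether the result still means what the RDA value means is a separate question that the code does not settle.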

I think we've all spent cycles writing algorithms to parse bits of 
bibliographic data, like hunting for "number of pages" and "page 
numbers" in differently created metadata. I shudder to think that our 
future consists of a huge transform of library data (that will be about 
95% successful and 5% unholy mess). I think I'll just have to banish 
that thought from my head and move forward in ignorant bliss. ;-)
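
A minimal sketch of the kind of parsing algorithm meant here, using the duration and extent strings quoted further down in this thread. The pattern, the unit table, and the function name are assumptions made for illustration; a real transform would need many more rules. The parser is deliberately strict, so anything it does not fully understand is reported as unparsed rather than guessed at.

```python
import re
from typing import NamedTuple, Optional

# Assumed unit abbreviations and their length in seconds.
SECONDS_PER_UNIT = {"hr": 3600, "min": 60, "sec": 1}

# One "<number> <unit>." part, and a whole string made only of such parts.
PART = re.compile(r"(\d+)\s*(hr|min|sec)\.?")
WHOLE = re.compile(
    r"(approximately\s+)?"
    r"\d+\s*(?:hr|min|sec)\.?"
    r"(?:,\s*\d+\s*(?:hr|min|sec)\.?)*"
)


class Duration(NamedTuple):
    seconds: int
    approximate: bool


def parse_duration(text: str) -> Optional[Duration]:
    """Return a Duration, or None if the string does not fit the pattern."""
    text = text.strip()
    match = WHOLE.fullmatch(text)
    if match is None:
        return None  # left for manual review: the "unholy mess" share
    seconds = sum(int(n) * SECONDS_PER_UNIT[unit] for n, unit in PART.findall(text))
    return Duration(seconds, match.group(1) is not None)
```

With these rules "27 min." and "approximately 1 hr., 10 min." parse, while "2400 frames of still images and 80 min. of moving images." is rejected instead of being silently reduced to 80 minutes.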

kc

Rob Styles wrote:
> Karen,
> 
> in the case of duration (and probably many others) maybe the tool you 
> need in your arsenal is the Typed Literal: 
> http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-typed-literal
> 
> Obviously translating the freetext fields from MARC will be difficult, 
> unreliable and frustrating, but a typed literal would allow queries such 
> as "Find me tracks containing a section of at least 2 minutes and 10 
> seconds which is in E minor" [1]
> 
> Typed Literals basically allow for the normalisation of structured data 
> without requiring an extra resource.
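
To make the query example concrete, here is a small sketch of why the datatype matters. It assumes the durations are stored as xsd:duration literals and handles only a simple hours/minutes/seconds subset of that lexical form; the track data and function names are invented for the example.

```python
import re
from typing import List, Optional

XSD_DURATION = "http://www.w3.org/2001/XMLSchema#duration"

# Only the "PT<h>H<m>M<s>S" subset of xsd:duration is handled in this sketch.
_LEXICAL = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def duration_seconds(lexical: str, datatype: Optional[str]) -> Optional[int]:
    """Return the length in seconds, or None for untyped or unparseable values."""
    if datatype != XSD_DURATION:
        return None
    match = _LEXICAL.fullmatch(lexical)
    if match is None or not any(match.groups()):
        return None
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Hypothetical sections: (track, key, duration string, datatype or None).
SECTIONS = [
    ("track-1", "E minor", "PT2M30S", XSD_DURATION),
    ("track-2", "E minor", "PT1M45S", XSD_DURATION),
    ("track-3", "C major", "PT4M", XSD_DURATION),
    ("track-4", "E minor", "about two and a half minutes", None),
]


def find_tracks(key: str, min_seconds: int) -> List[str]:
    """Tracks with a section in the given key lasting at least min_seconds."""
    found = []
    for track, section_key, lexical, datatype in SECTIONS:
        seconds = duration_seconds(lexical, datatype)
        if section_key == key and seconds is not None and seconds >= min_seconds:
            found.append(track)
    return found
```

Calling `find_tracks("E minor", 130)` compares numbers rather than strings; the free-text entry simply cannot take part in the comparison.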
> 
> rob
> 
> 
> 
> [1] 
> http://www.omras2.com/cgi-sys/cgiwrap/musicstr/view/Main/OntologyQueryExamples 
> 
> 
> 
> On 25 Mar 2008, at 14:32, Karen Coyle wrote:
>> Tom, I can't thank you enough for this thorough, clear explanation 
>> (which I will keep and re-read whenever the confusion strikes). You 
>> have confirmed what I suspected, which is that if we wish to model RDA 
>> as non-literals with URIs, it will require, in many cases, a different 
>> value to what we have today in RDA.
>>
>> But here's one more question, and I think this gets to what Jon was 
>> asking:
>>
>> if we define "RDA duration" as a non-literal, whether the value is 
>> represented by a URI or a value string, its semantics are defined in 
>> the property "RDA duration." RDA duration is currently a single text 
>> string with some rules.
>>
>> If we want to define a structured expression of duration (unit type 
>> plus measure) would that have to be a separate property? In other 
>> words, would that structuring of the value be a significant semantic 
>> change such that it would no longer be defined by the RDA property 
>> definition? (And I don't think that structuring such that the unit and 
>> its measurement are separate properties would follow the definition 
>> for sub-properties; e.g. they wouldn't "dumb down" to the broader 
>> definition of property with units and measures included).
>>
>> I'm not sure we can answer this yet because we haven't done enough 
>> detailed analysis of the properties themselves. For example, the 
>> properties for persons are considerably different in their nature than 
>> the properties for titles. But I would really like for folks on this 
>> list to look at the properties (and ask for more examples if those 
>> would be helpful) so that we can figure this out.
>>
>> My own gut feeling is that there will be some values that could be 
>> represented by URIs (or strings), such as names (personal, corporate) 
>> and others that are unlikely to be represented by URIs (notes, 
>> description). That doesn't mean we can't define them all as 
>> non-literals, but in that case the "non-literal" designation is just a 
>> technicality. What we have to convey to users of the properties (e.g. 
>> those creating application profiles) is the nature of the semantics of 
>> the property.
>>
>> NOTE: We've used this "27 min." as an example here, and it seems 
>> intuitive to think of it as unit=min, duration=27. But in fact the 
>> strings can be more complex, such as "approximately 1 hr., 10 min." or 
>> for extent, "24 pages, 12 pages of plates"; "2400 frames of still 
>> images and 80 min. of moving images." So although there is guidance, 
>> these are free text.
>>
>> kc
>>
>> Thomas Baker wrote:
>>> Hi Karen,
>>>> As these stand, could they be represented as non-literals? At the 
>>>> moment they are purely text strings, and I think the question is how 
>>>> we can work with them since they do not have any further structure.
>>> Taking just one of the examples at random...
>>>> Property: duration
>>>> data: "27 min."
>>> There are two ways to express this in RDF:
>>> 1. If rda:duration were defined with a literal range:
>>>    R rda:duration "27 min." .
>>> 2. If rda:duration were defined with a non-literal range:
>>>    R rda:duration _:x .
>>>    _:x rdf:value "27 min." .
>>> In each case, "27 min." is the Value String.  The "x" could be one of
>>> the following:
>>>   a. a blank node
>>>   b. a deliberately assigned URI, for example a member of a
>>>      hypothetical Vocabulary Encoding Scheme for durations (not that
>>>      this would necessarily be a good idea!)
>>>   c. a unique URI automatically generated by software in order to
>>>      make it a "named node", which is easier to process than a
>>>      blank node.
>>> Of the three options, "a" is controversial, as Jon points out
>>> (citing Ian Davis's blog), option "b" would take extra work
>>> (perhaps unnecessarily), and "c" can straightforwardly be
>>> automated.
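
The two modelings and option "c" can be sketched with plain tuples standing in for triples. This is only an illustration: the prefixed names are copied from the examples above as opaque strings, and the urn:uuid scheme used to mint node URIs is an assumption, one of several ways software might generate a unique name.

```python
import uuid
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

RDA_DURATION = "rda:duration"  # prefixed names as written in the thread
RDF_VALUE = "rdf:value"


def as_literal(resource: str, value_string: str) -> List[Triple]:
    """Case 1: the property has a literal range, so the string is the object."""
    return [(resource, RDA_DURATION, '"%s"' % value_string)]


def as_non_literal(
    resource: str, value_string: str, node: Optional[str] = None
) -> List[Triple]:
    """Case 2: the property points at a node that carries the string.

    Pass node="_:x" for a blank node (option a) or an assigned URI (option b);
    with no node given, a unique URI is minted automatically (option c).
    """
    if node is None:
        node = "urn:uuid:%s" % uuid.uuid4()  # assumed minting scheme
    return [
        (resource, RDA_DURATION, node),
        (node, RDF_VALUE, '"%s"' % value_string),
    ]
```

In both cases the Value String is the same "27 min."; only its position in the graph changes.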
>>> But I understand your real concern here to be that the things
>>> represented by simple string values in cataloging rules and
>>> in countless legacy data sets have not been formally modeled
>>> -- i.e., "they do not have any further structure".  Indeed,
>>> one COULD use a sophisticated model for describing durations,
>>> with separate binary relations for hours, minutes, and seconds
>>> (e.g., see [2]) -- perhaps the sort of "structured" model you
>>> have in mind.  And you do not want to do that. You just want
>>> to use the string "27 min.".  And this is fine.
>>> The point is that the string "27 min." has a different
>>> function in the model depending on whether rda:duration is
>>> defined with a range of literal or non-literal.
>>> In the former case -- where rda:duration has a range of
>>> literal ("string") -- statements using rda:duration have
>>> literals directly as objects.  The term dcterms:date [3]
>>> is a good example of a term with a range of literal, and
>>> an example value is "2008-03-25".
>>> The problem is that literals cannot themselves be the subject
>>> of further triples, so defining rda:duration this way means
>>> that this property could not be used for more "structured"
>>> duration descriptions, with separate properties for hours
>>> and minutes or whatever. One would forever be locked into
>>> expressing durations as literals.  (This may be a reasonable
>>> option in the case of rda:duration, but one would need to
>>> consider the consequences.  In assigning
>>> a literal range to dcterms:date, the Usage Board considered
>>> that the overwhelming majority of implementations use date
>>> with literals, often with a datatype or Syntax Encoding
>>> Scheme such as the W3C Date and Time Formats specification.
>>> In consequence, though, if an application were to have a
>>> requirement to represent dates using a complex model with
>>> multiple properties, dcterms:date would not be the right
>>> choice and one would need either to find an alternative date
>>> property or coin a new one.)
>>> In the latter case -- rda:duration is defined with a
>>> non-literal range -- one allows for expressions of duration
>>> that are potentially more complex than just a literal.
>>> Using rda:duration with non-literal range, duration could
>>> be modeled in application profiles with multiple properties
>>> and the like.  Remember that one of the properties of that non-literal
>>> resource -- in many cases the only one needed -- can always
>>> be rdf:value, pointing to a literal like "27 min.".
>>> So to summarize, the fact that a duration will be represented
>>> using a literal does not mean that rda:duration needs to have
>>> a literal range.
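
A small sketch of that summary, under invented names: with a non-literal range, one description can carry only rdf:value while another adds structured properties, and a consumer that only wants the string reads rdf:value in both cases. The ex:hours and ex:minutes properties and the tiny graph are hypothetical.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data: R1 has only a value string, R2 adds structure as well.
GRAPH: List[Triple] = [
    ("R1", "rda:duration", "_:d1"),
    ("_:d1", "rdf:value", "27 min."),
    ("R2", "rda:duration", "_:d2"),
    ("_:d2", "rdf:value", "approximately 1 hr., 10 min."),
    ("_:d2", "ex:hours", "1"),      # ex: properties are made up for this sketch
    ("_:d2", "ex:minutes", "10"),
]


def objects(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def duration_string(graph: List[Triple], resource: str) -> Optional[str]:
    """Return the rdf:value string of the resource's duration node, if any."""
    for node in objects(graph, resource, "rda:duration"):
        values = objects(graph, node, "rdf:value")
        if values:
            return values[0]
    return None
```

The simple consumer never has to know whether the extra properties exist, which is the flexibility a literal range would rule out.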
>>> And it is important not to confuse the literal/non-literal
>>> issue with the issue of serialization formats.  The example
>>> above could in principle be serialized in a very simple XML
>>> format with
>>>    <duration>27 min.</duration>
>>> and this could still correspond to the following non-literal
>>> representation in RDF:
>>>    R rda:duration _:x .
>>>    _:x rdf:value "27 min." .
>>> as long as the definition of the format were to make clear
>>> that duration is intended to represent a non-literal and the
>>> mapping to a correct RDF triple representation were encoded
>>> in a GRDDL transform (or similar sort of conversion algorithm).
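
As a stand-in for such a conversion algorithm, the mapping can be sketched in a few lines. This is not GRDDL itself, which works by associating a transformation (typically XSLT) with the XML document; it is a plain Python function, and the surrounding record element and the blank node labels are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

Triple = Tuple[str, str, str]


def record_to_triples(xml_text: str, resource: str) -> List[Triple]:
    """Map each <duration> element to the non-literal pattern shown above."""
    root = ET.fromstring(xml_text)
    triples: List[Triple] = []
    for index, element in enumerate(root.iter("duration")):
        node = "_:x%d" % index  # one blank node label per duration element
        triples.append((resource, "rda:duration", node))
        triples.append((node, "rdf:value", (element.text or "").strip()))
    return triples
```

The XML stays as simple as before; the knowledge that duration denotes a non-literal lives entirely in the mapping.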
>>> Tom
>>> [1] http://iandavis.com/blog/2007/03/bnodes-out
>>> [2] http://www.w3.org/TR/owl-time/#duration
>>> [3] http://dublincore.org/documents/dcmi-terms/#terms-date
>>
>> -- 
>> -----------------------------------
>> Karen Coyle / Digital Library Consultant
>> [log in to unmask] http://www.kcoyle.net
>> ph.: 510-540-7596   skype: kcoylenet
>> fx.: 510-848-3913
>> mo.: 510-435-8234
>> ------------------------------------
> 
> Rob Styles
> Programme Manager, Data Services, Talis
> tel: +44 (0)870 400 5000
> fax: +44 (0)870 400 5001
> direct: +44 (0)870 400 5004
> mobile: +44 (0)7971 475 257
> msn: [log in to unmask]
> blog: http://www.dynamicorange.com/blog/
> irc: irc.freenode.net/mmmmmrob,isnick
> 
> 

-- 
-----------------------------------
Karen Coyle / Digital Library Consultant
[log in to unmask] http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------
