Hi Karen,
If you could create a cataloger scenario that uses publication
statement, or edition statement, or place and date of capture, then we
could see how this plays out in the data. Might help to make these
discussions more concrete. An example is worth a thousand words.
Having said that, here's a thousand words to add to the confusion... :)
I find the UML notions of meta-models and model transformations very
helpful in discussions such as these. What we are doing is designing a
model transformation, using one model as input (the RDA elements, as
defined by RDA) to generate another model as output (an RDF schema).
One possible source of confusion is that the RDA elements use a
different meta-model from RDF's meta-model. When you start mixing
meta-models in the same sentence, you have to be very careful not to
end up with vegetable soup. I always try to keep different meta-models
separate in my comments and prose, and try to make clear which
meta-model I'm using in each sentence or paragraph.
I.e. the RDA meta-model talks about "elements", "element sub-types" and
"sub-elements".
The RDF(S) meta-model talks about "properties", "classes",
"sub-properties", "sub-classes", "domain", "range".
What Karen is discussing are the *rules* we use to transform from the
RDA meta-model to the RDF(S) meta-model. I don't think we've stated
these rules formally yet; maybe we should. (But it is good at least
that we are trying to develop some rules at all. I.e. I think it is
good to attempt a *principled* (i.e. systematic) transformation,
rather than a completely ad hoc one.)
Anyway, the rules Karen has developed so far go something like this (I
think):
1. For each element in the RDA model, generate a property in the RDF schema
2. For each element sub-type association between two elements in the
RDA model, generate a sub-property association between the two
corresponding properties in the RDF schema
3. For each sub-element association between two elements in the RDA
model, generate a sub-property association between the two
corresponding properties in the RDF schema
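Rules 1 and 2 seem reasonable to me. I think rule 3 is an error: in
RDFS, a statement using a sub-property entails the same statement
using the super-property, so every place of publication would also be
inferred to be a publication statement in its own right, which is not
what a sub-element means.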
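To make rules 1 to 3 concrete, here is a minimal Python sketch of the
transformation, together with the RDFS sub-property inference that
shows the consequence of rule 3. The element names in ELEMENTS, the
ex:book resource and both function names are assumptions made for
illustration only; the real input would be the full RDA element list.

```python
# Hypothetical miniature of the RDA element list (illustrative names only).
ELEMENTS = ["title", "keyTitle", "publicationStatement", "placeOfPublication"]
SUB_TYPES = [("keyTitle", "title")]                              # (sub-type, element)
SUB_ELEMENTS = [("placeOfPublication", "publicationStatement")]  # (part, whole)


def transform(elements, sub_types, sub_elements):
    """Apply rules 1-3 and return the RDF schema as a set of triples."""
    schema = set()
    for element in elements:  # rule 1: one property per element
        schema.add((f"rda:{element}", "rdf:type", "rdf:Property"))
    for child, parent in sub_types:  # rule 2: sub-type -> sub-property
        schema.add((f"rda:{child}", "rdfs:subPropertyOf", f"rda:{parent}"))
    for part, whole in sub_elements:  # rule 3 (disputed): sub-element -> sub-property
        schema.add((f"rda:{part}", "rdfs:subPropertyOf", f"rda:{whole}"))
    return schema


def infer_super_properties(schema, data):
    """RDFS entailment: (s p o) and (p subPropertyOf q) together give (s q o)."""
    inferred = set(data)
    changed = True
    while changed:  # repeat until no new triples appear
        changed = False
        for s, p, o in list(inferred):
            for sub, pred, sup in schema:
                if pred == "rdfs:subPropertyOf" and sub == p:
                    triple = (s, sup, o)
                    if triple not in inferred:
                        inferred.add(triple)
                        changed = True
    return inferred


SCHEMA = transform(ELEMENTS, SUB_TYPES, SUB_ELEMENTS)
DATA = {("ex:book", "rda:placeOfPublication", "ex:NewYork")}
INFERRED = infer_super_properties(SCHEMA, DATA)
```

With this input, INFERRED also contains the triple (ex:book,
rda:publicationStatement, ex:NewYork): the place itself is inferred to
be the publication statement, which is the side effect of rule 3 that
looks wrong to me.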
Thinking in terms of a model transformation also helps me to keep the
input (the RDA model, defined in terms of elements, sub-elements,
element sub-types etc.) separate in my head from the output (our RDF
schema, defined in terms of classes, properties etc.).
We might hope to generate one RDF property for each RDA element, and
so end up with an output that is very similar in appearance (at a
glance) to the input, but I don't think we need to stick religiously
to that. As long as we stick to a principled transformation from RDA
to our RDF schema, then I think we can say we are being true to RDA --
i.e. as long as we can state the *rules* we've used to effect the
transformation, then we are being true to RDA; if we cannot state
those rules, then we are being ad hoc, and we are no longer true to
RDA (because we have added some intellectual content).
Another good thing about the notion of model transformation is that
you can talk about a model transformation in abstract terms (as I have
done in points 1, 2 and 3 above), but you can also implement a model
transformation concretely as some sort of program. This is exactly
what I did at the first DCMI/RDA meeting -- I implemented a model
transformation as an XSL stylesheet, using Tom Delsey's Excel
spreadsheet describing the RDA model (saved as XML) as input,
generating an RDF schema (as RDF/XML) as output.
In that transformation (see [1]) I implemented rules 1 and 2 as above,
but I did something different for RDA sub-elements. In fact I assumed
an n-ary pattern in the output, although I stopped short of making
this explicit in the output (I was rather pressed for time :). If I
had had more time, I would have done something like:
3. For each RDA element X which has sub-elements Y, Z, ... generate
an RDFS class C whose name is based on X, state the range of X as C,
and state the domain of Y, Z, ... as C.
This is exactly what Mikael is suggesting we do, I believe.
E.g. for RDA element "publicationStatement" and sub-elements
"placeOfPublication", "dateOfPublication", I would generate the
following RDF schema fragment:
rda:publicationStatement rdf:type rdf:Property .
rda:placeOfPublication rdf:type rdf:Property .
rda:dateOfPublication rdf:type rdf:Property .
rda:PublicationStatement rdf:type rdfs:Class .
rda:publicationStatement rdfs:range rda:PublicationStatement .
rda:dateOfPublication rdfs:domain rda:PublicationStatement .
rda:placeOfPublication rdfs:domain rda:PublicationStatement .
... which is intended for use in data as e.g.:
ex:myExampleExpression rdf:type frbr:Expression ;
    rda:publicationStatement [
        rda:placeOfPublication ex:NewYork ;
        rda:dateOfPublication "2007"^^xsd:gYear
    ] .
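The revised rule 3 above (the n-ary pattern) can be sketched in Python
as follows. This is only an illustration under stated assumptions: the
function names are mine, and deriving the class name by capitalising
the first letter of the element name is one possible reading of "a
class whose name is based on X", not something RDA prescribes.

```python
def class_name(element):
    """Derive a class name from an element name (assumed convention:
    capitalise the first letter)."""
    return element[0].upper() + element[1:]


def transform_nary(element, sub_elements):
    """Revised rule 3: element X gets a class C as its range, and each
    sub-element of X gets C as its domain."""
    cls = f"rda:{class_name(element)}"
    triples = [(f"rda:{element}", "rdf:type", "rdf:Property")]
    triples += [(f"rda:{s}", "rdf:type", "rdf:Property") for s in sub_elements]
    triples.append((cls, "rdf:type", "rdfs:Class"))
    triples.append((f"rda:{element}", "rdfs:range", cls))
    triples += [(f"rda:{s}", "rdfs:domain", cls) for s in sub_elements]
    return triples


def to_turtle(triples):
    """Serialise triples one per line in a simple Turtle-like form."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


SCHEMA = transform_nary(
    "publicationStatement",
    ["placeOfPublication", "dateOfPublication"],
)
print(to_turtle(SCHEMA))
```

For "publicationStatement" this yields the same seven statements as the
schema fragment above, although the two domain statements come out in
a different order.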
Cheers,
Alistair
[1] http://isegserv.itd.rl.ac.uk/cvs-public/rda/trans/raw-table-proc.xsl?rev=1.1;content-type=text%2Fplain
On Fri, Jan 02, 2009 at 05:15:21PM -0800, Karen Coyle wrote:
> On Fri, Jan 2, 2009 at 4:31 PM, Jonathan Rochkind wrote:
> > You have the urge to define classes in order to avoid duplicating
> > definitions?
> >
> > Because RDA duplicates definitions in different parts of itself, defining
> > different things the exact same way in multiple places, instead of including
> > the definition once and then referring to it multiple times?
>
> No, that was just a side matter, but one that could come back to bite
> us when we try to work with the data in other situations.
>
> The real issue is that the RDA element list includes some things like
> "publication statement" and "edition statement" that themselves are
> never used as properties in library bibliographic data -- they are
> wrappers around a set of data elements (essentially the 260 and 250 in
> MARC). Since we are creating a registry of properties (we don't
> currently seem to have a way to define classes, but we do have
> properties and sub-properties), it's hard to know what to do with
> these wrappers. The RDA list calls them elements with sub-elements,
> but that's not an RDF concept. Are they classes?
>
> To me the main question is whether we paint ourselves into a corner by
> creating what is essentially a hierarchical relationship between
> something like "publication statement" and place of
> publication/publisher/date of publication. Is that a fixed
> relationship that will always be true? Or could there be other uses
> for those properties?
>
> I'm not worried about the property/sub-property relationship between
> title and key title or abbreviated title -- those seem to me like
> narrowed definitions in the spirit of the DC terms 'refinements.' But
> there are some groups, like Series statement, that seem to be a mix of
> refinements and parts.
>
> I think we're going to have to go through the whole list and figure
> out what works. The RDA element list doesn't follow RDF, so we're
> having to do interpretation here. My goal is maximum flexibility for
> future applications.
>
> The other factor is that we are currently trying for one-to-one with
> the RDA online product in terms of elements. That may or may not be
> possible -- this is part of that exploration.
>
> kc
>
>
> >
> > I don't see any problem with you going ahead and defining those classes in
> > order to avoid duplicating definitions, even if RDA doesn't. After all, the
> > _product_ will be the same; the thing you are defining will still be exactly
> > the same thing as RDA defines, you've just taken a step to define it more
> > economically than RDA does. But you've still defined the same thing. No?
> >
> > That seems like a fine way to go to me, it doesn't seem to me a problem that
> > RDA doesn't use that economy of definition.
> >
> > Alternately, it doesn't seem the end of the world to me to just go ahead and
> > define the different elements exactly the same way, multiple times. Not
> > usually the way we like to do things, but if your sense of adhering to RDA
> > says you should do it that way, doesn't seem like a disaster to me. Will
> > this cause problems for implementors? I don't _think_ so, I could be wrong.
> >
> > Am I missing the reasons this is problematic? It doesn't seem that problematic to me
> > either way. Am I missing the problem you are looking for a solution to? Is
> > it not simply that you would like to economically define in one place what
> > RDA profligately defines identically in multiple places?
> >
> > Jonathan
> >
> > Karen Coyle wrote:
> >>
> >> But RDA doesn't define classes -- that's the problem. We can try to
> >> define them for RDA, but that isn't a concept in the RDA elements that
> >> we've been given. And I'm not at all sure that the aggregate elements
> >> will work as classes, for reasons I gave above. So I'm trying to find
> >> a practical solution for the moment, one that doesn't violate RDA.
> >> (Note that RDA may NOT be RDF compatible as defined in the current RDA
> >> documentation, and we need to work around that at the moment.)
> >>
> >> kc
> >>
> >> On Fri, Jan 2, 2009 at 1:54 PM, Mikael Nilsson wrote:
> >>
> >>>
> >>> The "right" RDF-y way to do it is to set the range of the properties to
> >>> the appropriate class.
> >>>
> >>> That way, values are known to be of the right type. Application profiles
> >>> can then decide how they want to describe instances of that class.
> >>>
> >>> Thus, there is no conceptual difference between the two kinds of
> >>> "properties" you see in RDA. The substructure is only an artifact of the
> >>> application profile.
> >>>
> >>> /Mikael
> >>>
> >>> On Fri 2009-01-02 at 13:18 -0800, Karen Coyle wrote:
> >>>
> >>>>
> >>>> Back to place and date...
> >>>>
> >>>> On Mon, Dec 22, 2008 at 9:00 AM, Alistair Miles wrote:
> >>>>
> >>>>
> >>>>>
> >>>>> Not directly related to any scenarios, I found that rda:placeOfCapture
> >>>>> is a sub-property of rda:placeAndDateOfCapture, which doesn't look
> >>>>> right. This looks like a case where Tom Delsey's "sub-elements"
> >>>>> pattern got wrongly translated to RDF sub-properties, where rather it
> >>>>> should be modelled in RDF as an n-ary relation.
> >>>>>
> >>>>
> >>>> This still leaves us with the question of what to do with these in the
> >>>> registry. I don't think there is an 'n-ary' capability, whatever that
> >>>> would look like. Also, I'm not sure that these empty properties make
> >>>> sense in the property list... I think you would want to manage them in
> >>>> an application profile. Here are some examples from RDA:
> >>>>
> >>>> Place and date of capture (empty)
> >>>> - Place of capture
> >>>> - Date of capture
> >>>>
> >>>> Publication statement (empty)
> >>>> - Place of publication
> >>>> - Parallel place of publication
> >>>> - Publisher's name
> >>>> - Parallel publisher's name
> >>>> - Date of publication
> >>>>
> >>>> If we can imagine any use of these properties OUTSIDE of the
> >>>> particular empty node, then I think they need to be separately defined
> >>>> as properties, not as dependent on the empty node. We also have the
> >>>> problem that RDA doesn't make use of classes so that there is a great
> >>>> deal of repetition in the property/element list. As an example, there
> >>>> are four different sets of elements that are the same as the
> >>>> Publication statement, but that substitute one of these words for
> >>>> Publication: Production/Publication/Distribution/Manufacture. And as
> >>>> you can see, they also share some meaning with the "capture" concept,
> >>>> in terms of place and date. I immediately want to take these and
> >>>> rationalize them by defining simple properties:
> >>>>
> >>>> - date
> >>>> - agent name
> >>>> - place
> >>>>
> >>>> ... and allowing any element to have a 'parallel' (which is the same
> >>>> value in a different language).
> >>>>
> >>>> Unfortunately, at the moment we are trying to be true to RDA's
> >>>> definition of its properties, so I need to sit on my virtual hands and
> >>>> not mess with what they have defined.
> >>>>
> >>>> Any ideas what we can do with our empty nodes/Tom's sub-elements,
> >>>> given this info? In many cases we just entered each element as a
> >>>> separate property, including the empty node one. Is there a down side
> >>>> to this solution?
> >>>>
> >>>> kc
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>>
> >>> WARNING! E-mail to and from Sweden, or via servers in Sweden, is
> >>> monitored by the National Defence Radio Establishment (Försvarets
> >>> Radioanstalt, FRA).
> >>>
> >>>
> >>
> >>
> >>
> >>
> >
> > --
> > Jonathan Rochkind
> > Digital Services Software Engineer
> > The Sheridan Libraries
> > Johns Hopkins University
> > 410.516.8886 rochkind (at) jhu.edu
> >
>
>
>
> --
> Karen Coyle / Digital Library Consultant
> http://www.kcoyle.net
> ph.: 510-540-7596 skype: kcoylenet
> mo.: 510-435-8234
> ------------------------------------
--
Alistair Miles
Senior Computing Officer
Image Bioinformatics Research Group
Department of Zoology
The Tinbergen Building
University of Oxford
South Parks Road
Oxford
OX1 3PS
United Kingdom
Web: http://purl.org/net/aliman
Tel: +44 (0)1865 281993