Quoting Rachel Heery:
> within the MARC data model and MARC records
> the relator terms do not act as 'properties' as I understand it - the
> terms have a different role in MARC records than within DC records.
Yes.
> This seems to make declaring terms as RDF properties something of a
> formality - as long as the maintainer or 'owner' of data element sets is
> willing to declare a particular sub-set of terms as RDF properties then
> that is ok...
I think it is much more than a "formality", and personally I think it is
dangerous to think in terms of "(re)declaring" a (sub-)set of existing "terms"
as properties. If a "term" is a component in a hierarchical data structure then
that is what it is; that same "term" cannot also be a property. For example, an
XML element is not an RDF property (not even in RDF/XML).
I think this is what you are getting at in the first of your criteria below, but
I guess I just want to stress that it is problematic to go in search of
similarity where there are fundamental differences.
The work that has to be done is to consider how the _information_ represented
within the hierarchical data structure is to be represented within a
triple/statement-based model. There may be no simple one-to-one correspondence
between the components of the hierarchical data structure and the components of
the statement-based model.
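As a rough illustration of that lack of one-to-one correspondence, here is a minimal Python sketch. The record layout, the element names and the example.org URIs are all invented for the illustration (they are not taken from LOM, MODS or MARC), and the mapping rules are just one possible reading of the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical hierarchical record; not a real LOM, MODS or MARC structure.
RECORD = """
<record id="http://example.org/doc/1">
  <contribution>
    <role>illustrator</role>
    <entity>http://example.org/agent/42</entity>
  </contribution>
  <titleGroup>
    <title>Lego Submarines</title>
  </titleGroup>
</record>
"""

EX = "http://example.org/terms/"  # placeholder namespace (an assumption)


def record_to_triples(xml_text):
    """Map the information in the record, not its element names, to triples."""
    root = ET.fromstring(xml_text)
    subject = root.get("id")
    triples = []
    # Three elements (contribution, role, entity) collapse into ONE statement:
    # the *value* of <role> supplies the property, <entity> supplies the object.
    for contribution in root.findall("contribution"):
        role = contribution.findtext("role")
        agent = contribution.findtext("entity")
        triples.append((subject, EX + role, agent))
    # Two elements (titleGroup, title) collapse into one statement; the
    # container <titleGroup> has no counterpart in the triple model at all.
    for title in root.findall("titleGroup/title"):
        triples.append((subject, EX + "title", title.text))
    return triples


for triple in record_to_triples(RECORD):
    print(triple)
```

Five elements in the tree yield two statements, and none of the element names maps mechanically onto a property name.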
Mikael Nilsson's paper(s) on the LOM RDF binding, e.g.
http://rubens.cs.kuleuven.ac.be:8989/ariadne/CONF2003/papers/MIK2003.pdf
give an excellent account of this process for the case of the LOM, and
emphasise that the translation must be done by looking at each component of
the hierarchical model in turn:
===
The container-based metamodel used by LOM is thus
not compatible with the metamodel used by Dublin
Core. When does this matter? Binding LOM to RDF is
the obvious example in this context, as the metamodel
of RDF is based on a property-value model and not containment.
In general, it leads to difficulties when trying
to combine terms from two metadata standards into the
same system. When the metamodels are compatible,
such a combination or mapping can be realized by simply
translating the metamodel constructs. If the metamodels
are incompatible, the translation must be done
on an idiosyncratic, element-by-element basis.
===
In Mikael's mapping, some LOM data elements are modelled as RDF properties - but
the property and the LOM data element are still two different types of thing. In
some cases two different LOM data elements are modelled using the same RDF
property (describing two different resources). In other cases what are data
element _values_ in LOM are modelled as RDF properties (e.g. the case of LOM
Relation.Role); in still other cases, quite substantial re-modelling is required
(e.g. the case of LOM Classification).
> In my view the criteria for re-use of terms should be something like:
>
> "First, are the semantics and context of a term in one metadata format
> sufficiently similar to the semantics and context of the property I want
> to express in a DC description? if so can this term be usefully used in
> 'isolation' within a DC description out of the context of its original
> format?
>
> Second, are the 'owners' of the terms willing to co-operate?"
>
> If the answer to both of the above is yes, then declaring those terms as
> RDF properties may well be achievable. Especially if, as I understand has
> happened with MARC relator terms, just the sub-set of terms required from
> the 'other' format based on a different data model need to be declared??
>
> Maybe worth thinking about that old saying 'everything can be solved by a
> level of indirection'.... not knowing much about MODS, but could a sub-set
> of MODS terms be 'separated out' of MODS and declared as RDF properties?
If MODS terms are components in a hierarchical data model, then those terms
cannot also be properties, IMHO. What has to happen is the sort of mapping between
the models which Mikael describes for the LOM, and that can only be done by
looking at the information represented by MODS data structures.
In effect this is the process that has taken place for the MARC relator codes,
but it was a fairly trivial case, as by definition they represent types of
relationship (between a resource and an agent) and fit neatly into the binary
relation model of RDF. It's still taken an awfully long time though!
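To show why this case is the easy one, a minimal sketch: each relator code already names a relationship between a resource and an agent, so it can stand directly as the predicate of a triple. The codes used (aut, edt, ill) are MARC relator codes; the namespace and the resource and agent URIs are placeholders invented for the example, not the URIrefs actually declared by LoC.

```python
# Placeholder namespace: an assumption, not the URIrefs declared by LoC.
RELATORS = "http://example.org/relators/"

# A few MARC relator codes and the relationship each one names.
RELATOR_LABELS = {"aut": "author", "edt": "editor", "ill": "illustrator"}


def relator_statement(resource, code, agent):
    """Build one (subject, predicate, object) triple from a relator code."""
    if code not in RELATOR_LABELS:
        raise ValueError(f"unknown relator code: {code!r}")
    return (resource, RELATORS + code, agent)


statement = relator_statement(
    "http://example.org/doc/1", "ill", "http://example.org/agent/42"
)
print(statement)
```

No re-modelling is needed here because the source data is already a binary relation; that is exactly what most components of a hierarchical format are not.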
> In my view we should be looking for solutions to help us meet requirements
> of several user communities, and to move forward as regards the evolution
> of data element sets by allowing re-use of data elements. If this can be
> done by declaring sets of terms in RDFS then good....
But reuse has to happen within a consistent, coherent framework. The analogy I
think I used at one point was Meccano parts and Lego bricks: I can build nice
things with Meccano and I can build nice things with Lego.
But no matter how desperately I might want to reuse my nice funky bit of my
Meccano spaceship in my Lego submarine, it wasn't designed to fit. If we try to
encourage reuse regardless we'll end up with our submarines leaking and the nose
cones falling off our spaceships.
Having said all this, and at the risk of sowing vile heresy....
... increasingly I do have more fundamental misgivings about the way we in DC
have tended to approach this notion of "reuse".
In the RDF/DC triple/statement based model, properties and classes are defined
as more or less independent stand-alone entities. Yes, we assert relationships
between resources (subproperty, subclass etc) but I can use a URIref like
http://purl.org/dc/elements/1.1/title to denote the concept of "having a title"
quite independently from that of having a subject, identifier etc etc etc.
However, in XML-based applications like MODS, the component parts of the data
structure do not have the same sort of independence/free-standing nature. MODS
is an XML language or format, and the way individual components (XML elements,
XML attributes) within MODS are interpreted is conditioned by their structural
relationships with other components (containment relations, element/attribute
relations etc) as defined by the rules of that XML language.
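A small sketch of what that conditioning means in practice. The XML below is invented (it is not MODS), and the element names are assumptions chosen only to make the point: the same element name carries different information depending on where it sits in the tree, so the name alone cannot be lifted out as a free-standing property.

```python
import xml.etree.ElementTree as ET

# Invented XML, not real MODS: <title> appears in two structural contexts.
DOC = """
<record>
  <title>Lego Submarines</title>
  <related kind="series">
    <title>Building With Bricks</title>
  </related>
</record>
"""


def titles_with_context(xml_text):
    """Return (path, text) pairs for every <title>, keeping its ancestry."""
    root = ET.fromstring(xml_text)
    found = []

    def walk(element, path):
        here = path + "/" + element.tag
        if element.tag == "title":
            found.append((here, element.text))
        for child in element:
            walk(child, here)

    walk(root, "")
    return found


for path, text in titles_with_context(DOC):
    print(path, "->", text)
```

Only the first title is the title of the described resource; the second is the title of a different thing (the series), and only the path tells the two apart.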
Now yes, if MODS had been developed as an RDF application, using a triple-based
model, or if a full MODS RDF mapping was developed in the way that the LOM RDF
mapping was developed, then the classes and properties would be available for
use in DC metadata descriptions, and we could establish useful relations
between DC properties and MODS properties and so on.
But the approach of "cherry-picking" particular parts of MODS and mapping only
those particular bits to the RDF model, just because those particular bits of
MODS _appear_ to be similar to something we might want to express in a DC
description, and because we have the notion that reuse is an absolute, seems...
well... it all starts to seem a bit bizarre, really!
What are we really achieving by doing this?
In the absence of a MODS RDF binding, what is anyone gaining by asking LoC to
define two or three RDF properties called
http://www.loc.gov/mods/location
(and the other two or three things needed for the DC Lib AP - I've just guessed
the URIrefs), picked pretty much at random from parts of the MODS data
structure?
It provides _no_ interoperability whatsoever between DC and MODS XML because
we've just picked out some tiny part of the MODS data structure.
Why are we _insisting_ on "reuse" in this rather odd piecemeal sort of way,
instead of simply declaring the properties required within DCMI vocabularies?
Pete
-------
Pete Johnston
Research Officer (Interoperability)
UKOLN, University of Bath, Bath BA2 7AY, UK
tel: +44 (0)1225 383619 fax: +44 (0)1225 386838
http://www.ukoln.ac.uk/ukoln/staff/p.johnston/