Ralf Schimmer wrote:
>
> Listening to Misha's strong disapproval, I think it is about time to
> jump in and articulate some equally strong support for what Sigfrid
> has very vigorously put forward over the past few days.
I guess I'll add a few comments in support of David and Misha
(and myself?).
I think that the problem here is largely about separating
syntax from semantics. Work in the dc-datamodel working
group has been particularly useful in refining this,
primarily by using apparently syntax-neutral arc-node
diagrams for most recent discussions. The difficulty
of maintaining the separation has nevertheless not been
absent from the discussions there.
Take the RELATION problem first: it is the location of the
current discussion and thus provides a convenient canonical
example.
I would summarize the _intention_ of the report
(http://purl.oclc.org/metadata/dublin_core/wrelationdraft.html)
as follows:
"DC:RELATION will normally have two items of information,
a relationship type and an identifier for the related resource,
and it will usually be possible to select a value for
the relationship type from the recommended list."
This is only semantics and deliberately avoids syntax.
Yes - there was a change from the earliest draft of the
report to the final version. The report _initially_ used
a version of the dot.syntax to express the idea, but syntax
was later removed from the report in order not to prejudice
synchronisation with ideas from ongoing work in the datamodel wg.
I do not think it was intended to suggest that the dot-syntax
which appeared initially (and persists in the draft RFC3
http://www.roads.lut.ac.uk/lists/meta2/1998/02/0002.html )
was _necessarily_ wrong at this stage; it was just that the
dot.syntax is partly implementation dependent, being particularly
suited to HTML, and the WG was really looking at semantics.
Also please note that the word "identifier" is not really a
new sub-element; it is more or less what was earlier meant
by "value". In the context of "extended" DC, at least as
discussed in the datamodel wg, the word "value" is used more
generally to mean the information on the RHS of any metadata
assignment, in the sense of
attribute = value
so it is more convenient to use specific names in discussion,
which in the current context are
"type" = value
"identifier" = value .
Instantiations of DC in currently supported HTML are sometimes
not capable of expressing the full richness of the DC datamodel.
The absence of an explicit grouping mechanism is particularly
problematic. DC RELATION is deceptive: because it has only two
recommended sub-elements, it is possible to hide these
within one <meta > element by appending one of the values
to the attribute name and putting the other in the attribute
value - as in the early draft and illustrated by Sigfrid.
But when you have elements which need more than two sub-elements,
this trick is not possible.
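To make the trick concrete, here is a minimal Python sketch of how the
two RELATION sub-elements can be folded into a single <meta > element.
The function name relation_meta and the example.org URL are hypothetical,
chosen only for illustration; the Relation.IsBasedOn form follows the
construction discussed in this thread, not any agreed syntax.

```python
from html import escape


def relation_meta(rel_type, identifier):
    """Pack a relationship type and an identifier into one <meta> tag.

    The type is appended to the attribute name and the identifier goes
    in the attribute value. Hypothetical helper, for illustration only.
    """
    name = f"Relation.{rel_type}"
    return f'<meta name="{escape(name)}" content="{escape(identifier)}">'


# Hypothetical related resource; the URL is a placeholder.
tag = relation_meta("IsBasedOn", "http://example.org/original")
print(tag)
```

The helper has exactly two slots, the name and the content, which is
why an element with a third sub-element cannot be packed this way.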
This is not to say that we should make such tricks to shoe-horn
extended DC in HTML illegal, but we probably need to be careful
when doing this.
In particular it is possible to write rules about how to
express extended DC, as described in the DC datamodel,
in a dot.syntax notation:
* element and sub-element names represent labels on the arcs
in the arc-node graph,
* values represent the contents of leaf-nodes,
* dots represent empty nodes.
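As a rough illustration of these rules, the following Python sketch
walks a nested structure standing in for an arc-node graph and emits
dot.syntax names. The nested-dict representation, the function name
flatten, the spellings Type and Identifier, and the example.org URL are
all assumptions made for illustration; they are not part of the
datamodel itself.

```python
def flatten(node, path=()):
    """Yield (dotted_name, value) pairs for every leaf under `node`.

    Dict keys play the role of arc labels, each nested dict is an
    empty (intermediate) node that becomes a dot, and anything that
    is not a dict is treated as the content of a leaf node.
    """
    if isinstance(node, dict):
        for label, child in node.items():
            yield from flatten(child, path + (label,))
    else:
        yield ".".join(path), node


# Hypothetical record: one RELATION with its two sub-elements.
record = {
    "Relation": {
        "Type": "IsBasedOn",
        "Identifier": "http://example.org/original",
    }
}

pairs = list(flatten(record))
for name, value in pairs:
    print(f"{name} = {value}")
```

Under this reading the relationship type is a value at a leaf node
(Relation.Type = IsBasedOn), not part of the dotted name.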
Then if we find ourselves using the dot.syntax in a way
that is clearly different to this, we must look at it very
closely to decide if the gains outweigh the losses.
How to compress the information from extended DC to
basic DC is another question, for which I think similarly
simple rules can be written.
This is the true location of the current dispute/discussion:
to re-iterate, the datamodel imputes a strict meaning to
dot.syntax expressions of extended DC, whereas constructions
such as Relation.IsBasedOn appear to be using the dot.syntax
in a distinctly different way.
We need to resolve whether we will insist on the
strict interpretation of the dot.syntax for DC in HTML -
or whether we also sanction other usages, which in some
cases are already widely deployed.
In this regard the current RFCs and draft RFCs are
suffering a synchronisation problem. In particular,
RFC3 does not take much account of the datamodel
work, and the use of the dot.syntax largely
pre-dates the strict interpretation or rules for
constructing extended dot.elements which I have
sketched above.
There is also clearly a process issue here:
Sigfrid appears to be concerned that "decisions" made
in plenary session appear to have been overturned by
"recommendations" emanating from a later working group.
Fair point, I guess, but I will defend the recommendations
as being more consistent with other subsequent discoveries.
Discussions about RELATION were rarely far from the surface
at Helsinki, as a corollary of the growing acceptance of
the importance of the 1:1 principle. Formally this led to
one of the three break-out groups being specifically on
RELATION, with a report on their deliberations presented
in plenary. It was agreed in plenary that RELATION work
would continue on a list.
One of the first things that came up in discussion was
that the definition of the RELATION element that had
been used up to that time - ie an ID for the related resource
- needs supplementation with information specifying the
nature of the relationship in order to make it properly useful.
Thus, any RELATION actually needs two pieces of information,
an ID and a relationship indicator. In the subsequent work
on the list, a recommended set of relationship indicators
was developed (12 in all, representing two directions of six
relationships). This is the point at which the RELATION wg
ceased its deliberation and presented its report.
I hope this all clarifies rather than confuses.
--
__________________________________________________
Dr Simon Cox - Australian Geodynamics Cooperative Research Centre
CSIRO Exploration & Mining, PO Box 437, Nedlands, WA 6009 Australia
T: +61 8 9389 8421 F: +61 8 9389 1906
http://www.ned.dem.csiro.au/SimonCox/