tor 2006-03-23 klockan 23:44 +0000 skrev Rachel Heery:
> On Thu, 23 Mar 2006, Andy Powell wrote:
>
> >
> > http://dublincore.org/architecturewiki/DCPropertyDomainsRanges
> >
> > Given that this discussion is largely about semantics, we agreed at
> > todays Usage Board teleconf, that discussion about this list would move
> > into the remit of the UB. However, that doesn't mean that we aren't
> > interested in people's views. So, if you have comments on the above
> > document, please share them here.
>
> I think it would be useful to clarify to this list the underlying purpose
> of this exercise and the benefits.
That's very true, we do need to make clear to ourselves and the world
why this is important.
> Or is this being done in a spirit of
> 'enabling' unknown future benefits??
No, certainly not. The benefits are tangible and somewhat understood by
at least a few...
>
> As I understand it the aim of the exercise is to make precise distinctions
> about the "value space" of a property in machine-processable definitions.
> And that this is being done to indicate what can be inferred when a
> particular property is used in a triple.
It is not a case of trying to restrict the definitions, but rather to
make clear what the definitions *already* imply. Some of the definitions
are less than clearly formulated, for the benefit of no one... I'm
pretty sure the changes introduced are not meant to be semantic changes
at all.
The origin of these considerations is the DCMI Abstract Model, which
finally took a position on the question of "What is a 'value' anyway?".
We now know the exact distinction DCMI makes between a value and a value
string. Again, I don't believe that the DCAM really introduces anything
new, but the DCAM made it impossible to escape the issue of "values" by
blurring distinctions. Some of that blurring is present in the
definitions of the DCMI terms themselves, and we're trying to fix that.
>
> Can someone perhaps give some use cases of how this would be beneficial??
A reasonable request, again, even though it should be obvious that
clarity and consistency are virtues in their own right.
It seems clear that the main beneficiary at this point in time is the
Semantic Web community, simply because that community does the most
advanced machine processing of metadata.
If metadata is crafted and consumed purely by humans, precise ranges and
domains are generally unnecessary, as the meaning is (often) clear given
context, knowledge of the domain, and a good deal of good-will.
This is not always true, mind you. A non-native English speaker who
doesn't know the context might have *huge* problems, for example. And
there's no way a machine can help in that case.
However, when machines generate and consume metadata we cannot assume
them to be that smart. You already know this, of course. But look at the
following:
http://swoogle.umbc.edu/index.php?option=com_frontpage&service=relation&queryType=rel_swd_instance_range_p2c&searchString=http%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2Fcreator
which is a list of the Classes used for values of dc:creator (not a list
of values).
Where's the commonality? To a human, it's clear that most values (except
for some anomalies like foaf:maker) are actually entities like persons
or organizations. It's a fair guess that even the literal values denote
such entities. But how does a piece of software know that these
disparate classes denote the same *kind* of thing? The only way is
through a common definition - namely the definition of dc:creator. And
the way to tell a machine that "values of dc:creator are always Agents"
is to give it a range. That's the whole thing, really.
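The mechanism is just the RDFS range entailment rule. Here is a minimal
sketch of it in plain Python (tuples stand in for an RDF store, no RDF
library involved), assuming for illustration that dc:creator is declared
to have a range of dcterms:Agent, as discussed; the example resources
(ex:doc1, ex:alice, etc.) are made up:

```python
# Sketch of the RDFS range rule: if p has range C and (s, p, o) holds,
# then (o, rdf:type, C) is entailed.

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Declared ranges: predicate -> class (the proposed range of dc:creator)
ranges = {DC + "creator": DCTERMS + "Agent"}

# Metadata from two sources that type creators with different classes
triples = [
    ("ex:doc1", DC + "creator", "ex:alice"),
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:doc2", DC + "creator", "ex:acme"),
    ("ex:acme", RDF_TYPE, "ex:Organization"),
]

def infer_range_types(triples, ranges):
    """Apply the range rule: (p range C) + (s, p, o) entails (o, type, C)."""
    return {(o, RDF_TYPE, ranges[p]) for (s, p, o) in triples if p in ranges}

inferred = infer_range_types(triples, ranges)
# Both ex:alice and ex:acme are now typed as dcterms:Agent, no matter
# which vocabulary their sources originally used.
```

This is exactly what lets a piece of software conclude that a foaf:Person
and an ex:Organization used as values of dc:creator are the same *kind*
of thing: both are entailed to be Agents.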
This information would clearly be useful to Swoogle, which could improve
its search engine based on it.
> My concern would be that the accuracy of the values in manually crafted DC
> metadata would not support inference. DC metadata, as I see it, is
> intended to provide 'cheap and cheerful' minimal level descriptive
> metadata. Would any application want to start making inferences over such
> metadata?
All RDF data is potentially subject to inference. You won't know what
people will do with your data, and in the case of RDF, the
machine-semantics (the basis for inference) is specified in the RDF
specs themselves.
In other words: expressing metadata in RDF makes it subject to the
interpretations of *others* according to the rules of RDF. Thus, if
you're producing invalid RDF manually, you're not violating the rules of
DC but the rules of RDF. Maybe RDF shouldn't be used in those cases? Or
software should be used to make the data abide by the RDF rules?
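As one sketch of what such software might check, hypothetically: with an
Agent range declared for dc:creator, a statement whose value is a bare
string rather than a resource becomes suspect (the string probably
belongs in a value string instead). The data and the checker here are
illustrative, not a real tool:

```python
# Hypothetical checker: flag dc:creator statements whose value is a
# plain literal rather than a resource. An Agent range for dc:creator
# would make such values suspect.

DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Each value is tagged with its node kind: "uri" or "literal"
statements = [
    ("ex:doc1", DC_CREATOR, ("uri", "ex:alice")),
    ("ex:doc2", DC_CREATOR, ("literal", "Alice Smith")),
]

def literal_creators(statements):
    """Return the literal values used with dc:creator."""
    return [value for (s, p, (kind, value)) in statements
            if p == DC_CREATOR and kind == "literal"]

flagged = literal_creators(statements)  # ["Alice Smith"]
```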
>
> On the other hand if the intention is to be able to merge data from
> different sources on a 'partial understanding' basis I would be more
> convinced of the benefit...
Well, this is the Swoogle argument. Swoogle is just a single example of
a SW-based service that relies on the semantics of RDF, including
domains and ranges. *Any* generic RDF service would need to rely on that
information, and withholding it for the DC properties, even though it's
there in the definitions for human consumption, wouldn't be very
meaningful.
I'm not sure the above helped at all, but that's my (incomplete) view of
the situation. Feel free to correct me! :-)
/Mikael
>
> Rachel
>
> ---------------------------------------------------------------------------
> Rachel Heery
> UKOLN, University of Bath tel: +44 (0)1225 386724
> http://www.ukoln.ac.uk
>
--
Plus ça change, plus c'est la même chose