I think two very useful and important requirements are being discussed
here. My assertion is based in part on reading (and inferring from) portions
of this thread, in part on various discussions with Rael and Danbri last week
at DC8, and in part on my bias from having been part of previous discussions
in the DC community (I'll explain how this relates in a bit) [*].
Because of this last point, however, I'm cc'ing my response to dc-general.
dc-general folk: the RSS thread can be found at
http://www.egroups.com/message/rss-dev/700?threaded=1
For context to dc-general (errr... and to indicate to RSS at least my
interpretation of the RSS taxonomy module): the RSS Taxonomy module is trying
to introduce a simple capability to dc:subject that indicates a resource (in
this case an Item or a Channel) has a subject whose value is either a text
string *or* a resource (in this case a Topic) from a controlled vocabulary.
There is a bit more to this, but again I'll get to this in a moment [*].
Ok... For the first case; indicating some resource has a subject whose value
is a literal:
For the modeling inclined:
Channel --> dc:subject --> "metadata"
and for the syntax inclined:
<rss:channel rdf:about="http://uri-of-resource">
<dc:subject>metadata</dc:subject>
</rss:channel>
Easy enough, and indeed very useful. And in this case there is no need to
introduce a taxonomy vocabulary; dc:subject, I believe, satisfies the
semantics that you're after (the fewer vocabularies the better, eh?). From
this statement, we can begin to organize and search resources by subject.
The hope here is that we collectively share the same idea about what
"metadata" means... no? Then perhaps the second option may be better.
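To make the first case concrete, here is a minimal Python sketch of how a
consumer might pull the literal subject out of such a record. It assumes the
fragment is wrapped in an rdf:RDF element and that the rss prefix maps to the
RSS 1.0 namespace; neither is declared in the snippet above, and the function
name literal_subjects is purely illustrative.

```python
import xml.etree.ElementTree as ET

# rdf and dc are the standard namespace URIs. The rss URI is an assumption:
# the message never declares it, so the RSS 1.0 namespace is used here.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The channel fragment from the message, wrapped in rdf:RDF so it parses.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rss="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rss:channel rdf:about="http://uri-of-resource">
    <dc:subject>metadata</dc:subject>
  </rss:channel>
</rdf:RDF>
"""


def literal_subjects(xml_text):
    """Return (channel URI, subject string) pairs for literal dc:subject values."""
    root = ET.fromstring(xml_text)
    about = "{%s}about" % NS["rdf"]
    pairs = []
    for channel in root.findall("rss:channel", NS):
        for subject in channel.findall("dc:subject", NS):
            text = (subject.text or "").strip()
            if text:  # skip subjects that carry no literal text
                pairs.append((channel.get(about), text))
    return pairs


print(literal_subjects(DOC))
```

Anything searching on the resulting strings still depends on everyone
spelling and meaning "metadata" the same way.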
The second case; indicating some resource has a subject whose value is a
uniquely identified topic:
For the modeling inclined:
Channel --> dc:subject --> Topic
and for the syntax inclined:
<rss:channel rdf:about="http://uri-of-resource">
<dc:subject rdf:resource = "http://uri-of-topic-about-metadata" />
</rss:channel>
Again, easy enough... here, however, we're making the subject of the
resource *explicitly* clear. Again, we can organize resources by subject, but
in this case we can do so in a clear and unambiguous way. By choosing a
naming convention (e.g. URIs) for identifying Topics, we can now use the web's
notion of identity (e.g. two objects are the same if they have the same URI).
Much more useful :)
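As a rough illustration of that identity rule, the Python sketch below groups
channels by the URI found in rdf:resource, so two channels share a subject
exactly when the URIs match. The rdf:RDF wrapper, the RSS 1.0 namespace URI,
the second channel (http://uri-of-another-resource) and the function name
channels_by_topic are all assumptions added for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {
    "rdf": RDF,
    "rss": "http://purl.org/rss/1.0/",  # assumed; not declared in the message
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Two channels pointing at the same topic URI. The second channel is
# hypothetical and exists only to show the grouping.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rss="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rss:channel rdf:about="http://uri-of-resource">
    <dc:subject rdf:resource="http://uri-of-topic-about-metadata"/>
  </rss:channel>
  <rss:channel rdf:about="http://uri-of-another-resource">
    <dc:subject rdf:resource="http://uri-of-topic-about-metadata"/>
  </rss:channel>
</rdf:RDF>
"""


def channels_by_topic(xml_text):
    """Map each topic URI to the list of channel URIs that name it as subject."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for channel in root.findall("rss:channel", NS):
        for subject in channel.findall("dc:subject", NS):
            topic_uri = subject.get("{%s}resource" % RDF)
            if topic_uri is not None:  # ignore literal-valued subjects
                index[topic_uri].append(channel.get("{%s}about" % RDF))
    return dict(index)


print(channels_by_topic(DOC))
```

No string matching on "metadata" is involved; the comparison is on the URI
alone.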
[*] Ok... here is part of the reason I'm copying in some of the DC crowd.
The Dublin Core community has had for a long time a notion of "default
value" for elements. In its simplest form, a default value is a simple
"appropriate literal" associated with a complex object. If you don't
understand any of the other semantics used for describing an object, you at
least understand the default. It's considered "good practice" in the DC
community to specify the default value. I think this is what some of the
people in RSS are trying to do as well.
For the first case above... the default value or "appropriate literal" is
simply the string inside the element.
For the second case, no default value has been provided in the instance
data. There may indeed be a default value once
http://uri-of-topic-about-metadata is dereferenced. But as a helpful hint,
storing the default value in the instance data is often useful to
applications that don't want to initially harvest everything in order to
begin to process the metadata.
This is the point where <rdf:value> is useful. <rdf:value> can be thought
of as the "default value" associated with a complex resource. So given the
above second example, we would have:
The model:
Channel --> dc:subject --> Topic
Topic --> rdf:type ----> tx:Topic
Topic --> rdf:value ---> "metadata"
(If one were to indeed dereference the Topic URI, there might be more to
this model, e.g. how Topics relate to other Topics, but let's stick with
just the model from the metadata record.)
So the syntax associated with this would be:
<rss:channel rdf:about="http://uri-of-resource">
<dc:subject>
<tx:Topic rdf:about = "http://uri-of-topic-about-metadata">
<rdf:value>metadata</rdf:value>
</tx:Topic>
</dc:subject>
</rss:channel>
Now... I've introduced here the notion of a "tx:Topic" (which I think is in
part what the taxonomy module is trying to do). It's not a relation between
two resources but an actual resource. This construct is actually quite
useful. Even if you don't understand the semantics of what a "tx:Topic" is,
you know its default value is "metadata". So for those processing this
from a strictly syntactic view, you can still get an "appropriate literal"
for dc:subject... in this case "metadata".
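A small Python sketch of that strictly syntactic view: for each dc:subject it
returns the literal text if there is one, otherwise the rdf:value of whatever
typed node is nested inside (without needing to know what tx:Topic means),
and otherwise nothing, since a bare rdf:resource carries no default in the
instance data. The tx namespace URI (http://example.org/tx#), the rdf:RDF
wrapper, the RSS 1.0 namespace URI, the numbered channel URIs and the
function names are assumptions made for the example, not part of the module.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {
    "rdf": RDF,
    "rss": "http://purl.org/rss/1.0/",  # assumed; not declared in the message
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The three forms of dc:subject discussed in the message. The tx namespace
# URI and the channel URIs are placeholders invented for this example.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rss="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:tx="http://example.org/tx#">
  <rss:channel rdf:about="http://uri-of-resource-1">
    <dc:subject>metadata</dc:subject>
  </rss:channel>
  <rss:channel rdf:about="http://uri-of-resource-2">
    <dc:subject rdf:resource="http://uri-of-topic-about-metadata"/>
  </rss:channel>
  <rss:channel rdf:about="http://uri-of-resource-3">
    <dc:subject>
      <tx:Topic rdf:about="http://uri-of-topic-about-metadata">
        <rdf:value>metadata</rdf:value>
      </tx:Topic>
    </dc:subject>
  </rss:channel>
</rdf:RDF>
"""


def appropriate_literal(subject):
    """Return the default string for one dc:subject element, or None."""
    text = (subject.text or "").strip()
    if text:
        return text  # first case: the literal is the element content
    for node in subject:  # nested typed node, whatever its type
        value = node.find("rdf:value", NS)
        if value is not None and (value.text or "").strip():
            return value.text.strip()
    return None  # bare rdf:resource: no default without dereferencing


def subject_literals(xml_text):
    """Return (channel URI, default subject string or None) pairs."""
    root = ET.fromstring(xml_text)
    about = "{%s}about" % RDF
    return [
        (channel.get(about), appropriate_literal(subject))
        for channel in root.findall("rss:channel", NS)
        for subject in channel.findall("dc:subject", NS)
    ]


for channel_uri, literal in subject_literals(DOC):
    print(channel_uri, "->", literal)
```

The None result for the second channel is the case where an application
would have to dereference the topic URI to find a default value.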
Now... there has recently been some discussion on saying, in the instance
data, what kind of "scheme" (which I think actually means in this case the
agency that is defining the topic?) is associated with the Topic being
identified. To me, this starts creeping into the question of how much
information we really want to "cache" (or duplicate, if you will) in the
instance data and how much we want to store with the resource that is
being dereferenced (in this case, the description of
http://uri-of-topic-about-metadata). The agency defining the concept? The
relationships to other concepts in the knowledge space? Other alternative
spellings of the concept? etc...
This decision ultimately is left to the application developers and content
providers and in this case the people producing and consuming RSS feeds. As
such, since I'm not really an RSS developer and I don't really have a lot of
content, I'm not really qualified to make this call. However, as a simple,
small point on the graph, my work with Dublin Core applications in RDF
suggests that simply identifying the type of object it is (in this case a
tx:Topic, but one could imagine tx:Person or tx:Place as well) along with a
default value (rdf:value) satisfies a tremendous number of requirements.
As such, I'd suggest starting with these simple vocabularies in the
taxonomy module. Deploy, gain some experience, and then re-evaluate this
decision.
Hope this helps...
--
Eric Miller http://purl.oclc.org/net/eric
Senior Research Scientist
Office of Research phone:614.764.6109
OCLC Online Computer Library Center fax:614.764.2344