Andrew said:
> It's not clear to me why you would want to invent a new
> namespace for the definition of dctag when it was related to
> but not purporting to be dc:subject. This seems similar to
> what is done with encoding schemes, e.g., dct:DDC, dct:LCC,
> dct:LCSH, dct:MESH, dct:NLM, dct:TGN and dct:UDC. Taken from
> "Expressing Qualified Dublin Core in RDF /XML" document:
>
> <dc:subject>
> <dcterms:MESH>
> <rdf:value>D08.586.682.075.400</rdf:value>
> <rdfs:label>Formate Dehydrogenase</rdfs:label>
> </dcterms:MESH>
> </dc:subject>
>
> It seems like you could just define a new encoding scheme,
> e.g., dcterm:TAG, to handle the semantics of social tagging.
> However, that might not be enough. Organizations such as
> Flickr, YouTube, etc. may desire slightly different semantics
> for their social tagging. DCMI probably doesn't want to keep
> defining new encoding schemes on a regular basis.
Leaving aside the "which namespace do we use" issue for a second, I
think we need to be careful not to confuse two very different types of
thing, two different types of term used in DC metadata: properties and
vocabulary encoding schemes.
A property is a specific type of relationship. The dc:subject property
is one specific type of relationship, defined and named by DCMI (using a
DCMI-owned URI) and described by DCMI in human-readable terms as "The
topic of the content of the resource."
A vocabulary encoding scheme, on the other hand, is something quite
different. According to the DCMI Abstract Model, it is a class of which
the value is an instance. N.B. This is one of the areas where a change
is suggested in the proposed revisions to the DCAM - the suggestion is
that we change the concept of VES to something like "an enumerated set
of which the value is a member", and that is _not_ represented as an
instance/class relationship, but for the purposes of this discussion I
don't think that matters too much. The point is that a VES is a
different thing from a property and "plays a different role" in DC
metadata.
So, when I use the dc:subject property in an RDF triple or a DC
statement, I'm making an assertion that
resource:A has-as-topic resource:B
Or maybe more colloquially
resource:A is-about resource:B
I could specify that resource:B is an instance/member of dcterms:LCSH or
an instance/member of dcterms:DDC (i.e. I could specify a vocabulary
encoding scheme for the value). That provides some additional
information about the value - it's an instance/member of some specified
class/set - but that doesn't change the nature of the relationship that
I'm asserting between resource:A and resource:B. The property referred
to in my triple/statement is still the same: the dc:subject property.
I'm still asserting a "has-topic"/"is-about" relationship.
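
As a rough sketch of that point, here are the two cases written as plain Python tuples standing in for RDF triples (no RDF library involved). The example.org identifiers for resource:A and resource:B are made up for illustration, and the vocabulary encoding scheme is modelled as an rdf:type statement about the value, as in the RDF/XML quoted above:

```python
# Triples as plain (subject, predicate, object) tuples.
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS_LCSH = "http://purl.org/dc/terms/LCSH"

# Hypothetical identifiers for the two resources in the discussion.
RESOURCE_A = "http://example.org/resource/A"
RESOURCE_B = "http://example.org/resource/B"


def relationships(triples, subject, obj):
    """Return the set of properties that directly link subject to obj."""
    return {p for (s, p, o) in triples if s == subject and o == obj}


without_ves = {(RESOURCE_A, DC_SUBJECT, RESOURCE_B)}

# Specifying the encoding scheme adds a statement about the value only.
with_ves = without_ves | {(RESOURCE_B, RDF_TYPE, DCTERMS_LCSH)}

# The property relating A to B is dc:subject in both graphs.
print(relationships(without_ves, RESOURCE_A, RESOURCE_B)
      == relationships(with_ves, RESOURCE_A, RESOURCE_B))
```

The extra triple has resource:B as its subject, so it says something about the value and nothing new about how resource:A relates to it.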
If resource:B is a tag, and I use it as the object in an RDF triple or a
DC statement with the dc:subject property, then I'm making an assertion
that
resource:A has-as-topic tag:T
Or
resource:A is-about tag:T
I could specify that tag:T is an instance/member of petej:TagSet (i.e. I
could specify a vocabulary encoding scheme for that value), but - as in
the example above of LCSH and DDC - adding the vocabulary encoding
scheme provides additional information about the value, but it does not
change the assertion I am making about the nature of the relationship
between resource:A and tag:T. It's the property which specifies the
nature of the relationship, and as long as I'm using the dc:subject
property, I'm asserting a "has-topic"/"is-about" relationship.
And in my previous message, I was arguing that when people "tag"
resources, yes, they are asserting a relationship between the tagged
resource and a tag (but see also note below), but it is _not_ true that
the relationship they are asserting is always a "has-topic"/"is-about"
relationship. On the contrary, people use tagging to represent all sorts
of relationships - ownership, status, "rating", related-location. A
resource tagged "to-read" on del.icio.us isn't "about" the concept of
not having been read yet. Well, yes, I accept that somewhere out there
someone has written a weblog post describing the pile of paperbacks on
their bedside table and a del.icio.us user has indeed tagged it as
"to-read" with that notion in mind, but in the vast majority of cases
that is not what the tagger means! ;-)
So representing all tagged-resource/tag relationships as statements
using the dc:subject property not only fails to capture the particular
relationship that someone had in mind when they tagged a resource, but
asserts a relationship which - in many cases - the tagger did not
intend.
(Actually, I should have highlighted yesterday that del.icio.us doesn't
only represent tags using dc:subject, it also uses the property
http://purl.org/rss/1.0/modules/taxonomy/topics from the RSS taxonomy
module. But I'd suggest that the same issue arises. Tagging is used with
intent other than to indicate a has-topic relationship.)
So, in the general case, a property other than dc:subject (or
taxo:topic/taxo:topics) would be required. You could argue that the
dc:relation property does the job - there is some unspecified type of
relationship between the resource and the tag - or you could argue for
a more specific "is-associated-with-tag" or "is-tagged-with" property.
I'd argue against putting "subject" in the name/URI because I think we
want to avoid suggesting (even to a human reader) any relationship with
the dc:subject property.
And indeed the ontology I referred to yesterday provides such a property
http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag
Described as "Indicates that the subject has been tagged with the object
tag. This does not assert by who, when, or why the tagging occurred. For
that information, use a reified Tagging resource."
So we can say
resource:A tags:taggedWithTag tag:T
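
To illustrate the difference this makes to a consumer of the data, here is a small sketch, again using plain Python tuples as triples. The article and tag URIs are hypothetical, and topics() is an invented helper that stands for any application which reads dc:subject values as "what the resource is about":

```python
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
TAGGED_WITH_TAG = "http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag"

# Hypothetical identifiers, for illustration only.
ARTICLE = "http://example.org/some-article"
TO_READ = "http://example.org/tags/to-read"


def topics(triples, resource):
    """What a dc:subject-aware application would report as the topics."""
    return {o for (s, p, o) in triples if s == resource and p == DC_SUBJECT}


# The same act of tagging, represented in two different ways.
as_subject = {(ARTICLE, DC_SUBJECT, TO_READ)}
as_tag = {(ARTICLE, TAGGED_WITH_TAG, TO_READ)}

print(topics(as_subject, ARTICLE))  # "to-read" is reported as a topic
print(topics(as_tag, ARTICLE))      # no topic is claimed
```

With the dc:subject representation the application concludes that the article is about "to-read"; with the tagging property it makes no claim about topic at all.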
The final part of that description is what I was referring to in my "but
see also note below" above. Depending on what information it is
desirable/necessary/useful to capture about the "tagging", then you may
wish to adopt the approach of describing that "event" in more detail. If
I understand it correctly, the ontology supports both the simple
resource:A tags:taggedWithTag tag:T
approach, and it also supports a richer, more complex approach which
seeks to represent more of the "context", particularly the agent who
performed it and the point in time they did so, by representing a
"tagging event" as a resource ("reifying the tagging").
See http://www.holygoat.co.uk/projects/tags/ for more discussion and
examples.
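
A very rough sketch of the relationship between the two approaches, in plain Python rather than RDF. The Tagging class and its field names here are my own placeholders, not the ontology's actual class and property names (take those from the ontology documentation), and the agent, date and URIs are invented:

```python
from dataclasses import dataclass
from datetime import date

TAGGED_WITH_TAG = "http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag"


@dataclass(frozen=True)
class Tagging:
    """A 'tagging event' treated as a thing in its own right.

    Field names are hypothetical placeholders for illustration.
    """
    resource: str    # the resource that was tagged
    tag: str         # the tag that was applied
    tagged_by: str   # the agent who did the tagging
    tagged_on: date  # when they did it


def to_simple_triple(tagging):
    """Reduce the richer description to the simple resource/tag triple."""
    return (tagging.resource, TAGGED_WITH_TAG, tagging.tag)


# Invented example data.
event = Tagging(
    resource="http://example.org/some-article",
    tag="http://example.org/tags/to-read",
    tagged_by="http://example.org/people/alice",
    tagged_on=date(2006, 1, 1),
)

print(to_simple_triple(event))
```

The simple triple can be derived from the richer description, but not the other way round: the who and when are only available if the tagging itself is described.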
> However, DCMI would not need to define new encoding schemes
> on a regular basis since the above qualified Dublin Core
> really boils down to:
>
> <dc:subject>
> <rdf:Description>
> <rdf:type rdf:resource="http://purl.org/dc/terms/MESH"/>
> <rdf:value>D08.586.682.075.400</rdf:value>
> <rdfs:label>Formate Dehydrogenase</rdfs:label>
> </rdf:Description>
> </dc:subject>
>
> Which generates the same exact RDF triples. This implies
> that anybody can create new encoding schemes and semantics,
> albeit not in the dcterms: namespace since it is controlled
> by DCMI, and still be compatible with the DCMI model. So if
> Flickr wanted to define their definition of tags they could just do:
>
> <dc:subject>
> <rdf:Description>
> <rdf:type rdf:resource="http://www.flickr.com/photos/tags/"/>
> <rdf:value>D08.586.682.075.400</rdf:value>
> <rdfs:label>Formate Dehydrogenase</rdfs:label>
> </rdf:Description>
> </dc:subject>
>
> Which would provide interoperability with Dublin Core without
> DCMI lifting a finger. Internally at OCLC, for research
> projects, we have been using this interoperability practice
> for defining new encoding schemes to controlled vocabularies
> that DCMI has not defined.
> For example, GSAFD, NGL, RVM, etc.
Oh, yes, in terms of the name/URI for the term, I quite agree that
there's no requirement that a DCMI-owned URI is assigned. I don't mind
whether it's a DCMI-owned URI or a URI owned by another agency (as long
as it's an agency I trust to (a) manage their URIs sensibly so as to
ensure a "reasonable" degree of persistence and (b) provide consistent
representations of the identified resources in a way which makes those
representations accessible to my tools using simple widely-deployed
mechanisms in accordance with W3C guidelines).
But (IMHO) the requirement would not be satisfied by coining a new
vocabulary encoding scheme, whether that scheme was identified by a
DCMI-owned URI or a URI owned by another agency.
Pete