Hi all,
I like how this update focuses the document, though I agree with Jon that I'm not sure I've interpreted parts of it correctly without examples to compare against. I guess my comments below are partly "devil's advocate" to test out parts of the model. I realise some of my comments relate to parts that haven't changed; it's just that I only noticed them this time...
1. "A description is made up of... zero or one resource URI" - I would think it would be valid for a single resource to have multiple identifiers, especially for physical resources?
2. "A statement... is made up of... zero or one value URI" - Similarly for values there could be multiple URIs, especially since the model states a value may be a member of more than one vocabulary (and so is very likely to have multiple URIs)?
3. I now know how to decode a rich representation's octets by looking at the media type, but if there are multiple rich representations I can't differentiate between them (which is possible with multiple value strings). Value strings can be differentiated by language or syntax encoding, but this is not available for rich representations (eg. different JPEGs/videos for multiple languages, XML in different schemas/DTDs). This seems inconsistent. (I think I may have raised this before?)
4. Related to (3) above, it is not stated whether a value string or rich representation is also a resource (or not), and so can be described.
5. "Each resource may be a member of one or more vocabulary encoding schemes" - I feel a little bit uneasy about this statement. While it is certainly true (eg. if one vocabulary includes terms from another), we would need to be very careful in stating that a particular resource/term is included in multiple vocabularies, as each occurrence may appear to be the same but in fact be subtly different. For example, the concept "New Zealand" may occur in both ISO3166 and Getty's Thesaurus of Geographic Names, but upon closer inspection we may discover that their definitions of what "New Zealand" includes differ (such as including different outlying islands) [NB: I don't know whether that is true]. In that case it doesn't occur in two vocabularies at all, as there are actually two different concepts/terms. I realise how people interpret this is outside the scope of the model, but it is likely to be a common trap (it reminds me of the issue of thinking XML namespaces can be re-used as RDF properties).
6. Section 5 "DCMI Abstract Model semantics" - I'm assuming this section is to provide guidance on the model's relationship to RDF? Key things I'd want to know are:
a). What does-not/will-not map to RDF (the table only shows what _does_ map)
b). Can this model be used directly in the RDF world, ie. if I use this will it break in an RDF environment?
Where I'm coming from: I don't profess to know a lot about the RDF world, but it seems to me that a generic RDF application will have to at least have a base set of properties it understands. It is possible to discover what any unknown resource or property is by following defined RDFS or OWL relationships, but ultimately your application ends up at a text description that will probably need a human to interpret (either by giving realtime feedback on the text descriptions or by programming support into the application's code for future encounters). The DCMI Terms set seems to me to be a good base that all applications could be coded to understand; then all properties that sub-property off them at some level could be understood. So I'm just wondering if our abstract model is sufficiently robust to play that role?
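To make that concrete, here is a rough Python sketch of the kind of fallback I have in mind. It is only an illustration under my own assumptions: the "ex:" property names are invented, the sub-property table is hand-written rather than read from any real schema, and each property is given a single parent for simplicity (RDFS allows several).

```python
# Hypothetical sub-property declarations (child -> parent), hand-written for
# illustration only; a real application would read these from RDFS schemas.
SUB_PROPERTY_OF = {
    "ex:dateDigitised": "dcterms:created",
    "dcterms:created": "dc:date",
    "ex:shelfMark": "ex:localIdentifier",
}

# The base set of properties the application has been coded to understand.
KNOWN_BASE = {"dc:date", "dc:title", "dc:creator"}


def resolve_to_known(prop, sub_property_of=SUB_PROPERTY_OF, known=KNOWN_BASE):
    """Walk up the sub-property chain until a known property is found.

    Returns the known property, or None if the chain ends (or loops)
    before reaching one, which is the point where a human is needed.
    """
    seen = set()
    while prop is not None and prop not in seen:
        if prop in known:
            return prop
        seen.add(prop)
        prop = sub_property_of.get(prop)
    return None
```

With this table, "ex:dateDigitised" can be treated as a kind of "dc:date", whereas "ex:shelfMark" never reaches a known property and would be left for a human to interpret.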
7. Terminology "description set" - suggest "A set of one or more descriptions [insert: about one or more resources]."
8. Terminology "has domain" - suggest add to end: "It may be useful to think of it as the opposite of "class (of the described resource) has property"." - paraphrased from the RDF Schema introduction.
9. Terminology "has range" - I wonder if including the described resource is necessary. Might it be simpler to say: "A relationship between a property and a class which indicates that if a property/value pair contains that property, then it follows that the value in the property/value pair is an instance of that property's related class."? While longer, this seems to be more direct about what is related. Or is this better?: "A relationship between a property and a class which indicates that the value in a property/value pair is an instance of that property's related class."
10. Terminology "property/value pair" - suggest "The combination of a property and a value, used to describe [insert: one property of] a resource."
11. Terminology "syntax encoding scheme" - It is not clear what the difference is between a set of strings that have been enumerated from some syntax rules (such as ISO 3166) and a vocabulary encoding scheme?
12. "Resource URI" is not in the terminology section.
13. Section 2 "DCMI Abstract Model" - I'd like to suggest this has three sub-sections to improve readability:
- 2.1 Resources
- 2.2 Descriptions
- 2.3 Vocabularies
Because these are currently all in one section I found I didn't realise I was moving into a new area/view. This is particularly so with the vocabularies part which offers a different/wider view. I don't think the notes at the end belong as part of the "Abstract Model" section and could sit in another section (as sections 3 and 4 are also essentially notes or a continuation, eg. section 3 is the first mention of "record" even though it appears in the UML diagram).
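Coming back to comment 3, here is a rough Python sketch of the inconsistency as I read it. The class and field names are my own shorthand, not the model's, and the octets are placeholders; the only assumption taken from the document is that a value string can carry a language or syntax encoding scheme while a rich representation carries just a media type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValueString:
    text: str
    language: Optional[str] = None                # eg. "en", "mi"
    syntax_encoding_scheme: Optional[str] = None  # eg. a URI for W3CDTF


@dataclass(frozen=True)
class RichRepresentation:
    octets: bytes
    media_type: str  # the only attribute available to tell them apart


def pick_value_strings(strings, language):
    """Select the value strings in a given language."""
    return [s for s in strings if s.language == language]


def pick_rich_representations(reps, media_type):
    """Select rich representations; media type is all we can filter on."""
    return [r for r in reps if r.media_type == media_type]


strings = [
    ValueString("New Zealand", language="en"),
    ValueString("Aotearoa", language="mi"),
]
reps = [
    RichRepresentation(b"placeholder: image captioned in English", "image/jpeg"),
    RichRepresentation(b"placeholder: image captioned in Maori", "image/jpeg"),
]
```

Filtering the value strings by language gives exactly one result, but filtering the two JPEGs by media type returns both, with nothing further to choose between them.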
Thanx,
Douglas