Hi there,
I am looking for advice on best practice for refining/extending Dublin Core. Specifically, we are constrained to XML Schema, so RDF options are not possible. My task is to create an XML Schema that holds metadata about resources and processes. I feel it is important to reuse existing standards where possible rather than reinventing the wheel, so I am looking to the DCMI Metadata Terms.
Some existing DCMI Metadata Terms are a natural fit for this, e.g. title, language, publisher. However, others lack the precise semantics that we need. I have identified two possible ways in which we could add this precision to Dublin Core, but I am unsure of the best approach and would appreciate some advice from the Dublin Core perspective.
Consider, for example, dcterms:spatial. A simple literal value is not enough for us; we need a complex addressing mechanism.
Approach 1, make use of xsi:type by extending dc:SimpleLiteral -
Example XML -
<dcterms:spatial xsi:type="m:address">
  <m:address-line>102</m:address-line>
  <m:address-line>Some Road</m:address-line>
  <m:settlement xsi:type="dcterms:TGN">York</m:settlement>
  <m:county>Yorkshire</m:county>
  <m:country>GB</m:country>
</dcterms:spatial>
Example XML Schema snippet for 'm' -
<xs:complexType name="address">
  <xs:annotation>
    <xs:documentation xml:lang="eng">Refinement of the spatial coverage for use in Dublin Core's dcterms:spatial, e.g. the geographical area/address represented</xs:documentation>
  </xs:annotation>
  <xs:complexContent mixed="true">
    <xs:extension base="dc:SimpleLiteral">
      <xs:sequence>
        <xs:choice maxOccurs="unbounded">
          <xs:element name="address-line" type="xs:string" maxOccurs="unbounded"/>
          <xs:element name="settlement" type="xs:string">
            <xs:annotation>
              <xs:documentation>Should consider restricting values to a controlled vocabulary such as Getty TGN through xsi:type="dcterms:TGN"</xs:documentation>
              <xs:appinfo>
                <dcterms:hasFormat>http://purl.org/dc/terms/#ves-TGN</dcterms:hasFormat>
              </xs:appinfo>
            </xs:annotation>
          </xs:element>
          <xs:element name="county" type="xs:string">
            <xs:annotation>
              <xs:documentation>Should consider restricting values to a controlled vocabulary such as Getty TGN through xsi:type="dcterms:TGN"</xs:documentation>
              <xs:appinfo>
                <dcterms:hasFormat>http://purl.org/dc/terms/#ves-TGN</dcterms:hasFormat>
              </xs:appinfo>
            </xs:annotation>
          </xs:element>
          <xs:element name="country" type="stl:ISO3166CountyCode"/>
          <xs:element name="postal-code" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
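One detail worth checking in Approach 1: xsi:type may only name a type that is derived from the element's declared type. Assuming dcterms:TGN is defined in the DCMI XSD as a restriction of dc:SimpleLiteral (this should be confirmed against the published dcterms.xsd), an element declared as xs:string could not be narrowed to it, so m:settlement would probably need to be declared as dc:SimpleLiteral. A minimal sketch follows; the 'm' namespace URI and the schemaLocation are hypothetical, and the element is shown as a global declaration only for brevity.

```xml
<!-- Sketch only. The "m" namespace URI and the schemaLocation are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           targetNamespace="http://example.org/m"
           elementFormDefault="qualified">

  <xs:import namespace="http://purl.org/dc/elements/1.1/" schemaLocation="dc.xsd"/>

  <!-- Declared as dc:SimpleLiteral (not xs:string) so that an instance may
       narrow it with xsi:type="dcterms:TGN", assuming dcterms:TGN is
       derived from dc:SimpleLiteral in the DCMI schema. -->
  <xs:element name="settlement" type="dc:SimpleLiteral"/>

</xs:schema>
```

The same consideration would apply to m:county if it is also meant to carry xsi:type="dcterms:TGN".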
Approach 2, make use of substitution groups -
Example XML -
<m:address>
  <m:address-line>102</m:address-line>
  <m:address-line>Some Road</m:address-line>
  <m:settlement>York</m:settlement>
  <m:county>Yorkshire</m:county>
  <m:country>GB</m:country>
</m:address>
Example XML Schema for 'm' -
<xs:element name="address" type="addressType" substitutionGroup="dcterms:spatial"/>
<xs:complexType name="addressType">
  <xs:sequence>
    <xs:choice maxOccurs="unbounded">
      <xs:element name="address-line" type="xs:string" maxOccurs="unbounded"/>
      <xs:element name="settlement" type="xs:string">
        <xs:annotation>
          <xs:documentation>Should consider restricting values to a controlled vocabulary such as Getty TGN through xsi:type="dcterms:TGN"</xs:documentation>
          <xs:appinfo>
            <dcterms:hasFormat>http://purl.org/dc/terms/#ves-TGN</dcterms:hasFormat>
          </xs:appinfo>
        </xs:annotation>
      </xs:element>
      <xs:element name="county" type="xs:string">
        <xs:annotation>
          <xs:documentation>Should consider restricting values to a controlled vocabulary such as Getty TGN through xsi:type="dcterms:TGN"</xs:documentation>
          <xs:appinfo>
            <dcterms:hasFormat>http://purl.org/dc/terms/#ves-TGN</dcterms:hasFormat>
          </xs:appinfo>
        </xs:annotation>
      </xs:element>
      <xs:element name="country" type="stl:ISO3166CountyCode"/>
      <xs:element name="postal-code" type="xs:string"/>
    </xs:choice>
  </xs:sequence>
</xs:complexType>
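A caveat for Approach 2: XML Schema requires the type of a substitution group member to be the same as, or derived from, the type of the head element. Assuming dcterms:spatial is ultimately typed as dc:SimpleLiteral in the DCMI XSD, addressType would itself have to derive from dc:SimpleLiteral for the substitution to be valid. A hedged sketch of that variant is below; the 'm' namespace URI and schemaLocation values are hypothetical, and the country element is left out because the 'stl' schema is not shown here.

```xml
<!-- Sketch only. Namespace URI for "m" and both schemaLocation values are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:dcterms="http://purl.org/dc/terms/"
           xmlns:m="http://example.org/m"
           targetNamespace="http://example.org/m"
           elementFormDefault="qualified">

  <xs:import namespace="http://purl.org/dc/elements/1.1/" schemaLocation="dc.xsd"/>
  <xs:import namespace="http://purl.org/dc/terms/" schemaLocation="dcterms.xsd"/>

  <!-- m:address may appear wherever dcterms:spatial is allowed -->
  <xs:element name="address" type="m:addressType" substitutionGroup="dcterms:spatial"/>

  <!-- Derived from dc:SimpleLiteral so the substitution is type-compatible;
       mixed="true" is needed because the base type is assumed to be mixed -->
  <xs:complexType name="addressType">
    <xs:complexContent mixed="true">
      <xs:extension base="dc:SimpleLiteral">
        <xs:choice maxOccurs="unbounded">
          <xs:element name="address-line" type="xs:string"/>
          <xs:element name="settlement" type="xs:string"/>
          <xs:element name="county" type="xs:string"/>
          <xs:element name="postal-code" type="xs:string"/>
        </xs:choice>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

With this change the two approaches share the same type derivation and differ mainly in how the instance is written: xsi:type on dcterms:spatial versus a separately named m:address element.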
My feeling is that the first approach is preferable: when you look at or process the metadata, it is absolutely explicit that this is a dcterms:spatial. If you received this document as a third party and, as a minimum, were only able to extract the text nodes, you would still have a strictly dcterms-compliant spatial property.
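To illustrate the text-node fallback, here is a minimal XSLT 1.0 sketch of what a third party could do with an Approach 1 instance: copy everything unchanged, but collapse each dcterms:spatial to the concatenation of its text nodes. This is an illustration under the assumption that the consumer is free to drop the m: structure and the xsi:type attribute; it is not part of any DCMI recommendation.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dcterms="http://purl.org/dc/terms/">

  <!-- Identity template: copy all other nodes and attributes as they are -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the structured address with its whitespace-normalised text.
       The xsi:type attribute and the m:* child elements are discarded. -->
  <xsl:template match="dcterms:spatial">
    <dcterms:spatial>
      <xsl:value-of select="normalize-space(.)"/>
    </dcterms:spatial>
  </xsl:template>

</xsl:stylesheet>
```

Note that the resulting literal is only readable if the instance has whitespace between the child elements, since the text nodes are simply concatenated.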
One downside of the first approach is that we have to sprinkle xsi:type attributes through our XML instances. However, there is also an advantage: if in future we encounter values that we have not anticipated, we can simply create new complexTypes for refinement, or even fall back to the lowest common denominator, i.e. dc:SimpleLiteral, which is effectively just an xs:string.
I would appreciate thoughts on this from the Dublin Core perspective. How have others done this in the past, and what is the recommended best practice?
Thanks, Adam
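For comparison, the lowest-common-denominator form under the first approach would be the same element with no xsi:type and a plain literal value. The way the address is flattened into one string here is only an assumption for illustration:

```xml
<!-- Plain dc:SimpleLiteral usage: no xsi:type, no m:* children -->
<dcterms:spatial xmlns:dcterms="http://purl.org/dc/terms/">102 Some Road, York, Yorkshire, GB</dcterms:spatial>
```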