Hello Deane,
> Hi all. I have a question about providing descriptive content
> in multiple languages for a single resource.
>
> In the Government of Canada we have the added dimension of 2
> official languages (English and French). Our core application
> profile requires that any resources in which intellectual
> content is provided in both languages must be described in
> both languages. It seems to me that we are missing a
> mechanism to group elements with content in the same language
> with that language attribute - without this capability, can
> machine understanding be guaranteed? E.g. for resources with
> content in both English and French (every home page in our
> domain) one instance of dc.language with value "eng" and the
> other with value "fre" (and a scheme specified), and 2 sets
> of metadata with parallel contents in the 2 languages, how do
> indexers "know" that elements with contents in French ARE in
> French?
Just to clarify here... In a metadata record conforming to this
application profile, one or more occurrences of the dc:language metadata
element would be used to specify the language(s) of the resource being
described - in this context, the language(s) of the items in the
collection. These occurrences of the dc:language element do not say
anything about the language of the values of elements in the metadata
record. That is signalled using syntax-dependent mechanisms; for a record
in XML (and RDF/XML) it is typically done with the xml:lang attribute.
So taking the example of a CLD describing a collection that contains
items in Spanish and German, and providing that metadata in English and
French....
Following the conventions of the Guidelines for Implementing DC in XML
at http://dublincore.org/documents/dc-xml-guidelines/
I imagine the record (omitting XML namespace declarations for brevity)
would look something like:
<my:cld>
<dc:identifier>http://example.org/mycoll</dc:identifier>
<dc:title xml:lang="en">My collection</dc:title>
<dc:title xml:lang="fr">Ma collection</dc:title>
<dc:description xml:lang="en">My collection is composed of...</dc:description>
<dc:description xml:lang="fr">Ma collection se compose de...</dc:description>
<!-- languages of items in collection -->
<dc:language xsi:type="dcterms:RFC3066">es</dc:language>
<dc:language xsi:type="dcterms:RFC3066">de</dc:language>
</my:cld>
And following Expressing QDC in RDF/XML
http://dublincore.org/documents/dcq-rdf-xml/, something like:
<rdf:RDF>
<rdf:Description rdf:about="http://example.org/mycoll">
<dc:title xml:lang="en">My collection</dc:title>
<dc:title xml:lang="fr">Ma collection</dc:title>
<dc:description xml:lang="en">My collection is composed of...</dc:description>
<dc:description xml:lang="fr">Ma collection se compose de...</dc:description>
<!-- languages of items in collection -->
<dc:language>
<dcterms:RFC3066>
<rdf:value>es</rdf:value>
</dcterms:RFC3066>
</dc:language>
<dc:language>
<dcterms:RFC3066>
<rdf:value>de</rdf:value>
</dcterms:RFC3066>
</dc:language>
</rdf:Description>
</rdf:RDF>
I used Spanish and German to highlight the differences between the
languages of the items in the collection and the languages of the
literal values in the metadata record but both pairs could be the same.
So I think the grouping or filtering of the metadata elements by the
language of the values would be accomplished using the values of the
xml:lang attribute (or the language identifier of the literal in the RDF
graph), not the value of the dc:language metadata element.
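To make that concrete, here is a rough sketch of how an indexer might do the grouping for the plain XML form of the record. It is only an illustration under my own assumptions: the function names and the namespace URI bound to the my: prefix are made up for the example, and it only reads xml:lang where it appears directly on an element (it does not handle xml:lang inherited from a parent element).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The xml: prefix is always bound to this namespace, so ElementTree
# exposes xml:lang under this expanded attribute name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Same record as the XML example, with namespace declarations added.
# The URI for the my: prefix is a placeholder (assumption).
RECORD = """\
<my:cld xmlns:my="http://example.org/ns/my#"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:identifier>http://example.org/mycoll</dc:identifier>
  <dc:title xml:lang="en">My collection</dc:title>
  <dc:title xml:lang="fr">Ma collection</dc:title>
  <dc:description xml:lang="en">My collection is composed of...</dc:description>
  <dc:description xml:lang="fr">Ma collection se compose de...</dc:description>
  <dc:language xsi:type="dcterms:RFC3066">es</dc:language>
  <dc:language xsi:type="dcterms:RFC3066">de</dc:language>
</my:cld>
"""


def group_by_value_language(xml_text):
    """Group child elements by the xml:lang of their values.

    Elements without an xml:lang attribute are grouped under None.
    """
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for child in root:
        lang = child.get(XML_LANG)
        local_name = child.tag.split("}", 1)[-1]  # drop the "{namespace}" part
        groups[lang].append((local_name, child.text))
    return dict(groups)


def resource_languages(xml_text):
    """Return the dc:language values, i.e. the languages of the resource."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(f"{{{DC_NS}}}language")]
```

Calling group_by_value_language(RECORD)["fr"] returns only the French title and description, while resource_languages(RECORD) returns the es and de values. The two answers come from different places in the record, which is the distinction I was trying to draw above.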
Does that sound about right please?
> Or perhaps this is something that should be handled in
> the application layer rather than in the metadata?
Pete