Okay... the subject is whether or not to use the LANG attribute when the metadata
content is a URL.

If I specify

<META NAME="DC.Description" LANG="en" CONTENT="A story about a little girl going to
visit her grandmother. The little girl encounters a big bad wolf, and ends up
chopping off his head.">

Then the META tag makes sense. It's saying "the description tag is written in
English, and it's the following..."

The following just does not make sense to me:

<META NAME="DC.Rights" LANG="en" SCHEME="URL"
CONTENT="http://www.here.com/copyright.html">

This says to me "The Rights statement is a URL that is written in English."  Which
prompts the question... is the URL really going to change depending on what language
you speak (or write)?  If you tried to translate the "words" in the URL, you'd end up
with an HTTP Error 404 - Object Not Found.

If the copyright statement is available in a number of languages, then I would
presume the versions would have different names. For example, an English-speaking
company might add the country or language code to the file name of each language
version of its copyright document. An alternative would be to have the file named
"copyright".html, replacing the English word copyright with the local language
equivalent:

<META NAME="DC.Rights" SCHEME="URL" CONTENT="http://www.here.com/copyright.html">
<META NAME="DC.Rights" SCHEME="URL" CONTENT="http://www.here.com/copyright-fr.html">
<META NAME="DC.Rights" SCHEME="URL" CONTENT="http://www.here.com/copyright-de.html">

In theory, each of those documents would have metadata specifying the DC.Language
value (at least). The LANG attribute in this instance is totally irrelevant, since
the URL is not stored in any specific language.

A smart search/retrieval engine would check out the three references and, recognising
that you speak French, return the French version of the copyright statement (which it
found by checking the DC.Language field of the metadata of the copyright statement).
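
As a rough illustration of that selection step, here is a minimal Python sketch. It
assumes the engine has already fetched each referenced document and read its
DC.Language value; the function name pick_rights_url, the fallback to English, and
the language codes paired with the three URLs are my own assumptions for the example.

```python
def pick_rights_url(candidates, preferred, fallback="en"):
    """Pick the rights URL whose target document matches the reader's language.

    candidates: list of (url, dc_language) pairs in document order, where
    dc_language is the DC.Language value read from the referenced document.
    """
    by_lang = {}
    for url, lang in candidates:
        # Keep the first URL seen for each language code.
        by_lang.setdefault(lang.lower(), url)

    for wanted in (preferred.lower(), fallback.lower()):
        if wanted in by_lang:
            return by_lang[wanted]

    # No match at all: fall back to the first reference, if any.
    return candidates[0][0] if candidates else None


# Hypothetical DC.Language values for the three documents referenced above.
rights = [
    ("http://www.here.com/copyright.html", "en"),
    ("http://www.here.com/copyright-fr.html", "fr"),
    ("http://www.here.com/copyright-de.html", "de"),
]
french_url = pick_rights_url(rights, "fr")
```

Note that nothing in this lookup needs a LANG attribute on the DC.Rights tags
themselves; the language comes entirely from the documents the URLs point to.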

Where I *would* use the LANG attribute is in this instance:

<META NAME="DC.Rights" LANG="en" CONTENT="This document copyright 1998 SomeCo Pty.
Ltd.">
<META NAME="DC.Rights" LANG="fr" CONTENT=(French version of the above)>
<META NAME="DC.Rights" LANG="de" CONTENT=(German version of the above)>

By avoiding the use of the LANG attribute where language is not an issue, the
heuristics for an automated metadata cataloguer are greatly simplified - "If there
is a language attribute, then the content is the message - otherwise the content is a
pointer to the message" (of course, if the metadata explicitly states that
SCHEME="URL", then there is no decision required).
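
The heuristic above can be sketched in a few lines of Python using the standard
library's html.parser module. This is only a sketch of the rule as I have stated it,
not a real cataloguer; the names MetaCollector, classify and classify_all are made up
for the example.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the attributes of every META tag as a dict."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "meta":
            self.metas.append(dict(attrs))


def classify(meta):
    """Return 'pointer' or 'message' for one META tag's attributes."""
    if (meta.get("scheme") or "").upper() == "URL":
        return "pointer"  # explicit SCHEME="URL": no decision required
    if meta.get("lang"):
        return "message"  # LANG present: the content is the message
    return "pointer"      # no LANG: the content points to the message


def classify_all(html):
    """Return a (name, classification) pair for each META tag in html."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return [(m.get("name"), classify(m)) for m in parser.metas]
```

Because the rule only treats SCHEME="URL" specially, a tag that carries both LANG and
some other SCHEME is still classified as a message.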

Even better, a *really* smart search/retrieval engine might look at the three
versions and, recognising that you speak only Finnish, translate the copyright
messages into Finnish for your personal consumption.

Regards,
Alex Satrapa

PS: I'm not making any claim that "SCHEME" and "LANG" are mutually exclusive. For
example:
<META NAME="DC.Creator" SCHEME="Personal - Last, First" LANG="en" CONTENT="Satrapa,
Alex">
<META NAME="DC.Contributor" SCHEME="Corporate - Name" LANG="en" CONTENT="tSA
Consulting Pty. Ltd.">
<META NAME="DC.Contributor" SCHEME="ACN" LANG="en" CONTENT="ACN 006 712 296">

That all makes sense, doesn't it?