1. General comments
I think this is an excellent start, which would be further improved by the
inclusion of qualifiers and character set issues. Neither of these is a
particularly complex issue, so I hope we'll be able to make substantial
progress in Canberra. If we don't include both at an early stage, we'll
increase the risk of legacy metadata, i.e. metadata that is legal now but
becomes illegal once these issues are tackled. An obvious example is the syntax
for qualifiers. In "Proposed Encodings for Dublin Core Metadata", Dave
Beckett proposes:
Element: Author
Value : (Scheme=email)[log in to unmask]
This means that a literal "(" at the start of an element value must be
treated in a special way, e.g. replaced with "%28", which is how illegal
characters are escaped in URLs.
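
As a rough illustration of that escaping rule, here is a minimal Python
sketch. It rests on assumptions that are mine and not part of Dave Beckett's
proposal: only a leading "(" is escaped, and "%" is also escaped (as "%25")
so that the original value can be recovered unambiguously. The function names
are hypothetical.

```python
def escape_value(value: str) -> str:
    """Escape an element value so a leading "(" is not read as a qualifier."""
    # Escape "%" first so an existing "%28" in the value stays distinguishable
    # from an escaped parenthesis.
    escaped = value.replace("%", "%25")
    # Only a "(" in first position collides with the qualifier syntax.
    if escaped.startswith("("):
        escaped = "%28" + escaped[1:]
    return escaped


def unescape_value(escaped: str) -> str:
    """Reverse escape_value."""
    # Undo the steps in the opposite order: parenthesis first, then "%".
    if escaped.startswith("%28"):
        escaped = "(" + escaped[3:]
    return escaped.replace("%25", "%")


if __name__ == "__main__":
    original = "(draft) notes"
    print(escape_value(original))
    print(unescape_value(escape_value(original)) == original)
```

A real syntax would have to state exactly which characters are reserved; the
point is only that the rule must be fixed before unescaped values are in
circulation.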
2. An example of why we need qualifiers
Here's an example of why we need qualifiers.
The draft says:
4.12. Language Label: LANGUAGE
Language(s) of the intellectual content of the resource. Where
practical, the content of this field should coincide with the
NISO Z39.53 three character codes for written languages.
Though this scheme may be widespread in the (US?) library community,
language labeling on the Internet doesn't use it, but rather ISO 639
together with ISO 3166. The application of these standards to the Internet
is specified by RFC 1766 and RFC 2070. A qualifier naming the scheme would
let a record state which of these schemes its language value follows.
3. Encoding metadata in HTML
A minor issue concerning the encoding of metadata in Web pages is the
treatment of the character QUOTATION MARK. The draft should say that this
character, if present in an element value, must be escaped using an HTML
entity name or a numeric character reference.
As both these mechanisms rely on the character "&", this character itself
must be escaped, if present in an element value.
Furthermore, though the closing ">" is outside the delimiting quotes, HTML
2.0 (RFC 1866) says:
NOTE - Some historical implementations consider any occurrence of the '>'
character to signal the end of a tag. For compatibility with such
implementations, when '>' appears in an attribute value, it should be
represented with a numeric character reference. For example, '<IMG
SRC="eq1.jpg" alt="a>b">' should be written '<IMG SRC="eq1.jpg"
alt="a&#62;b">' or '<IMG SRC="eq1.jpg" alt="a&gt;b">'.
This gives us:
Character   Entity name   Numeric character reference
"           &quot;        &#34;
&           &amp;         &#38;
>           &gt;          &#62;
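
To make these three rules concrete, here is a minimal Python sketch of an
escaping routine for element values placed in an HTML attribute. The function
names and the "DC.title" element name in the usage line are illustrative
assumptions of mine, not something the draft specifies. Note that "&" has to
be replaced first, since the other two replacements introduce "&" themselves.

```python
def escape_attribute_value(value: str) -> str:
    """Escape &, " and > for use inside a double-quoted HTML attribute."""
    # "&" must be handled first, otherwise the "&" introduced by the
    # other replacements would itself be escaped again.
    value = value.replace("&", "&amp;")
    value = value.replace('"', "&quot;")
    value = value.replace(">", "&gt;")
    return value


def meta_tag(name: str, content: str) -> str:
    """Build a META element; both attribute values are escaped."""
    return '<META NAME="{}" CONTENT="{}">'.format(
        escape_attribute_value(name), escape_attribute_value(content)
    )


if __name__ == "__main__":
    # Hypothetical element name and value, for illustration only.
    print(meta_tag("DC.title", 'Why a>b & "c" matter'))
```

The sketch uses entity names; the numeric character references from the table
would do equally well.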
Misha