Since there's been no Meta2 mail today so far, I thought I'd chuck out
this first stab at RFC 4 for comment; many thanks to Misha for the
helpful comments yesterday.
Particularly, can anyone think of a non-contentious subelement to use as
an example in the main definition? I thought Date.Created was pretty
safe, but apparently it isn't.
cheers,
T.
----
Dublin Core Workshop Series T. Gill
Internet-Draft P. Miller
draft-gill-dc-00.txt
20 February 1998
Expires in six months
Encoding Qualified Dublin Core Metadata in HTML
1. Status of this Document
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or rendered obsolete by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Distribution of this document is unlimited. Please send comments to
[log in to unmask], or to the discussion list [log in to unmask]
2. Introduction
The Dublin Metadata Core Element Set, or "Dublin Core", is a set of
15 information elements used to provide a simple means by which
networked electronic resources may be described for more effective
discovery and retrieval.
The fifteen elements of the Dublin Core are TITLE, CREATOR, SUBJECT,
DESCRIPTION, PUBLISHER, CONTRIBUTOR, DATE, TYPE, FORMAT, IDENTIFIER,
SOURCE, LANGUAGE, RELATION, COVERAGE and RIGHTS. The elements are
described more fully in RFC [DC-RFC1], and on the official Dublin
Core web site [1].
These 15 elements and their meanings have been developed and refined
by an international group of librarians, information and subject
specialists through a consensus-building process that has included 5
international workshops to date and discussion on the active Meta2
mailing list [2].
During the workshop series, a significant body of the participant
community agreed on the need for a simple, optional method for
additional semantic qualification of Dublin Core metadata. The method
agreed upon is the use of the Canberra Qualifiers:
* Subelement, which allows the refinement and clarification of an
element's content.
* Scheme, which allows an element's value to be identified as part
of an existing classification system, coding scheme, glossary or
thesaurus.
* Language, which specifies the language of a metadata element's
content.
This document describes two syntaxes for encoding Dublin Core
metadata qualified using the Canberra Qualifiers in HTML. The first
is compliant with the HTML DTD (Document Type Definition) from
version 4.0 [3] onwards, following a successful proposal from the
Dublin Core community to the World Wide Web Consortium in 1997 to
enhance the functionality of the <META> tag. The second is compliant
with the HTML DTD from version 2.0 onwards, and has been included for
completeness since some early Dublin Core implementations used it;
however, it is now seen by the Dublin Core community as sub-optimal.
The broad agreements on syntax and semantics that have emerged from
the workshop series will be expressed in a series of five
Informational RFCs, of which this document is the fourth. These RFCs
(currently Internet-Drafts) will comprise the following documents:
2.1 Dublin Core Metadata for Simple Resource Discovery
Editors: John Kunze and Carl Lagoze
An introduction to the Dublin Core and a description of the intended
semantics of the 15-element Dublin Core element set without
qualifiers.
2.2 Encoding Dublin Core Metadata in HTML
Editors: John Kunze and Carl Lagoze
A formal description of the convention for embedding unqualified
Dublin Core metadata in HTML.
2.3 Qualified Dublin Core Metadata for Simple Resource Discovery
Editors: Paul Miller and Tony Gill
The principles of element qualification and the semantics of Dublin
Core metadata when expressed with a recommended qualifier set known
as the Canberra Qualifiers.
2.4 Encoding Qualified Dublin Core Metadata in HTML
Editors: Tony Gill and Paul Miller
A formal description of the convention for embedding qualified Dublin
Core metadata in HTML.
2.5 Dublin Core on the Web: RDF Compliance and DC Extensions
Editors: to be determined
A formal description for encoding Dublin Core metadata with
qualifiers in RDF (Resource Description Framework) [4] compliant
metadata, and how to extend the core element set.
3. Encoding Unqualified Dublin Core Metadata in HTML - A Brief Overview
The syntax used for embedding unqualified Dublin Core metadata (i.e.
without the use of the Canberra Qualifiers) within HTML is described
formally in RFC [DC-RFC2], but an overview of this syntax has been
provided here for completeness. The syntax described below is
compliant with the HTML DTD from version 2.0 onwards.
To encode Dublin Core metadata in an HTML document, the <META> tag is
used between the <HEAD> and </HEAD> tags using the following syntax:
<META NAME = "DC.ElementName" CONTENT = "Value">
In the formulation above, 'ElementName' and 'Value' are placeholders
for one of the 15 element labels and its value respectively. For
example:
<META NAME = "DC.Creator" CONTENT = "William Shakespeare">
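As an illustration only, a consumer of this unqualified syntax might
collect the element/value pairs along the following lines. This is a
minimal sketch using Python's standard html.parser module; the names
DCMetaParser, extract_dc and SAMPLE are hypothetical and are not
defined by any Dublin Core document.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect (name, content) pairs from META tags whose NAME starts with 'DC.'."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        if name.upper().startswith("DC."):
            self.records.append((name, attr.get("content") or ""))


def extract_dc(html_text):
    """Return the Dublin Core (name, content) pairs found in html_text."""
    parser = DCMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.records


# Hypothetical input built from the example above.
SAMPLE = '<HEAD><META NAME = "DC.Creator" CONTENT = "William Shakespeare"></HEAD>'
print(extract_dc(SAMPLE))
```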
4. Using the Canberra Qualifiers in HTML
Although simplicity is one of the fundamental tenets of the Dublin
Core metadata initiative, a significant body of the stakeholder
community believes that the option to refine the semantics of the
element set through the addition of qualifiers is essential to allow
effective deployment of the Dublin Core for resource discovery.
For example, a value of 759.2 in the SUBJECT element is meaningless
unless the value is qualified as a reference to the Dewey Decimal
Classification system, allowing the meaning (British Painting) to be
correctly interpreted. Similarly, the elements DATE and COVERAGE
often have little meaning or value without qualification of some
form; for example, the date 06-11-97 could refer to the sixth day of
November or the eleventh day of June, and the century is clearly
ambiguous. The international nature of information on the web can
also have a critical impact on the usefulness of metadata for
resource discovery; the word 'chat', for example, has very different
meanings in French and English.
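The date ambiguity described above can be made concrete with a short
sketch. It assumes Python's standard datetime module; the function
name readings is hypothetical. Note that the century in the result is
supplied by the library's convention for two-digit years, not by the
data itself.

```python
from datetime import datetime


def readings(value):
    """Return the day-first and month-first readings of a dd-mm-yy style string."""
    day_first = datetime.strptime(value, "%d-%m-%y").date()
    month_first = datetime.strptime(value, "%m-%d-%y").date()
    return day_first, month_first


# The same string yields two different dates depending on the assumed format.
for reading in readings("06-11-97"):
    print(reading.isoformat())
```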
To address these shortcomings, three types of element qualifiers were
proposed at the 4th Dublin Core workshop in Canberra, Australia.
These 'Canberra Qualifiers' are:
* Subelement, which allows the refinement and clarification of an
element's content <urgently need a non-contentious example
here!>.
* Scheme, which allows an element's value to be identified as part
of an existing classification system, coding scheme, glossary or
thesaurus, such as the Dewey Decimal Classification or the Art &
Architecture Thesaurus.
* Language, which specifies the language of a metadata element's
content [5].
By optionally applying any or all of these qualifiers to the 15
elements of the Dublin Core, more detailed and semantically specific
information about a resource can be encoded, thus assisting precision
in the discovery and retrieval process for systems that support the
use of these qualifiers.
These qualifiers, their semantics, and guidelines for their
application, are discussed in more detail in RFC [DC-RFC3]; however,
the most important restriction on their use is worth restating:
"qualifiers may be used to refine meaning within an element, but
not to extend any element".
Adherence to this simple principle is important, because it means
that the core metadata values can be understood correctly, even if
the qualifiers are disregarded. This will enable the construction of
simple resource discovery systems that correctly parse qualified
Dublin Core metadata at the simple, unqualified semantic level.
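One way a simple system might disregard qualifiers is sketched below.
It assumes that the unqualified element is always the second
dot-separated component of the NAME attribute, as in the syntax used
throughout this document; the function name unqualified_element is
hypothetical.

```python
def unqualified_element(name):
    """Reduce a qualified name such as 'DC.Date.Created' to its element, 'Date'."""
    parts = name.split(".")
    if len(parts) < 2 or parts[0].upper() != "DC" or not parts[1]:
        raise ValueError("not a Dublin Core element name: %r" % name)
    # Anything after the element label is a subelement and is ignored here.
    return parts[1]


print(unqualified_element("DC.Date.Created"))
```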
The application of qualifiers that conform to this stipulation,
encoded with a consistent syntax, represents an effective compromise
that allows optional refinement of the 15 Dublin Core elements,
without sacrificing the simplicity and interoperability that are
guiding principles of the Dublin Core initiative.
Two alternative forms of the syntax for encoding qualified Dublin
Core metadata are presented here:
* An HTML 4.0 syntax. This syntax represents the approach preferred
by the Dublin Core community.
* An HTML 2.0 syntax.
N.B. Caution should be exercised if using non-HTML 4.0 aware tools to
create qualified Dublin Core metadata with the HTML Version 4.0
syntax; early experiences suggest that loss of metadata may occur
with some editors.
4.1 HTML 4.0 Syntax
The following syntax should be used for encoding qualified Dublin
Core metadata in HTML.
<META NAME="DC.ElementName.SubElement"
SCHEME="SchemeValue"
LANG="LangValue"
CONTENT="ContentValue">
The formulation above contains the following placeholders:
'ElementName' The name of one of the 15 Dublin Core elements.
'SubElement' The name of a subelement that refines, but does not
extend, the scope of the element to which it is
appended. A working group of the Dublin Core is in
the process of creating a canonical list of
commonly-requested subelements. Further details of
the method by which subelements may be created and
used are provided in RFC [DC-RFC3].
'SchemeValue' An identifier for a formal scheme, such as a
classification scheme or vocabulary authority.
'LangValue' An identifier for the language in which the metadata
is encoded. This should be recorded according to the
instructions given in the currently prevailing HTML
specification.
'ContentValue' The content of the qualified Dublin Core metadata
element.
All of the qualifiers and their associated values are optional, and
should be omitted when their use is either inappropriate or creates
ambiguity.
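The following sketch shows how a tool might assemble a tag in this
form, leaving out any qualifier that is not supplied. It is an
illustration under stated assumptions, not a reference
implementation: the function name dc_meta and its parameters are
hypothetical, and attribute values are escaped with Python's standard
html.escape.

```python
from html import escape


def dc_meta(element, content, subelement=None, scheme=None, lang=None):
    """Build an HTML 4.0 style META tag for one qualified Dublin Core value."""
    name = "DC." + element
    if subelement:
        name += "." + subelement
    attrs = [("NAME", name)]
    # Qualifiers are optional, so each is emitted only when a value is given.
    if scheme:
        attrs.append(("SCHEME", scheme))
    if lang:
        attrs.append(("LANG", lang))
    attrs.append(("CONTENT", content))
    rendered = " ".join('%s="%s"' % (key, escape(value, quote=True))
                        for key, value in attrs)
    return "<META " + rendered + ">"


print(dc_meta("Date", "1997-03-07", subelement="Created", scheme="ISO8601"))
```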
4.2 Example 1
This example shows the refinement of the 'DATE' element with a
'CREATED' subelement [6], and the use of the SCHEME qualifier to
identify that the content value of the metadata has been encoded
according to an ISO standard for formatting dates. The use of a
profile of ISO 8601, as described by the W3C Note 'Date and Time
Formats' [7], is highly recommended by the Dublin Core community for
recording temporal information unambiguously.
<META NAME = "DC.Date.Created"
SCHEME = "ISO8601"
CONTENT = "1997-03-07">
4.3 Example 2
This example utilizes a multi-part subelement [6] to enable the
COVERAGE element to describe the minimum x value of an area that is
described by the Ordnance Survey of Great Britain coordinate system.
The SCHEME qualifier is essential in order to identify the coordinate
system from which the metadata value is taken.
<META NAME = "DC.Coverage.X.Min"
SCHEME = "OSGB"
CONTENT = "466342">
4.4 Example 3
This example does not apply any subelements to the
SUBJECT element, but uses the SCHEME qualifier to identify the value
of the metadata as a term from the Getty Information Institute's Art
& Architecture Thesaurus, and uses the Language qualifier to identify
the language of the metadata as American English. The value of the
Language qualifier has been encoded according to RFC 1766, using a
two-letter language code followed by a two-letter country code (en-US
identifies American English).
<META NAME = "DC.Subject"
SCHEME = "AAT"
LANG = "en-US"
CONTENT = "emblems (allegorical pictures)">
4.5 HTML 2.0 syntax
The following syntax can be used for encoding qualified Dublin Core
metadata in HTML that is compliant with version 2.0 and upwards of
the HTML DTD. Although the Subelement is handled in the same way as
before, the values for Scheme and Language must be embedded within
the metadata element's CONTENT, because the use of the SCHEME and
LANG attributes was not supported in HTML prior to version 4.0.
<META NAME = "DC.ElementName.SubElement" CONTENT = "
(SCHEME=SchemeValue) (LANG=LangValue) ContentValue">
The formulation above contains the same placeholders as the previous
syntax formula. For example:
<META NAME = "DC.Subject" CONTENT = "(SCHEME = AAT)(LANG = en-US)
emblems (allegorical pictures)">
Or:
<META NAME = "DC.Coverage.X.Min" CONTENT = "(SCHEME = OSGB) 466342">
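A parser for this embedded form has to separate the leading
qualifiers from the value itself. The sketch below is one possible
approach, assuming that qualifiers appear only at the start of
CONTENT, that only SCHEME and LANG are recognised, and that qualifier
values contain no parentheses; the names _QUALIFIER and split_content
are hypothetical.

```python
import re

# One leading "(SCHEME = value)" or "(LANG = value)" group, spaces optional.
_QUALIFIER = re.compile(r"\s*\(\s*(SCHEME|LANG)\s*=\s*([^()]*?)\s*\)",
                        re.IGNORECASE)


def split_content(content):
    """Split an HTML 2.0 style CONTENT string into (qualifiers, value)."""
    qualifiers = {}
    pos = 0
    while True:
        match = _QUALIFIER.match(content, pos)
        if not match:
            break
        qualifiers[match.group(1).upper()] = match.group(2)
        pos = match.end()
    # Whatever follows the leading qualifiers is the metadata value.
    return qualifiers, content[pos:].strip()


print(split_content("(SCHEME = OSGB) 466342"))
```

Parentheses that occur later in the value, as in "emblems
(allegorical pictures)", are left untouched because matching stops at
the first text that is not a recognised qualifier.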
5. Creating references to scheme information
The application of formal schemes and standards for recording
metadata can increase search precision, and is strongly encouraged by
the Dublin Core community. In order for the use of such schemes to be
of most value, however, it is helpful to be able to provide links to
sources of further information about the scheme. [8]
This is also true of the Dublin Core metadata scheme itself; the
recommended method for identifying the metadata scheme used as Dublin
Core is to employ the 'profile' attribute of the <HEAD> element,
introduced in version 4.0 of the HTML specification, to point to the
Dublin Core reference site on the WWW:
<HEAD profile="http://purl.org/metadata/dublin_core">
The <LINK> tag can also be used to provide optional links to formal
scheme descriptions [9], although it is not possible to explicitly
identify the individual metadata element value which utilizes any
linked scheme(s). The <LINK> tag follows the syntax:
<LINK REL="SCHEMA.Scheme" HREF="SchemeURI">
where 'Scheme' is a placeholder for a short scheme identifier, and
'SchemeURI' is the address for further information about the scheme.
For example,
<LINK REL="SCHEMA.imt"
HREF="http://sunsite.auc.dk/RFC/rfc/rfc1521.html">
provides a <LINK> to RFC 1521 and the definitions of Internet Media
Types, which might be used in conjunction with the FORMAT element
when using Internet Media Types for content values [10].
This mechanism also enables links to be made to the individual Dublin
Core element definitions, in situations where it is not possible to
utilize the profile attribute of the <HEAD> element - for example, if
compliance with an HTML specification prior to version 4.0 is
required.
For example:
<LINK REL="SCHEMA.dc"
HREF="http://purl.org/metadata/dublin_core_elements#creator">
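As a rough illustration, the scheme links described in this section
could be gathered into a lookup table as sketched below, again using
Python's standard html.parser module. The names SchemaLinkParser,
scheme_links and SAMPLE are hypothetical, and the sketch assumes that
the scheme identifier is whatever follows the "SCHEMA." prefix of the
REL value.

```python
from html.parser import HTMLParser

PREFIX = "SCHEMA."


class SchemaLinkParser(HTMLParser):
    """Map short scheme identifiers to the URIs given in LINK tags."""

    def __init__(self):
        super().__init__()
        self.schemes = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = attr.get("rel") or ""
        href = attr.get("href")
        if rel.upper().startswith(PREFIX) and href:
            self.schemes[rel[len(PREFIX):]] = href


def scheme_links(html_text):
    """Return a dict of scheme identifier -> scheme URI found in html_text."""
    parser = SchemaLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.schemes


# Hypothetical input built from the example in this section.
SAMPLE = ('<LINK REL="SCHEMA.imt"\n'
          ' HREF="http://sunsite.auc.dk/RFC/rfc/rfc1521.html">')
print(scheme_links(SAMPLE))
```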
6. Security Considerations
The Dublin Core element set poses no risk to computers and networks.
It poses minimal risk to searchers who obtain incorrect or private
information due to careless mapping from rich data descriptions to
the simple Dublin Core scheme. No other security concerns are likely
to be raised by the element description consensus documented here.
7. References
[1] Dublin Core,
http://purl.org/metadata/dublin_core
[2] Meta2 mailing list archive,
http://weeble.lut.ac.uk/lists/meta2/
[3] HTML 4.0 Specification,
http://www.w3.org/TR/REC-html40/
[4] Resource Description Framework (RDF) Model and Syntax,
http://www.w3.org/TR/WD-rdf-syntax
[5] The value of the Language qualifier should be specified according
to the guidelines given in the prevailing version of the HTML
specification. The syntax described here does not support the
identification of multiple languages within the content of a single
metadata element value; as this facility has been identified as a
requirement, future metadata architectures should attempt to provide
this capability.
[6] As the Dublin Core community has yet to reach consensus regarding
the most commonly-required subelements, this Subelement should be
regarded as a fictitious example used solely for demonstrating the
syntax.
[7] Date and Time Formats,
http://www.w3.org/TR/NOTE-datetime
[8] This information may be provided in either a
machine-understandable form, or may be descriptive information
designed to be read by people.
[9] A problem with the current implementation of <LINK> is that it
may only be used to reference online resources. Many standards, such
as those from national and international standards agencies, are not
available online. An extension to the functionality of the <LINK>
element, such that the existing HREF to an online resource might be
complemented by a REFERENCE to a paper document, would be beneficial
to some communities.
[10] Internet Media Types,
ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/media-types
8. Authors' Addresses
Tony Gill
Surrey Institute of Art & Design
Falkner Road
Farnham, Surrey, GU9 7DS, UK
Email: [log in to unmask]
Voice: +44 1252 722441
Fax: +44 1252 712925
Paul Miller
Archaeology Data Service
King's Manor
York, YO1 2EP, UK
Email: [log in to unmask]
Voice: +44 1904 43 3954
Fax: +44 1904 43 3939
--
Cheers,
T.
-- Tony Gill ------------------------ ADAM & VADS Programme Leader --
Surrey Institute of Art & Design * Farnham * Surrey * GU9 7DS * UK
Tel: +44 (0)1252 892721 * Fax: +44 (0)1252 892725
-- [log in to unmask] -- http://adam.ac.uk/ -- http://vads.ahds.ac.uk/ --