Below is the latest revision, based largely on Simon Cox's suggestions
(which I just replied to separately). Please respond soon with any
final review comments. We would like to declare this thing done
and ready for submission to the IESG sometime next week.
-John
=====================
Dublin Core Workshop Series S. Weibel
Internet-Draft J. Kunze
draft-kunze-dc-02.txt C. Lagoze
22 January 1998
Expires in six months
Dublin Core Metadata for Simple Resource Discovery
1. Status of this Document
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''
To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Distribution of this document is unlimited. Please send comments
to [log in to unmask], or to the discussion list [log in to unmask]
2. Introduction
Finding relevant information on the World Wide Web has become
increasingly problematic in proportion to the explosive growth of
networked resources. Current Web indexing evolved rapidly to fill the
demand for resource discovery tools, but that indexing, while useful,
is a poor substitute for richer varieties of resource description.
An invitational workshop held in March of 1995 brought together
librarians, digital library researchers, and text-markup specialists
to address the problem of resource discovery for networked resources.
This activity evolved into a series of related workshops and ancillary
activities that have become known collectively as the Dublin Core Metadata
Workshop Series.
The goals that motivate the Dublin Core effort are:
- Simplicity of creation and maintenance
- Commonly understood semantics
- International scope and applicability
- Extensibility
- Interoperability among collections and indexing systems
These requirements work at cross purposes to some degree, but all are
desirable goals. Much of the effort of the Workshop Series has been
directed at minimizing the tensions among these goals.
One of the primary deliverables of this effort is a set of elements
that are judged by the collective participants of these workshops
to be the core elements for cross-disciplinary resource discovery.
The term ``Dublin Core'' applies to this core of descriptive elements.
Early experience with Dublin Core deployment has made clear the need
to support additional qualification of elements for some applications.
Thus, Dublin Core elements may be expressed in simple unqualified ways
that minimal discovery and retrieval tools can use, or they may be
expressed with additional structure to support semantics-sharpening
qualifiers that minimal tools can safely ignore but that more complex
tools can employ to increase discovery precision.
The broad agreements about syntax and semantics that have emerged from
the workshop series will be expressed in a series of five Informational
RFCs, of which this document is the first. These RFCs (currently they
are Internet-Drafts) will comprise the following documents.
2.1. Dublin Core Metadata for Simple Resource Discovery
An introduction to the Dublin Core and a description of the semantics
of the 15-element Dublin Core element set without qualifiers.
This is the present document.
2.2. Encoding Dublin Core Metadata in HTML
A formal description of the convention for embedding unqualified Dublin
Core metadata in an HTML file.
2.3. Qualified Dublin Core Metadata for Simple Resource Discovery
The principles of element qualification and the semantics of Dublin Core
metadata when expressed with a recommended qualifier set known as the
Canberra Qualifiers.
2.4. Encoding Qualified Dublin Core Metadata in HTML
A formal description of the convention for embedding qualified Dublin
Core metadata in an HTML file.
2.5. Dublin Core on the Web: RDF Compliance and DC Extensions
A formal description of how to encode Dublin Core metadata with
qualifiers in RDF (Resource Description Framework) compliant metadata,
and of how to extend the core element set.
3. Description of Dublin Core Elements
The following is the reference definition of the Dublin Core Metadata
Element Set. The evolving reference description, including any defined
qualifiers, resides at [1]:
http://purl.org/metadata/dublin_core_elements
Note that elements have a descriptive name intended to convey a common
semantic understanding of the element. To promote global interoperability,
a number of the element descriptions suggest a controlled vocabulary for
the respective element values. It is assumed that other controlled
vocabularies will be developed for interoperability within certain local
domains. Further note that each element is optional, repeatable, and
may appear in any order with respect to other elements.
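As a rough illustration of the optional, repeatable, and unordered
properties just described, a description can be modeled as a plain list
of (label, value) pairs. The Python sketch below is not part of this
specification: the names DC_LABELS, make_record, and values_of are
hypothetical, and the sample values are invented for illustration.

```python
# The 15 element labels defined in section 3 of this document.
DC_LABELS = frozenset([
    "Contributor", "Coverage", "Creator", "Date", "Description",
    "Format", "Identifier", "Language", "Publisher", "Relation",
    "Rights", "Source", "Subject", "Title", "Type",
])


def make_record(pairs):
    """Return a list of (label, value) pairs after checking each label.

    No element is required, any element may repeat, and order carries
    no meaning, so a plain list of pairs is sufficient (an assumption
    of this sketch, not a mandated data model).
    """
    record = []
    for label, value in pairs:
        if label not in DC_LABELS:
            raise ValueError("unknown Dublin Core label: %r" % (label,))
        record.append((label, value))
    return record


def values_of(record, label):
    """Return every value recorded for one label (possibly none)."""
    return [value for (name, value) in record if name == label]


# Invented sample data: Creator repeats, and most elements are absent.
record = make_record([
    ("Title", "An Example Resource"),
    ("Creator", "First Author"),
    ("Creator", "Second Author"),
])
```

Here values_of(record, "Creator") yields both creators, while
values_of(record, "Date") yields an empty list, since no element is
mandatory.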
In the element descriptions below, a formal single-word label is specified
to make the syntactic specification of elements simpler for encoding schemes.
Although some environments, such as HTML, are not case-sensitive, it is
recommended best practice always to adhere to the case conventions in the
element names given below to avoid conflicts in the event that the metadata
is subsequently extracted or converted to a case-sensitive environment, such
as RDF/XML (Extensible Markup Language). A metadata element's meaning is
unaffected by whether or not the element is embedded in the resource that
it describes.
The metadata elements fall into three groups which roughly indicate the
class or scope of information stored in them: (1) elements related mainly
to the Content of a resource, (2) elements related mainly to a resource
when viewed as Intellectual Property, and (3) elements related mainly
to an Instantiation of a resource.
       Content          Intellectual Property      Instantiation
   ----------------     ---------------------     ---------------
   3.2  Coverage        3.1  Contributor          3.4  Date
   3.5  Description     3.3  Creator              3.6  Format
   3.8  Language        3.9  Publisher            3.7  Identifier
   3.10 Relation        3.11 Rights               3.15 Type
   3.12 Source
   3.13 Subject
   3.14 Title
3.1. Other Contributor Label: "Contributor"
A person or organization not specified in a Creator element who
has made significant intellectual contributions to the resource
but whose contribution is secondary to any person or organization
specified in a Creator element (for example, editor, transcriber,
and illustrator).
3.2. Coverage Label: "Coverage"
The spatial and/or temporal characteristics of the intellectual content
of the resource. Any date in this element is concerned with what the
resource is about rather than when it was created or made available, the
latter belonging in the Date element. Formal specification of Coverage
is currently under development.
3.3. Author or Creator Label: "Creator"
The person or organization primarily responsible for creating
the intellectual content of the resource. Examples include authors
in the case of written documents and artists, photographers,
or illustrators in the case of visual resources.
3.4. Date Label: "Date"
A date associated with the creation or availability of the resource.
Such a date is not to be confused with one belonging in the Coverage
element, which would be associated with the resource only insofar as
the intellectual content is somehow about that date. Recommended
best practice is defined in a profile of ISO 8601 [2] that includes
(among others) dates of the forms YYYY and YYYY-MM-DD. In this scheme,
for example, the date 1994-11-05 corresponds to November 5, 1994.
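A minimal Python sketch of checking a Date value against the two forms
named above follows. It assumes that only YYYY and YYYY-MM-DD need to be
recognized, which is narrower than the profile itself (the profile
includes other forms); the function name is_simple_dc_date is
hypothetical.

```python
import datetime
import re

# Only the two forms named in the text; the full profile allows more.
_YEAR = re.compile(r"[0-9]{4}")
_FULL = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_simple_dc_date(text):
    """Return True if text is YYYY, or YYYY-MM-DD naming a real day."""
    if _YEAR.fullmatch(text):
        return True
    match = _FULL.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(group) for group in match.groups())
    try:
        # datetime.date rejects impossible dates such as 1994-02-30.
        datetime.date(year, month, day)
    except ValueError:
        return False
    return True
```

Under these assumptions the value 1994-11-05 is accepted, while a
free-text value such as "November 5, 1994" is rejected.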
3.5. Description Label: "Description"
A textual description of the content of the resource, including
abstracts in the case of document-like objects or content
descriptions in the case of visual resources.
3.6. Format Label: "Format"
The data format of the resource, used to identify the software
and possibly hardware that might be needed to display or operate
the resource. For the sake of interoperability, Format should be
selected from an enumerated list that is currently under development
in the workshop series.
3.7. Resource Identifier Label: "Identifier"
A string or number used to uniquely identify the resource. Examples
for networked resources include URLs and URNs (when implemented).
Other globally-unique identifiers, such as International Standard
Book Numbers (ISBN) or other formal names, are also candidates
for this element.
3.8. Language Label: "Language"
The language of the intellectual content of the resource.
Where practical, the content of this field should coincide with
RFC 1766 [3]; examples include en, de, es, fi, fr, ja, th, and zh.
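The tag syntax of RFC 1766 (a primary tag of one to eight letters,
optionally followed by hyphen-separated subtags of one to eight letters
each) can be checked with a short Python sketch. This is a syntax check
only: it does not confirm that a tag is a registered or meaningful
language code, and the function name looks_like_rfc1766_tag is
hypothetical.

```python
import re

# Primary tag and subtags: 1 to 8 ASCII letters each, joined by hyphens.
_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def looks_like_rfc1766_tag(text):
    """Return True if text has the shape of an RFC 1766 language tag."""
    return _TAG.fullmatch(text) is not None
```

All of the example values listed above (en, de, es, fi, fr, ja, th, zh)
pass this check, as does a tag with a subtag such as en-US.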
3.9. Publisher Label: "Publisher"
The entity responsible for making the resource available in its
present form, such as a publishing house, a university department,
or a corporate entity.
3.10. Relation Label: "Relation"
An identifier of a second resource and its relationship to the present
resource. This element permits links between related resources (e.g.,
ancestors) and resource descriptions (see Source) to be indicated.
Examples include a translation of a work, a chapter of a book, or a
mechanical transformation of a dataset into an image. For the sake of
interoperability, relationships should be selected from an enumerated
list that is currently under development in the workshop series.
3.11. Rights Management Label: "Rights"
A rights management statement, an identifier that links to a
rights management statement, or an identifier that links to a service
providing information about rights management for the resource.
3.12. Source Label: "Source"
Information about a second resource from which the present resource
is derived. While it is generally recommended that elements contain
information about the present resource only, this element may contain
a date, format, identifier, or other metadata for the second resource
when it is considered important for discovery of the present resource;
recommended best practice is to use the Relation element instead.
This element is not applicable if the present resource is in its
original form.
3.13. Subject and Keywords Label: "Subject"
The topic of the resource. Typically, subject will be expressed
as keywords or phrases that describe the subject or content of the
resource. The use of controlled vocabularies and formal
classification schemes is encouraged.
3.14. Title Label: "Title"
The name given to the resource, usually by the Creator or Publisher.
3.15. Resource Type Label: "Type"
The category of the resource, such as home page, novel, poem,
working paper, technical report, essay, dictionary. For the sake
of interoperability, Type should be selected from an enumerated
list that is currently under development in the workshop series.
4. Security Considerations
The Dublin Core element set poses no risk to computers and networks.
It poses minimal risk to searchers who obtain incorrect or private
information due to careless mapping from rich data descriptions to
the simple Dublin Core scheme. No other security concerns are likely
to be raised by the element description consensus documented here.
5. References
[1] Dublin Core Metadata Element Set: Reference Description,
http://purl.org/metadata/dublin_core_elements
[2] ISO 8601 Profile for the Dublin Core,
http://purl.org/metadata/dublin_core_date_formats
[3] RFC 1766, Tags for the Identification of Languages,
http://ds.internic.net/rfc/rfc1766.txt
6. Authors' Addresses
Stuart L. Weibel
OCLC Online Computer Library Center, Inc.
Office of Research
6565 Frantz Rd.
Dublin, Ohio, 43017, USA
Email: [log in to unmask]
Voice: +1 614-764-6081
Fax: +1 614-764-2344
John A. Kunze
Center for Knowledge Management
University of California, San Francisco
530 Parnassus Ave, Box 0840
San Francisco, CA 94143-0840, USA
Email: [log in to unmask]
Voice: +1 415-502-6660
Fax: +1 415-476-4653
Carl Lagoze
Digital Library Research Group
Department of Computer Science
Cornell University
Ithaca, NY 14853, USA
Email: [log in to unmask]
Voice: +1-607-255-6046
Fax: +1-607-255-4428