There is work in progress to define an XML representation for the UK
e-Government Metadata Standard (e-GMS). The specification will probably be
out for consultation on the UK GovTalk website c. mid-August
(http://www.govtalk.gov.uk); in the meantime, it may be useful for you to
have this extract (below) from the current working draft (0.3), which
describes the design criteria and rationale.
Regards,
Ann W.
Ann M Wrightson MA MBCS
Prif Ymgynghorydd / Principal Consultant
alphaXML Cyf/Ltd
http://www.alphaxml.com
Gwasaneuthau XML: e-Lywodraeth, e-Fasnach, e-Gyhoeddi
XML services to Government and Industry
Representing e-GMS metadata in XML
DRAFT
Extract for DC & DSDL comment 18 Jul. 02
Document version 0.3
1.2 Background
The e-Government Metadata Standard is technology-independent. Amongst other
representations, e-GMS metadata will certainly occur in XML, for example,
in XML messages containing metadata, and in XML documents with embedded
metadata. e-GMS metadata in XML is likely to occur in a number of different
contexts, including:
* embedded within XML schema documents (prepared according to W3C XML
Schema Recommendation 2001)
* embedded within XML documents fulfilling specific functions, eg public
records, and reports submitted for specific regulatory purposes
* within information exchanged using an XML message, as e-GMS metadata about
something outside the message
* within a dedicated metadata repository
* as a block of descriptive metadata within a wider metadata framework such
as the Metadata Encoding and Transmission Standard (METS)
* as supplementary metadata attached to an existing XML document, eg
metadata created when a record is selected for long-term preservation; or
metadata pertaining to the role of a pre-existing document within a set of
documents collected for a Public Inquiry.
1.3 Key design issues for XML representation of e-GMS metadata
This section discusses key design issues, and lists the design criteria for
the XML representation of e-GMS metadata arising out of the issues.
In the lists of design criteria, "e-GMS-XML" is used as a short form of "an
XML representation of general-purpose e-GMS metadata"; and "W3C
Schema-validation" for "validation according to W3C XML Schema
Recommendation 2001".
1.3.1 Long life of metadata
e-GMS metadata can be expected to be long-lived, and to contribute to the
management, discovery and utilization of electronic resources over the long
life of those resources (eg >100 years for an e-archive of electronic public
records). XML is a W3C Recommendation as well as a widely adopted industry
standard, and a successor to a very similar ISO standard (SGML, ISO 8879)
whose origins go back some 25 years - and so is very likely to be
long-lived. W3C XML Schema, although a good
choice at present for schema definition within e-GIF, is less likely to be
long-lived, since there are competing schema languages for XML (which may
in future gain wider industry acceptance). In addition, an ISO XML schema
standard is under development, which is intended to encompass and harmonize
current approaches into a long-lived stable standard.
Bearing all this in mind, it is advisable for the XML representation of
general-purpose e-GMS metadata to be independent of specific features of
W3C schema-validation, whilst also being compatible with the immediate
e-GIF requirement to validate XML by this means.
XML is likely to be long-lived. However, some public sector documents have
a very long projected lifetime, and it is unlikely that XML will remain the
standard of choice for interoperability over all that time. The nature and
wide adoption of XML makes it unlikely that document content in XML will
become unusable, since XML viewing applications are likely to remain
available in the long term. However, the principal utility of metadata is
in its daily use to support integrated access to current and past
information resources, so it is quite likely that metadata in XML will
eventually become functionally obsolete. Because of this, e-GMS metadata in
XML should be easy to convert to a successor data format.
Design criteria:
* e-GMS-XML does not depend on specific features of W3C Schema-validation,
but rather uses XML structures which are likely to be straightforward to
validate using any future XML schema language
* e-GMS-XML is compatible with the immediate e-GIF requirement to validate
XML documents and messages using W3C Schema-validation
* e-GMS-XML is likely to be easy to translate into a future successor
format to XML
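
As a rough illustration of the third criterion, the sketch below assumes
metadata held in plain elements with text content and simple attributes. The
element names, the "refinement" attribute and the sample values are
assumptions made for this example only, not part of the draft. Structures
this simple can be carried into another format (here JSON) with a few lines
of generic code.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample: element and attribute names are assumptions.
SAMPLE = """<metadata>
  <title>Annual Report</title>
  <creator>Example Department</creator>
  <date refinement="issued">2002-07-18</date>
</metadata>"""


def to_records(xml_text):
    """Flatten each child element into a format-neutral record."""
    root = ET.fromstring(xml_text)
    records = []
    for child in root:
        records.append({
            "element": child.tag,
            "attributes": dict(child.attrib),
            "value": (child.text or "").strip(),
        })
    return records


records = to_records(SAMPLE)
print(json.dumps(records, indent=2))
```

The conversion needs no knowledge of any schema language, which is the
property the criteria above aim for.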
1.3.2 Compatibility with Dublin Core
The e-GMS is based on Dublin Core. A standardized XML representation of
Dublin Core metadata is currently under development in DCMI. The design
criteria and principal scenarios of use for metadata differ between DCMI and
UK Government; this is already evident in the e-GMS itself, where some
aspects depart from DCMI principles. Because of
this, simple adoption of the Dublin Core XML representation for e-GMS is
unlikely to be appropriate. However, it is highly desirable that
interoperation between e-GMS metadata and generic Dublin Core metadata
should be easy to achieve - if that were not so, then the main intended
benefit of basing e-GMS on Dublin Core would be lost.
The concept of "dumb-down" use of metadata is important for
interoperability between metadata-aware applications with different
capabilities. The key point is that when any metadata processor looks at a
set of metadata, it should be able to identify and use all the metadata
elements which it can understand. In particular, refinements which it does
not understand can be ignored, and the value of a refined element used as if
it were the value of the unrefined element.
In general, "dumb-down" is a forgetful yet faithful metadata translation,
preserving faithfully from a more expressive metadata form all and only what
a less expressive metadata form can express. In the context of e-GMS,
"dumb-down" metadata processing is likely to have two forms: processing
metadata devised according to an e-GMS local metadata standard as if it
were generic e-GMS metadata; and processing e-GMS metadata of any kind as
if it were simple Dublin Core.
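
The "dumb-down" behaviour described above can be sketched as follows. This
is a minimal illustration, assuming metadata held as (element, refinement,
value) triples; the element and refinement names are illustrative only and
are not taken from any e-GMS or DCMI XML schema.

```python
from typing import List, Optional, Set, Tuple

# (element, refinement or None, value); all names are illustrative assumptions.
Statement = Tuple[str, Optional[str], str]

# Assumed set of elements understood by a simple Dublin Core processor.
SIMPLE_DC: Set[str] = {"title", "creator", "subject", "date", "identifier"}


def dumb_down(statements: List[Statement],
              understood: Set[str]) -> List[Tuple[str, str]]:
    """Keep all and only what the less expressive form can express."""
    result = []
    for element, _refinement, value in statements:
        if element in understood:
            # The refinement is ignored; its value is used as if it were
            # the value of the unrefined element.
            result.append((element, value))
    return result


record: List[Statement] = [
    ("title", None, "Annual Report"),
    ("date", "issued", "2002-07-18"),
    ("subject", "category", "Environment"),
    ("disposal", "review", "2012-07-18"),  # not understood by SIMPLE_DC
]

print(dumb_down(record, SIMPLE_DC))
```

The same function covers both forms of dumb-down mentioned above: only the
set of understood elements changes.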
Design criteria:
* e-GMS-XML can be mapped to the Dublin Core standardized XML
representation in a straightforward manner, for those metadata elements
common to e-GMS and Dublin Core. This provides a proper "dumb-down"
metadata mapping of e-GMS to Dublin Core.
* e-GMS-XML supports "dumb-down" processing of metadata conforming to an
e-GMS local metadata standard as if it were generic e-GMS metadata, in a
uniform and straightforward manner.
1.3.3 Interdependency and more complex constraints on metadata elements
e-GMS metadata has constraints on the optionality and interdependency of
its elements, and some of these constraints are not suitable for direct
validation using W3C Schema-validation. The ISO schema standard under
development is intended to support more of this kind of functionality, but
it is not yet clear whether this will gain widespread industry support.
There are also a number of industry standards and initiatives providing
capabilities in this area. Just as for the ISO standard, the nature and
depth of industry support for these approaches in the medium term is
uncertain.
Local metadata standards based on e-GMS are likely to introduce more of
these kinds of constraints, since metadata will be used to represent data
pertaining to business rules. XML validation is principally designed to
validate the structure of an XML document, and the data type of XML element
content. However, these capabilities are often used to enforce business
rules, and it is widely seen as a virtue that XML validation should extend
as far as possible in this direction. This situation makes it difficult to
be precise about a suitable boundary between XML validation and
supplementary validation for metadata.
Design criteria:
* where e-GMS-XML requires validation over and above validation of the
structure and data type of the XML, this additional validation is simple,
and specified in a technology-independent manner
* where these more complex constraints are supported by widely used XML
technologies, then guidelines and best practice on using these should be
provided
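
One possible reading of "simple, and specified in a technology-independent
manner" is to state the supplementary constraints as plain data and check
them with generic code after structural validation. The sketch below takes
that approach; the rules shown (which elements are mandatory, and which
element requires which) are hypothetical examples, not constraints taken
from the e-GMS.

```python
from typing import Dict, List

# Hypothetical rules for illustration only; not taken from the e-GMS.
MANDATORY: List[str] = ["title", "creator", "date"]
REQUIRES: Dict[str, List[str]] = {"disposal": ["date"]}


def validate(record: Dict[str, List[str]]) -> List[str]:
    """Return a list of constraint violations (empty if none)."""
    errors = []
    for name in MANDATORY:
        if not record.get(name):
            errors.append(f"missing mandatory element: {name}")
    for name, needed in REQUIRES.items():
        if record.get(name):
            for other in needed:
                if not record.get(other):
                    errors.append(f"{name} requires {other}")
    return errors


print(validate({"title": ["Annual Report"], "disposal": ["review"]}))
```

Because the rules are data rather than schema constructs, they could be
re-expressed in whichever constraint technology gains acceptance.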
1.3.4 Interoperability between XML metadata technologies
XML metadata is an area where there are a number of standards, and these
standards tend to be complementary rather than competing (though they may
be competitors in the context of a specific application). The picture is
made more complex by the fact that these standards come from different
domains only now converging through the ubiquity of Internet technology -
for example, there are well-regarded standards with origins in
librarianship and information science (Dublin Core), artificial
intelligence (DAML+OIL), and electronic publishing (ISO 13250 Topic Maps),
together with efforts to integrate the metadata domain in its own right
(ISO 11179, METS), as well as the ongoing work in W3C.
Although it is desirable to have a uniform XML representation of e-GMS
metadata, it is also important to enable Government organizations to choose
freely between technology solutions based on different industry standards.
This is particularly important since some Government organizations have
close ties to specific industry sectors. An important first step has been
taken by making the e-GMS standard itself technology independent.
At one extreme, fine-tuned XML representations of e-GMS metadata could be
devised for each specific context, using a range of XML metadata
technologies. However, this would lead to a large number of different
"standard" representations, and discourage easy interoperability. Another
approach would be to define a rigid "one size fits all" XML representation.
Neither of these is likely to meet the practical requirements of Government
organizations. The design criteria below are intended to offer a reasonable
middle way.
Design criteria:
* e-GMS-XML provides datatype definitions for e-GMS metadata element
values. These will be a common resource for all e-GMS XML representations.
* e-GMS-XML provides a representation designed for use in an e-GIF XML
message containing metadata about something outside the message. This is
the most general form of e-GMS metadata in XML, designed to accommodate any
(technology independent) e-GMS local metadata standard, and thus providing
a simple basis for interoperability between any e-GMS compliant systems.
* e-GMS-XML provides a representation designed to sit within the context of
an XML document. This could be within the XML data for a publication (eg a
report), or within another XML context such as a METS descriptive metadata
section.
* e-GMS-XML provides guidelines and examples for using e-GMS with selected
XML metadata technologies. The aim of these guidelines is to support, for
example, easy interoperability between e-GMS compliant systems that use
RDF. These guidelines are expected to evolve over time, as specific XML
metadata technologies gain and lose acceptance in the marketplace.
* e-GMS-XML provides guidelines for designing XML representations of e-GMS
local metadata standards (it is envisaged that the e-GMS XML schema local
metadata standard will be updated to conform to these guidelines in due
course).
2 Requirements for Implementation
The utility of this specification depends on the availability of
standardized value sets and notations to provide commonly understood
meanings for the metadata element values. This requirement is a general one
for metadata. However, there is also a more precise requirement for the XML
representation of e-GMS metadata.
The standardized value sets and notations used in e-GMS metadata must have
concise names suitable for use as XML attributes. These names must be
persistent, that is, they must be as far as possible guaranteed to retain
their significance for as long as the metadata is expected to be retained
(including possible preservation as a public record).
The following are NOT suitable for use as names for value sets and
notations in this context:
* URLs or URIs, unless specifically designed for the purpose and guaranteed
by a long-lived and trusted authority
* An XML schema namespace name (it is a technology-specific surrogate for
the notation name).
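
As an illustration of the naming requirement, the sketch below checks that a
value-set name is short and name-like rather than a URL or URI, and then
uses it as an XML attribute value. The attribute name "scheme", the length
limit, and the sample name "GCL" are assumptions made for this example; the
draft does not define them. The check covers syntax only: persistence of a
name is a matter for the registry, not for code.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a "concise name" is a short token of letters, digits, ".", "_"
# and "-". This excludes ":" and "/", and so rejects URLs and URIs.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._-]{0,31}")


def is_suitable_name(name: str) -> bool:
    """True if the name is syntactically suitable as a value-set name."""
    return NAME_PATTERN.fullmatch(name) is not None


def make_element(element: str, scheme: str, value: str) -> ET.Element:
    """Build a metadata element carrying its value-set name as an attribute."""
    if not is_suitable_name(scheme):
        raise ValueError(f"unsuitable value-set name: {scheme!r}")
    node = ET.Element(element, {"scheme": scheme})
    node.text = value
    return node


node = make_element("subject", "GCL", "Environment")
print(ET.tostring(node, encoding="unicode"))
```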
It is recommended that standardized value sets and notations used in e-GMS
metadata are administered in a registry designed for long-term persistence
as a reference for understanding e-GMS metadata. The persistence and
integrity of this registry are essential for the accessibility and usability
of Government information in the long term.
3 References
1. e-Government Interoperability Framework (e-GIF) v4
   http://www.govtalk.gov.uk/interoperability/egif_document.asp?docnum=534
2. e-Government Metadata Standard v1.0 April 2002
   http://www.govtalk.gov.uk/interoperability/metadata_document.asp?docnum=524
3. e-Government Local Metadata Standard for XML Schemas v1.0 May 2002
   GSG paper Q2 2002
4. e-GIF XML Architecture
   GSG paper Q1 2002
5. Resource Description Framework (RDF)
   http://www.w3.org/RDF/
6. Government Data Standards Catalogue (GDSC), all volumes
   http://www.govtalk.gov.uk/interoperability/eservices.asp?order=title
7. Metadata Encoding and Transmission Standard (METS)
   http://www.loc.gov/standards/mets/
8. Dublin Core Metadata Initiative work in progress
   http://www.ukoln.ac.uk/metadata/dcmi/xmlschema/