Recent messages to this list (meta2) pertaining to Z39.50 mention the
"bib-1" attribute set, DC mappings to Z39.50 and even "integration of
Dublin Core into Z39.50" (though I'm sure there isn't any common
understanding of what that means!).
>From: [log in to unmask] (Kevin C. Marsh)
>
>I agree that integration of Dublin Core into Z39.50 is critical. As the DC
>syntax begins to settle down and become standardized (It is becoming
>standardized, isn't it? Please? ) it will be necessary to define a standard
>mapping between ALL elements.qualifiers and Bib-1 attributes.
So I wish to describe (at a high level) some of the Z39.50/DC issues
pertaining to searching, based on recent discussion among Z39.50
implementors. This is not a proposal, just an attempt to describe some of
the issues, and I do not wish to represent this as any sort of consensus. But I
hope these issues will be considered during the discussion of Z39.50 at the
meeting next week (which unfortunately I cannot attend).
Background
----------
Most people who have casual familiarity with Z39.50 know of the "bib-1"
attribute set. Attribute sets in Z39.50 are developed to support
searching; attributes describe search terms. Theoretically, different
attribute sets are developed for different applications or disciplines.
The bib-1 attribute set was originally developed to support bibliographic
searching (i.e. searching bibliographic databases). Z39.50 originally was
developed with a strong bibliographic bias, and so for a long while, bib-1
was the only existing attribute set; thus whenever an implementor wanted
a new search access point ("Use" attribute in Z39.50 terminology), he/she
would propose it for addition to bib-1, since that was the only attribute
set supported. Thus many attributes were added to bib-1 that were not
necessarily, strictly speaking, bibliographic, for example "date of last
update".
And so, bib-1 came to be thought of (by many, but not by everyone) as
more than just for bibliographic databases. In particular, as new
discipline-specific attribute sets were being developed, bib-1 came to be
thought of as the general or utility attribute set. However it must be
stressed that there are implementors who are offended by this view and
still think of bib-1 as the bibliographic set. Nevertheless, bib-1 has
become bloated by years of expansion, and everyone seems to agree that it
could well use a major overhaul.
It should also be noted that "bibliographic information" is not
synonymous with "metadata", at least, not in most Z39.50 implementors'
views, and particularly, not in the views of those most bibliographically
biased -- on the other hand, it seems to me that people in other
disciplines (e.g. museum information) tend to be more inclined to view
bibliographic information as more or less synonymous with metadata, and
therefore view bib-1 more as the "core" attribute set. (Again, this is a
generalization and there are some who will take issue with even this
assertion.)
A year or so ago the ZIG decided that explicit guidelines for the
development of attribute sets should be developed -- an "attribute
architecture" -- and a ZIG committee was assigned for this, chaired by
Cliff Lynch. The work of that group is complete, and the architecture is
nearly finalized. It was also (tentatively) agreed (at the time the
architecture effort was launched) that there should be a *bib-2* attribute
set developed, based on the new architecture, but that this new attribute
set would be developed by bibliographic experts (in conjunction with
Z39.50 experts, but not by Z39.50 experts alone).
So to summarize (so far), there are attribute sets developed for
specific disciplines: bibliographic (bib-1 and eventually bib-2), museum
information (CIMI), government information (GILS), geospatial (GEO),
scientific and technical information (STAS), and others. Bib-1 is based on
old, obsolete attribute architectural concepts (implicit concepts, never
explicitly articulated); bib-2 will be built on an architecture that
represents the current state of thinking (it will not be developed until
the new architecture is complete). Between these two extremes, for
example, CIMI has tried to base its development on enlightened thinking,
but the completed architecture is not yet available, so CIMI has been
forced to make some assumptions about the new architecture.
[end background]
Searching on DC Access points
-----------------------------
People (from a variety of disciplines -- CIMI, GILS, geo, bibliographic)
are asking how to search on DC access points, and there have been a number
of approaches considered:
(1) Provide a mapping from DC elements to the bib-1 attribute
set, and consider this the canonical DC to Z39.50 mapping
for searching.
(2) Provide mappings from DC to individual attribute sets,
including bib-1; in this case, bib-1 is considered "just
another attribute set".
(3) Define a new attribute set, the Dublin Core attribute set.
Approach (1) has been fairly well rejected. Approach (2) has stirred
up much controversy, but hasn't been completely rejected. And most
everyone participating in the discussion so far seems to agree that (3) is
a necessary step, no matter whether we do (2) or not. It is also felt that
the development of a DC attribute set should be done under the auspices of
the DC group. Cliff Lynch is prepared to discuss this at the meeting.
So it seems clear that there will be a DC set developed, and though
it may take some work, it should be a reasonably straightforward process.
What is not clear is whether there need also be mappings of DC elements to
individual attribute sets.
Issues
------
Ralph LeVan has addressed some of the technical issues in his paper
(accessible from the DC web site), in particular issues related to the
Z39.50 protocol version. So I'm not considering these issues here (and am
assuming Z39.50 version 3 for this discussion).
If a well-developed DC attribute set is defined, widely accepted and
implemented *both* by Z39.50 clients and Z39.50 servers, this would go a
long way towards obviating the need to define explicit mappings from DC to
individual attribute sets and would result in simpler and more elegant
searches. Consider, for example, this scenario: A server supports GILS and
CIMI databases, and supports searching the two in combination (granted, an
odd combination indeed, but useful for illustrative purposes). The user
wants to search on "author or creator". (The user may or may not know
about Dublin Core, but let's assume the client does, and that "author or
creator" is a "core element" that is offered as an access point). If the
DC attribute set is supported by both client and server, the client
formulates the search using the DC access point "author or creator". If
the DC set is not available, the client would have to construct a rather
complex query.
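To make the contrast concrete, here is a rough sketch of the two queries the
client might build in this scenario. It is only an illustration: the
attribute-set labels, the access-point names ("originator", "creator") and
the helper functions are all placeholders invented for this example, not
values taken from the GILS, CIMI, or any eventual DC attribute set, and the
tuples stand in for a real Z39.50 query structure.

```python
# Illustrative only: every attribute-set label and access-point name below is
# a hypothetical placeholder, not a registered Z39.50 attribute value.

def term(attr_set, use, value):
    """One search operand: a term qualified by a Use attribute from one set."""
    return ("term", attr_set, use, value)


def any_of(operands):
    """Combine operands with OR, left to right, into a nested binary tree."""
    tree = operands[0]
    for operand in operands[1:]:
        tree = ("or", tree, operand)
    return tree


# Assumed per-set equivalents of the DC "author or creator" element.
DC_TO_NATIVE = {
    "author-or-creator": [("GILS", "originator"), ("CIMI", "creator")],
}


def query_with_dc_set(dc_element, value):
    """DC set supported by client and server: a single operand suffices."""
    return term("DC", dc_element, value)


def query_without_dc_set(dc_element, value):
    """No DC set: the client must OR together one operand per native set."""
    return any_of([term(s, u, value) for s, u in DC_TO_NATIVE[dc_element]])


simple = query_with_dc_set("author-or-creator", "Smith")
complex_query = query_without_dc_set("author-or-creator", "Smith")
```

With only two databases the second query is merely one OR of two operands,
but it grows with every additional attribute set the server's databases use,
and the client has to carry the mapping table for all of them.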
But the other side of the issue is this: The server is going to have
to support the GILS attribute set if it provides access to a GILS
database, and the CIMI set for a CIMI database, etc. There is no choice
here. Is the server willing to support the DC attribute set too? And this
question is most meaningful perhaps when a server only supports a single
application, say GILS. The server has to support the GILS attribute set.
Will it also support DC (based on the presumption that some of the
searches it receives will be constructed using the GILS set and others
using the DC set)? When a server supports many different types of
information and supports searches across various databases, the
incremental cost to support DC is less, and the gains are higher.
To put the question another way: in order to support the user/client
capability to search on DC core-element access points, where should the
heavier burden reside: at the client or the server/db provider? The DC
attribute solution seems to ease implementation for the client but places
a heavier burden on the server, and the "mapping" approach puts a heavier
burden on the client (while hiding the DC aspects from the server).
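As a rough picture of what the server-side burden might look like, the sketch
below has the server accept a term expressed in the DC set and rewrite it for
each database it searches. This is an assumption about one possible
implementation, not a description of any existing server; the database
names, the attribute-set labels, and the access-point names are hypothetical
placeholders.

```python
# Illustrative only: database names, attribute-set labels, and access-point
# names are hypothetical placeholders, not registered Z39.50 values.

# Assumed server-side table: for each database, the native access point that
# corresponds to a DC element.
SERVER_MAP = {
    "gils-db": {"author-or-creator": ("GILS", "originator")},
    "cimi-db": {"author-or-creator": ("CIMI", "creator")},
}


def translate_dc_term(database, dc_element, value):
    """Rewrite a DC-set search term into the database's native attribute set."""
    try:
        attr_set, use = SERVER_MAP[database][dc_element]
    except KeyError:
        raise LookupError(
            f"no mapping for {dc_element!r} in {database!r}"
        ) from None
    return (attr_set, use, value)


def search_all(databases, dc_element, value):
    """Produce one native term per database for a combined search."""
    return {db: translate_dc_term(db, dc_element, value) for db in databases}


native_terms = search_all(["gils-db", "cimi-db"], "author-or-creator", "Smith")
```

The mapping table is the same knowledge the client needs under the "mapping"
approach; the DC attribute set approach simply moves it to the server, which
must then maintain it for every database it offers.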
It would be useful, to the Z39.50 implementors, for the DC folks to
consider some of these implementation questions during the Z39.50
discussions at the meeting next week, and we welcome feedback.
Ray Denenberg
Library of Congress
202-707-5795