Dear colleagues,
Please find attached a status report (a rather long one, I'm afraid) on
the development of the International Standard Collection Identifier
(ISCI). Comments are welcome.
As noted in the report, following the ISO TC 46 meeting in Washington
last November, Helsinki University Library has a mandate to prepare a
New Work Item proposal for ISCI to ISO. Let me know if you are
interested in participating in the writing of the proposal during the
next few weeks.
Best regards,
Juha Hakala
******************************************************************
Development of International Standard Collection Identifier
Juha Hakala
Helsinki University Library – The National Library of Finland 2005-01-25
The ISO TC46 meeting in Washington, DC, in November 2004 decided to ask
the Finnish Standards Association (SFS) to prepare a new work item (NWI)
for the International Standard Collection Identifier, ISCI. The decision
was based on a discussion paper written at Helsinki University Library –
The National Library of Finland, available at
http://www.collectionscanada.ca/iso/tc46sc9/docs/sc9n394.pdf.
Since the national library chairs the Finnish committee responsible for
ISO TC46 standards, SFS has delegated the task of writing the NWI
proposal to the library. Our aim is to complete the proposal before the
end of February 2005.
Background
There is an “identifier gap” between ISIL, the International Standard
Identifier for Libraries and Related Organizations, and traditional
bibliographic identifiers such as ISSN (serials) and ISBN (books). As
long as collection descriptions were rarely produced, at least in
machine-readable form, this was not much of a problem. But metasearch
applications require collection-level metadata, and especially when this
data is shared on a national or international scale, we need to identify
these collections and/or the metadata records describing them. A unique
identifier will enable, for example, efficient searching of collections
and reliable duplicate control.
Although the need for a standard collection identifier is clear, it is
less obvious how to construct such identifiers. At the most abstract
level, ISCI could be either “dumb”, that is, an ISSN-like string of
characters which gives no hint as to which organization has assigned it,
or “intelligent”. In the latter case, ISCI would incorporate a string of
characters indicating the organization which has assigned the code.
ISCI: architectural principles
The ISCI NWI proposal will be based on the following two principles:
1. ISCI should be an “intelligent” identifier
The reason for this choice is the very large number of organizations
(libraries, archives, museums etc.) which may at some point assign
identifiers for their collections. A dumb code would require a strong
international centre controlling and assisting the work at the national
level, and a network of national centres, each assigning blocks of ISCIs
to organizations willing to assign them. Establishing this
infrastructure would be costly, and usage of ISCI would not spread
quickly into developing countries. For instance, the ISSN network covers
approximately 80 countries, and some of them have been unable to pay
their annual fees. Funding the activities of a large ISCI international
centre could become a problem.
If ISCI is constructed in such a way that identifier assignment can be
delegated to the national level and beyond to individual organizations,
the task of coordinating the system at the international and national
levels becomes much more manageable. There is no longer a need to assign
ISCI blocks centrally and maintain a central database of them. The role
of the international centre would be primarily that of informing
potential users about the system.
2. Extending ISCI into a URN must be simple
A Uniform Resource Name (URN) serves the dual function of identification
and resolution. In simple terms, the latter means that it should be
possible to type a URN into the Location window of a browser; in return
the user gets either the resource itself, or information about it, or
just its present location (URL) on the Web. In principle any identifier
can be converted into a URN, which consists of the string URN:, a
Namespace Identifier (e.g. ISBN:) and a Namespace Specific String, which
is the actual identifier. For instance, ISSN 1458-4387 can be expressed
as URN:ISSN:1458-4387, and this URN could be resolved in the ISSN
database into bibliographic data about the serial, including its Web
address (which is http://www.helsinki.fi/atk/lehdet).
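The three-part URN structure described above can be illustrated with a
minimal Python sketch. The function names make_urn and split_urn are
hypothetical, chosen only for this illustration; the sketch does not
validate the identifier itself and is not a full URN parser.

```python
def make_urn(nid: str, nss: str) -> str:
    """Join a Namespace Identifier and a Namespace Specific String."""
    return f"URN:{nid}:{nss}"


def split_urn(urn: str) -> tuple[str, str]:
    """Split a URN into (NID, NSS); the NSS may itself contain colons."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].upper() != "URN" or not all(parts[1:]):
        raise ValueError(f"not a URN: {urn!r}")
    return parts[1], parts[2]


print(make_urn("ISSN", "1458-4387"))    # URN:ISSN:1458-4387
print(split_urn("URN:ISSN:1458-4387"))  # ('ISSN', '1458-4387')
```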
Any URN can be resolved, but the task is easier if the URN itself helps
in locating the appropriate resolution service. For instance, resolving
ISSN-based URNs would not be possible without the ISSN database in
Paris, which contains approximately one million serial records, one for
each periodical which has an ISSN. On the other hand, ISBNs indicate
language and/or country, and the difficulty of resolving ISBN-based URNs
varies from simple (951-ISBNs can all be resolved in the Finnish
national bibliography) to rather complicated (ISBNs from the German
language area, beginning with 3, may be resolved in either the German,
Austrian or Swiss national bibliography).
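As a rough sketch of how such a hint could steer resolution, the Python
fragment below picks candidate resolvers from the first element of a
hyphenated ISBN. The routing table, the resolver names and the function
candidate_resolvers are all hypothetical, and the ISBNs in the usage
lines are made-up placeholders, not real book numbers.

```python
# Hypothetical routing table: ISBN group identifier -> candidate resolvers.
# The resolver names are placeholders, not real service endpoints.
RESOLVERS_BY_GROUP = {
    "951": ["Finnish national bibliography"],
    "3": [
        "German national bibliography",
        "Austrian national bibliography",
        "Swiss national bibliography",
    ],
}


def candidate_resolvers(urn: str) -> list[str]:
    """Return the resolvers worth trying for a hyphenated ISBN-based URN."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].upper() != "URN" or parts[1].upper() != "ISBN":
        raise ValueError(f"not an ISBN-based URN: {urn!r}")
    group = parts[2].split("-", 1)[0]  # first hyphen-separated element
    return RESOLVERS_BY_GROUP.get(group, [])


# Placeholder ISBNs, used only to show the routing.
print(candidate_resolvers("URN:ISBN:951-0-00000-0"))  # one candidate
print(candidate_resolvers("URN:ISBN:3-00-000000-0"))  # three candidates
```

A group with a single candidate can be resolved directly; a group with
several candidates has to be tried in turn, which is the complication
described above.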
ISCI: central features
We shall propose an International Standard Collection Identifier
consisting of three parts:
* Identifier for the organization
* A character separating organization and collection identifiers
(most likely “_”)
* Internal identifier for a collection owned by the organization
The identifier for organizations will be ISIL, the International
Standard Identifier for Libraries and Related Organizations (see
http://www.bs.dk/isil). This choice has some obvious benefits.
* Each organization having an ISIL identifier (for instance,
Helsinki University Library’s ISIL is FI-H) can assign ISCIs with
no support from an ISCI international or national agency.
* The resulting system is extensible beyond the library domain,
to basically all organizations which have collections that can be
described.
* Given the structure of ISIL, each ISCI-based URN will contain a
hint as to how to resolve it, and this hint will also be
understandable to humans, including laypeople, who will be able to
identify at least the country where the collection is physically
located.
There are arguments against using ISIL as part of the collection
identifier. An ISIL-based ISCI does not provide any information about
where the items in the collection originate from (unless this
information is provided in the internal collection identifier). For
instance, the Slavic collection of Helsinki University Library, built
when the library had legal deposit right in the Russian empire in the
19th century, will look “Finnish”. But the problem with our Slavic
collection and many other collections is that they may never have been
“national”, or if they have been, they no longer are. Our Slavic
collection originates from what was Russia in the 19th century. The
geographical coverage of present-day Russia is quite different; it no
longer includes, among other things, Poland, the Baltic countries, and
Finland.
It may also be claimed that acquiring an ISIL may not be possible for
all organizations interested in describing their collections. But the
Danish National Library Authority, which hosts the International ISIL
centre, has made it clear that it will not prevent any organization from
getting an ISIL code as long as the organization has a valid need for
it.
The separating character should be one which requires no conversion when
ISCIs are expanded into URNs. The choice of such characters which are
not allowed in ISIL is limited; from this set, “_” is probably the best
choice.
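The selection can be sketched as a set difference in Python. Two
assumptions are made here and are not taken from the report: the
URN-safe punctuation is taken to be the "other" character class of the
URN syntax in RFC 2141, and ISIL is taken to allow letters, digits and
the marks "/", "-" and ":". If either character set differs in the
final standards, the candidate list changes accordingly.

```python
import string

# Assumption: punctuation usable in a URN without %-encoding, taken from
# the <other> character class of RFC 2141.
URN_OTHER = set("()+,-.:=@;$_!*'")

# Assumption: ISIL uses letters, digits and the marks "/", "-" and ":".
ISIL_CHARS = set(string.ascii_letters + string.digits + "/-:")

# Separator candidates: URN-safe punctuation that cannot occur in an ISIL.
candidates = sorted(URN_OTHER - ISIL_CHARS)
print(candidates)
print("_" in candidates)  # True
```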
Many organizations have assigned internal names or identifiers to their
collections. These are often well established and mnemonic. For
instance, Helsinki University Library’s Slavic collection is Slavica,
and its national (legal deposit) collection is Fennica. Using these
names as part of the identifier will simplify the identifier assignment
process. ISCI rules and guidelines will be set up in such a way that
these names (converted to roman characters, if need be) can be used as
part of the identifier.
Following these guidelines, Helsinki University Library’s Slavica
collection would get the following ISCI:
FI-H_Slavica
which could be expressed as the following URN:
URN:ISCI:FI-H_Slavica
provided that the Namespace Identifier for ISCI is “ISCI”.
Namespace registration for ISCI cannot start before the ISCI
standardization process in ISO has reached at least the Draft
International Standard level.
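The proposed structure can be summarized in a small Python sketch. The
class name Isci and its methods are hypothetical; the "ISCI" Namespace
Identifier is the unregistered assumption mentioned above; and
country_hint additionally assumes that a two-letter ISIL prefix is a
country code, which need not hold for every ISIL.

```python
from dataclasses import dataclass
from typing import Optional

SEPARATOR = "_"  # separator proposed in the report
NID = "ISCI"     # assumed Namespace Identifier, not yet registered


@dataclass(frozen=True)
class Isci:
    isil: str        # organization identifier, e.g. "FI-H"
    collection: str  # internal collection identifier, e.g. "Slavica"

    def __str__(self) -> str:
        return f"{self.isil}{SEPARATOR}{self.collection}"

    def to_urn(self) -> str:
        return f"URN:{NID}:{self}"

    @classmethod
    def parse(cls, text: str) -> "Isci":
        # The separator is not allowed in ISIL, so split at its first
        # occurrence; the collection part may contain it again.
        isil, sep, collection = text.partition(SEPARATOR)
        if not (isil and sep and collection):
            raise ValueError(f"not an ISCI: {text!r}")
        return cls(isil, collection)

    def country_hint(self) -> Optional[str]:
        # Assumption: a two-letter ISIL prefix is a country code.
        prefix = self.isil.split("-", 1)[0]
        return prefix if len(prefix) == 2 and prefix.isalpha() else None


slavica = Isci.parse("FI-H_Slavica")
print(slavica.to_urn())        # URN:ISCI:FI-H_Slavica
print(slavica.country_hint())  # FI
```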
Future steps
Once the New Work Item proposal has been submitted and, hopefully,
accepted, the real standardization work will begin. A new working group
under ISO TC46 Sub-committee 9 (Identification and Description) will be
formed. It will spend X months herding the proposal through the
various steps (Committee Draft, Draft International Standard and finally
Final Draft International Standard, if necessary). Although some of
these steps can be skipped if the proposal is deemed acceptable, the
process may still require a lot of time. It is realistic to assume that
once the working group is formed the work will take about two years.
--
****************************************************
Juha Hakala
Director, Information Technology
Helsinki University Library
tel +358 9 191 44293 fax +358 9 753 9514
internet: [log in to unmask]
*****************************************************