In article <[log in to unmask]>, Lee,
Edmund <[log in to unmask]> writes
>One point that
>we might want to consider is that the most common subject indexing
>terminologies (Dewey Decimal, Library of Congress and the UNESCO thesaurus)
>are all (as far as I know) single hierarchy thesauri (ie they do not allow a
>term to have more than one broad term, contrary to recommendations in the
>British Standard for thesaurus construction).
>
>This may have implications for user perception of the value of thesauri. If
>the most common thesauri only have this limited functionality, there is a
>risk that people see thesauri generally as not meeting their needs for
>indexing.
>
>Any thoughts?
In article <[log in to unmask]>, Trevor Reynolds
<[log in to unmask]> writes
>
>The AAT is also a single hierarchy thesaurus (possibly because it follows the
>ANSI thesaurus standard?)
>
>Personally I think that polyhierarchical thesauri are more likely to be of use
>to a general user.
Let's be careful that we are not comparing apples with oranges here. The
Dewey Decimal Classification (DDC), Library of Congress Subject Headings
(LCSH) and thesauri such as the UNESCO thesaurus are different types of
subject indexing systems, which work in different ways, and they cannot
be compared directly. They are respectively a classification scheme, a
system of pre-coordinated alphabetical subject headings and a thesaurus.
The concept of polyhierarchy strictly applies only to the last of these,
because the others do not depend on the hierarchical genus-species
(BT/NT) relationships which are fundamental to a thesaurus.
Classification schemes
----------------------
These are used to group items so as to bring related items together and
to arrange them in a helpful sequence on real or virtual shelves
(displays or printed lists) to facilitate browsing and navigation.
All the major classification schemes, such as DDC, UDC, LCC and Bliss
(see URLs below) group concepts primarily by discipline, so that a
subject such as "horses" will appear, in DDC for example, in the
sections dealing with biology, animal husbandry and sport, among others.
The choice of classification for a particular record will depend on how
that record treats the subject, with one specific place being designated
as the location for general works on the subject. The alphabetical index
that should be constructed as part of the work of classifying records
will provide access to all of the places where the subject is to be
found. This is not, strictly speaking, polyhierarchy, but is the
equivalent in the context of a classification scheme.
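As a rough sketch of how such an alphabetical index leads to every place where a subject appears, consider the following. The class marks (B12, H34, S56) are placeholders invented for illustration and are not real DDC numbers; the choice of "animal husbandry" as the place for general works is likewise an assumption.

```python
# Alphabetical subject index: subject -> list of (class mark, discipline).
# The class marks are placeholders, NOT real DDC numbers.
subject_index = {
    "horses": [
        ("B12", "biology"),
        ("H34", "animal husbandry"),
        ("S56", "sport"),
    ],
}

# Assumed for illustration: the one place designated for general works.
general_place = {"horses": "H34"}


def places_for(subject):
    """All class marks under which the subject may be found."""
    return [mark for mark, _ in subject_index.get(subject, [])]


def classify(subject, treatment=None):
    """Class mark matching how the record treats the subject,
    falling back to the designated place for general works."""
    for mark, discipline in subject_index.get(subject, []):
        if discipline == treatment:
            return mark
    return general_place.get(subject)


all_places = places_for("horses")
sport_place = classify("horses", "sport")
default_place = classify("horses")
```

A record on horse racing would go to the sport location, while a general work on horses falls back to the designated place; the index entry still lists all three.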
DDC <http://www.oclc.org/dewey/index.htm>
UDC <http://www.udcc.org/about.htm>
LCC <http://lcweb.loc.gov/catdir/cpso/lcco/lcco.html> and see
the pilot test site, available till 31st May 2000, at
<http://www.lccweb.net/>
Bliss <http://www.sid.cam.ac.uk/bca/bcahome.htm>
Alphabetical subject headings
-----------------------------
The best known scheme of pre-coordinated alphabetical subject headings
is Library of Congress Subject Headings (LCSH) (not to be confused with
the Library of Congress Classification (LCC) referred to above). This is
most conveniently accessible at the pilot test site, available till 31st
May 2000 at <http://www.lccweb.net/>
LCSH was developed before the principles of information retrieval
thesaurus construction were worked out and documented in standards, so
although it is gradually evolving towards these standards there are
still many anomalies. The thesaurus features that are being introduced
do allow multiple broader terms, so to that extent it is
polyhierarchical, e.g.
    Horses
      BT  Domestic animals
          Equus
          Livestock
and
    Horses--Paces, gaits, etc.
      BT  Animal locomotion
          Gait in animals
          Horsemanship
Examples of relationships which do not conform to thesaurus principles
(narrower terms that are not specific types of the broader term) are
    Horses
      NT  Photography of horses
          Travel with horses
The main difference between LCSH and a standard thesaurus is that LCSH is
designed for pre-coordination, i.e. terms are combined into strings to
express compound subjects at the time of indexing, rather than being
assigned separately and combined only in a search statement
(post-coordination). Thus LCSH has entries such as
    Horses--Feeding and feeds--Recipes
      BT  Cookery
    Horses--Religious aspects--Christianity
      NT  Four Horsemen of the Apocalypse
The rules for constructing these strings are somewhat complex, as only
some terms can be used as subheadings. In many cases it is also
permitted to add terms to the string to express place, date and form.
The scheme is intended to provide specific access to subjects as well as
bringing related material together in a useful sequence for browsing. It
does this for the first term in each string, but later terms are scattered;
for example, the last heading shown above would not be found listed
among items on Christianity. Also, because the listing is alphabetical
rather than classified, related terms are scattered - for young horses
you have to look under "F" for "foals".
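The scattering can be illustrated with a small sketch, assuming a simple alphabetical file of heading strings. The heading "Christianity--History" is invented for the example, and the `browse` function is hypothetical, not part of any real catalogue software.

```python
# A few pre-coordinated heading strings, as they might be filed in a catalogue.
headings = [
    "Horses--Religious aspects--Christianity",
    "Horses--Feeding and feeds--Recipes",
    "Christianity--History",  # invented for illustration
    "Foals",
    "Horses",
]


def browse(headings, lead_term):
    """Headings that file under lead_term, in alphabetical order.

    Only the FIRST element of each string decides where it files.
    """
    wanted = lead_term.lower()
    return sorted(h for h in headings if h.split("--")[0].lower() == wanted)


under_horses = browse(headings, "Horses")
under_christianity = browse(headings, "Christianity")
```

Browsing under "Horses" collects the three horse headings together, but the heading ending in "Christianity" does not appear under "Christianity", and "Foals" does not appear under "Horses".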
Thesauri
--------
I think it is best to use the word "thesaurus" only for files of
indexing terms that comply with the national and international standards
for thesaurus construction:
British standard guide to establishment and development of
monolingual thesauri / British Standards Institution. - 1st rev. -
London : BSI, 1987. - 32p ; 30cm. - (BS5723:1987) (ISO2788-1986)

British standard guide to establishment and development of
multilingual thesauri / British Standards Institution. - London :
BSI, 1985. - 63p ; 30cm. - (BS6723:1985) (ISO5964-1985)

Guidelines for the construction, format, and management of
monolingual thesauri / developed by the National Information
Standards Organization : approved August 30, 1993, by the American
National Standards Institute. - Bethesda, Maryland : NISO Press,
1994. - 84p. ; 28cm. - (National Information standards series,
ISSN 1041-5653; ANSI/NISO Z39.19-1993(R1998)). - ISBN:
1-880124-04-1 : $55.00. Available for free download in PDF format
at <http://www.techstreet.com/cgi-bin/detail?product_id=52601>

Both the UK/ISO and the ANSI/NISO standards allow polyhierarchical
relationships. The current edition of the UNESCO thesaurus uses them
only for terms relating to named places, and the AAT does not use them
in its current edition. The AAT Web site states explicitly that this is
because of technical limitations [presumably in the software they use]:
"The AAT is conceptually "polyhierarchical", meaning that a
concept may be placed in two different sections of the hierarchy.
However, the data is physically "monohierarchical" due to current
technical limitations. The polyhierarchy will be physically
realized in 2001."
<http://www.getty.edu/research/tools/vocabulary/aat/about.html>
Thesauri are most commonly designed for "post-coordinate" searching; as
many terms as are appropriate are assigned to each document being
indexed, but these terms are not combined into strings and the
relationship between the terms is not normally indicated. The last LCSH
example above might therefore just be given the two terms "horses" and
"Christianity" without linking these terms. This allows any separate
term to be searched for as easily as any other, either individually or
in an implicit or explicit Boolean search statement such as
"(horses AND Christianity)".
Within a properly constructed thesaurus, all narrower terms will be
"kinds of" the broader term, so that good search software should allow a
search to be "exploded" to retrieve a term and all its narrower terms. A
search of this kind for "horses" would then retrieve documents that had
been indexed with the terms "foals", "cart horses", "stallions",
"mares", "ponies" and so on.
It is this ability to expand searches that makes polyhierarchy
important, as it ensures that a term will be retrieved by an exploded
search of any of its broader terms. It also allows a specific term to be
found by indexers or searchers who may use different paths to navigate
the thesaurus.
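To make this concrete, here is a minimal sketch of a post-coordinate index with an "exploded" search. It is only an illustration under stated assumptions: the document identifiers and their indexing terms are invented, the function names `explode` and `search` are hypothetical, and the two broader terms for "horses" are borrowed from the LCSH example given earlier.

```python
from collections import defaultdict

# Broader-term (BT) links. A term may have several broader terms
# (polyhierarchy); "horses" has two here.
broader = {
    "horses": ["domestic animals", "livestock"],
    "foals": ["horses"],
    "ponies": ["horses"],
}

# Invert the BT links to obtain narrower-term (NT) links.
narrower = defaultdict(set)
for term, bts in broader.items():
    for bt in bts:
        narrower[bt].add(term)


def explode(term):
    """The term together with all its narrower terms, at every level."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(narrower.get(current, ()))
    return found


# Invented documents, each given separate (uncombined) indexing terms.
documents = {
    "doc1": {"horses", "Christianity"},
    "doc2": {"foals"},
    "doc3": {"Christianity"},
}


def search(*terms, exploded=False):
    """Boolean AND over the given terms, optionally exploding each one."""
    hits = set(documents)
    for term in terms:
        wanted = explode(term) if exploded else {term}
        hits &= {d for d, assigned in documents.items() if assigned & wanted}
    return hits


plain_and = search("horses", "Christianity")
livestock_exploded = search("livestock", exploded=True)
livestock_plain = search("livestock")
```

Because "horses" has both "domestic animals" and "livestock" as broader terms, an exploded search on either reaches the documents indexed under "horses" and "foals". If the link to "livestock" were dropped, as a single-hierarchy system would force, the exploded search on "livestock" would find nothing.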
I'm sorry that this has turned out a rather longer message than I meant,
but I found it difficult to resist Edmund's challenging "Any thoughts?"
There is a lot more that could be said, but I'll restrain myself.
I agree with Edmund and Trevor that polyhierarchy is useful and
important, and I think that the main reason it is not more widely
available is the limitations of the software being used for
thesaurus development and for searching. This is unfortunate, because
there is software available for both functions that can cope with it
perfectly well.
Leonard Will
--
Willpower Information (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)20 8372 0094
[log in to unmask] [log in to unmask]
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------