HIERARCHIES:
There was extensive email discussion about this on the
dc-coverage reflector, and we did not come to consensus.
Here are some of the reasons why not:
a. Everyone agreed that the majority of the human race
is going to refer to places by name, not by geog. coordinates.
The reason for the coordinates is that it's a much more
exact method of searching.
b. How extensive should the hierarchy be? Believe it or
not, this is a tough one. Those of us in the cataloging world
do it the lazy way - we let the Anglo-American Cataloguing
Rules and (in the U.S.) the Library of Congress figure it
out for us; and anyone who uses any sort of geographic-name
thesaurus (e.g., the to-come Thesaurus that the Getty is
working on) does the same. E.g., is it:
Earth - United States of America - Colorado - Jefferson County - Golden?
or just: United States of America - Colorado - Golden? etc.
(If you follow Library of Congress, you go for: Golden (Colo.) )
c. If one decides on a hierarchy - then how does one get
persons to use it? My experience in libraries tells me that
an information user is very centered on a topic and doesn't
realize that there could ever be a conflict - and a do-it-yourself
metaloger probably isn't going to be interested in typing out
a full hierarchy that seems like overkill.
d. If we're talking about free-text search - does it matter
which direction a hierarchy goes? (Larger area to smaller area or
smaller area to larger area) In theory, what order the metaloger
types in the words shouldn't make any difference in search
capabilities. (Yes, I realize there are many reasons why
It Matters - this is devil's-advocate time).
e. What seems obvious depends upon who you are and what's
home to you. Persons who live in Virginia perceive "VA" as
being enough of an identifier; persons who live in Gabon
might well need: United States of America - Virginia.
And so on.
These are all problems that have been argued over long ago
in the cataloging world, and we've come up with something
that - while it is not perfect - works in that world.
Nothing is perfect; the 90-95% solution is not bad at all.
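
To make point (d) concrete, here is a minimal sketch of an order-independent free-text match. The function names are invented for illustration and do not describe any real search engine; the only claim is that a bag-of-words match gives the same answer whichever direction the hierarchy was typed in.

```python
import re

def tokens(text):
    """Lowercase a coverage string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(query, coverage):
    """True when every query word appears somewhere in the coverage value."""
    return tokens(query) <= tokens(coverage)

large_to_small = "United States of America - Colorado - Golden"
small_to_large = "Golden - Colorado - United States of America"

# The same query is run against both orderings of the same hierarchy.
print(matches("golden colorado", large_to_small))
print(matches("golden colorado", small_to_large))
```

The same sketch also hints at why It Matters: once the words are in a set, nothing records which word names the larger area and which the smaller.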
POSTAL CODES: this is easily handled in Coverage as a
scheme - e.g., "U.S.ZipCode". Andrew Daviel has reasons
why postal codes ain't perfect.
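
As a rough illustration of "postal code as a scheme", the sketch below pairs a Coverage value with a scheme name and checks only the shape of the value. The Coverage class and is_well_formed function are assumptions made up for this example; only the scheme name "U.S.ZipCode" comes from the text above, and the pattern tests the 5-digit or ZIP+4 form, not whether a code is actually assigned.

```python
import re
from dataclasses import dataclass

# Shape check only: five digits, optionally followed by a hyphen and four digits.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")

@dataclass(frozen=True)
class Coverage:
    scheme: str  # e.g. "U.S.ZipCode"
    value: str

def is_well_formed(cov):
    """Check the value against its scheme; unknown schemes pass as free text."""
    if cov.scheme == "U.S.ZipCode":
        return ZIP_PATTERN.fullmatch(cov.value) is not None
    return True

print(is_well_formed(Coverage("U.S.ZipCode", "80401")))
print(is_well_formed(Coverage("U.S.ZipCode", "8040")))
```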
GLOBAL POSITIONING SYSTEMS: GPS are highly precise systems,
and again I see this as Just Another Scheme. And here again,
I turn this over to Andrew (he and I had an email exchange
on this about ?a year? ago)
FORM OF GEOGRAPHIC NAME: Another one of those areas to which
the cataloging world has given considerable thought.
In the U.S., catalogers accept first of all whatever's in LCNAF/LCSH;
and for the many place names not there, the source for both
non-U.S. and U.S. names is the U.S. Board on Geographic Names.
(We also use U.S. Geological Survey as a source for placenames
in the U.S. that do not yet appear in BGN). Both of these
gazetteers are available over the Web; several countries have
Web gazetteers available (Arthur provided the Website for Australian placenames
on this list just recently; Canada has a site; any other
countries?) Basic idea is to accept the form used by the
locals (generally using only the Roman alphabet) where possible, at the same
time accepting reality and knowing that the chances of
getting the U.S. public to call Rome "Roma" are not real good.
(BGN sticks to its guns and calls it "Roma", with a cross
ref. from "Rome - conventional name.")
Mary Larsgaard
------- Forwarded Message
Date: Wed, 1 Oct 1997 13:23:17 -0500
To: "Weibel,Stu" <[log in to unmask]>
From: Jordan Reiter <[log in to unmask]>
Subject: RE: DATE and PUBLISHER element definition change proposal
Cc: "'Misha Wolf'" <[log in to unmask]>, meta2 <[log in to unmask]>
Sender: [log in to unmask]
Precedence: bulk
Weibel,Stu felt an urge to reveal at 3:24 AM -0500 on 1997-09-30:
> Mary's Coverage workgroup paper adopts a strongly cartesian or
> foot-print view of coverage.
>
> I can also imagine it being used in an unqualified free-text mode...
> coverage = Columbus, Ohio
>
> very loose semantics, indeed. Useful? Probably less so than a more
> strongly-typed version. Eric mentioned to me that some substantial
> percentage (40%?) of all web searches are for local resources.
> <red-neck-accent>Don't need no bounding box to find a list of all the
> Holiday Inns in Columbus.</red-neck-accent>
I'm not sure what you mean by "strongly-typed" version, but I think *very*
few people doing any kind of search will type in the exact geographic
location (longitude, latitude, and minutes) of that place--just the name.
I think it would be better to set up a clear syntax of location based on
geographic names. I swear we had this thread earlier, and I said something
along the lines of "We should have a sort of hierarchal system"--preferably
top-down.
For example, you might put "US-VA-Charlottesville-Downtown Mall". This
would indicate that the general coverage is the US, then more specifically
the state of Virginia, then the city of Charlottesville, then the area
known as the Downtown Mall. This would be perfect for a page describing
attractions/shops on the Downtown Mall in Charlottesville, VA US. I think
that using standardized postal abbreviations for upper-hierarchal locations
makes the most sense--in my example, Virginia should only be indicated by
VA, not Va or other variants.
However, the last item *shouldn't* be abbreviated. For example, if I
specified simply a document covering Virginia, then it should be
"US-Virginia". Why the change? Think of how the usual search terms would
be used. If you're looking for something in a town, you might type the
name of the town and the abbreviation of the state, e.g., "Richmond VA". But
if you're doing a search for something in the state of Virginia, you'd
probably type "Virginia" in full. The same applies if the coverage is for
the entire US--people might type in US, but they're much more likely to
type in "United States".
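
The rule described above (abbreviate every upper level, spell out the last one) can be sketched as follows. The ABBREVIATIONS table and the format_coverage name are hypothetical and cover only the examples in this message; a real table would have to list every country and region.

```python
# Hypothetical lookup table, limited to the names used in this message.
ABBREVIATIONS = {
    "United States": "US",
    "Virginia": "VA",
}

def format_coverage(levels):
    """Join levels top-down, abbreviating every level except the last."""
    if not levels:
        raise ValueError("at least one level is required")
    upper = [ABBREVIATIONS.get(name, name) for name in levels[:-1]]
    return "-".join(upper + [levels[-1]])

print(format_coverage(["United States", "Virginia", "Charlottesville", "Downtown Mall"]))
# US-VA-Charlottesville-Downtown Mall
print(format_coverage(["United States", "Virginia"]))
# US-Virginia
print(format_coverage(["United States"]))
# United States
```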
The geographic coverage could then have many levels of granularity, similar
to the ISO DATE's granularity. While writing out records as "Town, Region,
Country" may be more familiar, since metadata is not specifically for
humans to look at but instead for record keeping, it doesn't really matter.
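
Here is a small sketch of how levels of granularity might be compared, much as a year-only date covers every day in that year. It assumes hyphen-separated levels written top-down and a hypothetical FULL_NAMES table for expanding abbreviations, so that "US-Virginia" and "US-VA-..." line up; place names that themselves contain a hyphen would need a different separator.

```python
# Hypothetical expansion table, limited to the names used in this message.
FULL_NAMES = {"US": "United States", "VA": "Virginia"}

def levels(coverage):
    """Split a top-down coverage string and expand known abbreviations."""
    return [FULL_NAMES.get(part, part) for part in coverage.split("-")]

def covers(broad, narrow):
    """True when `broad` names the same place as `narrow` or an area containing it."""
    b, n = levels(broad), levels(narrow)
    return n[:len(b)] == b

print(covers("US-Virginia", "US-VA-Charlottesville-Downtown Mall"))
print(covers("US-VA-Richmond", "US-VA-Charlottesville"))
```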
Some other options: Use a zip-code schema (this would work for US--I don't
know how many countries use zip codes though, or similar things). Also,
apparently there's this global positioning systems standard, although that
seems a little too specific.
The name of countries and cities should correspond to the language of the
document. For example, a document in Italian should use "Roma", while a
document in English should use "Rome". This way, the documents found will
most likely correspond to the user's language preferences as well.
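
A minimal sketch of the idea, assuming a hypothetical table of place names keyed by language code (the table, the keys, and the fallback to English are all invented for illustration):

```python
# Hypothetical table of place names keyed by two-letter language code.
PLACE_NAMES = {
    "rome": {"en": "Rome", "it": "Roma"},
}

def name_for(place_key, language, default_language="en"):
    """Return the place name in the document's language, else the default."""
    names = PLACE_NAMES[place_key]
    return names.get(language, names[default_language])

print(name_for("rome", "it"))  # Roma
print(name_for("rome", "en"))  # Rome
```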
- --------------------------------------------------------
[ Jordan Reiter ]
[ mailto:[log in to unmask] ]
[ "Don't you realize that intellectual people ]
[ are all ignorant because they can't spray ]
[ paint that small?" ]
- --------------------------------------------------------
------- End of Forwarded Message