--- Begin Forwarded Message ---
Date: Tue, 06 Jun 2000 12:48:41 +0100
From: Humphrey Southall <[log in to unmask]>
Subject: Estimating Administrative Boundaries from other data
Sender: [log in to unmask]
To: [log in to unmask], [log in to unmask], [log in to unmask]
Reply-To: Humphrey Southall <[log in to unmask]>
Message-ID: <[log in to unmask]>
Please re-post to other lists.
Estimating Administrative Boundaries from other data
====================================================
The questions raised here arise out of a European workshop on mapping
historic boundaries, held in Florence/Firenze this last weekend; for more
details of the workshop, see www.geog.port.ac.uk/hist-bound. I am hoping
that someone can point me to existing research, but if these issues have
not been investigated they could well make an interesting student project,
and I could supply relevant data.
Context
=======
While most countries in Europe possess detailed digital boundary data for
mapping census data and other information gathered in the last 20-30 years,
boundaries usually change substantially over time, and the further back in
time we go the harder it is to establish what boundaries existed. Old maps
and lists of boundary changes often exist, and it is possible to convert
them into a GIS. However, such projects are expensive and time-consuming,
and this is especially true if boundary change is a constant process and
has to be recorded in a continuous-time dynamic GIS. My own project has
almost completed this task for England and Wales for the period from 1876
onwards, at parish-level, but this has cost the best part of half a million pounds, and
similar projects may be too expensive for other countries, and particularly
for individual researchers (incidentally, for details of our project, see
www.geog.port.ac.uk/gbhgis; our Civil Parish boundaries for 1876-1911 have
just been passed to UKBORDERS and should be available on-line in the next
couple of months).
The meeting in Florence concluded that a desirable first stage in the
construction of a national (or European) historical GIS is the assembly of
a systematic gazetteer and place-name thesaurus. This is a conventional
database rather than a GIS, containing definitive lists of the
administrative units that existed over time, the many different sets of
hierarchical relationships that existed between them, whatever textual
information exists about boundary changes, and information about the
variant forms of place-names; all this information needs to be provided
with date-stamps and references to the books and documents from which it
was assembled. Information on "places", meaning settlements with no
particular legal status, and natural features may or may not be part of
such a gazetteer or thesaurus.
Whether such a system is or is not a GIS, it is certainly a computer System
that contains a great deal of Information about Geography. Perhaps the
best known example of such a system is the Getty Information Institute's
Thesaurus of Geographical Names (see www.gii.getty.edu), although its
content for countries other than the United States is a bit limited and it
contains no time dimension. The Swedish national archives are developing
such a system WITH a time dimension, and there is work afoot in the UK, at
EDINA, the Data Archive and my own project. There is also a great deal of
more traditional research, generally in the archives community, which has
constructed paper-based historical gazetteers, thesauri and place-name
authority lists; creating on-line equivalents will often involve
computerising such publications, rather than carrying out new research, and
my own project centers on computerising F. Youngs' "Local Administrative
Units of England" (Royal Historical Society, 1979 and 1991; we have the
RHS's permission for this).
The issues:
===========
The questions I am raising in this mailing are not about how to build such
computerised gazetteers/thesauri (although I would be interested to hear
from anyone who has built such a system with a time dimension). Instead, I
am interested in how adding a LIMITED amount of locational data to such a
system could permit it to be used as a low-cost historical GIS.
Incidentally, I am quite sure that some of the more technical work my
project is doing, which involves re-districting statistical data gathered
for many different dates and reporting geographies into a single
standardised output geography REQUIRES a true boundary GIS for accurate
results. I am raising these issues now partly because not every country
and project has the money or time we have had, and partly because we are
interested in working with British data for earlier dates, long before the
start of the continuous record of boundary changes on which our GIS is built.
In what follows, the questions I am looking for answers to start at (4):
(1) A basic place-name thesaurus such as Youngs (see above) knows about
hierarchies but not about locations; it can be used to aggregate low-level
data to a higher level, and it might under some circumstances be used to
take data for one high-level set of areas (e.g. local government
districts), allocate it to lower-level components (e.g. parishes) and then
re-aggregate to a different higher-level system (e.g. Parliamentary
Constituencies). However, as it contains no geo-referencing at all it
cannot be used to create a map.
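A minimal sketch of the re-aggregation described in (1), assuming the
thesaurus has been reduced to two lookup tables and that each district
total is split equally among its parishes. The equal split, the unit names
and the figures are illustrative assumptions, not anything taken from
Youngs.

```python
from collections import defaultdict

# Hypothetical thesaurus extracts: parish -> district, parish -> constituency.
PARISH_TO_DISTRICT = {"A": "D1", "B": "D1", "C": "D2", "D": "D2"}
PARISH_TO_CONSTITUENCY = {"A": "P1", "B": "P2", "C": "P2", "D": "P2"}


def reaggregate(district_totals, to_district, to_target):
    """Split each district total equally among its parishes, then sum by target unit."""
    members = defaultdict(list)
    for parish, district in to_district.items():
        members[district].append(parish)

    result = defaultdict(float)
    for district, total in district_totals.items():
        share = total / len(members[district])  # assumption: equal shares
        for parish in members[district]:
            result[to_target[parish]] += share
    return dict(result)


constituency_totals = reaggregate(
    {"D1": 100.0, "D2": 60.0}, PARISH_TO_DISTRICT, PARISH_TO_CONSTITUENCY
)
```

In practice the shares would more plausibly be weighted by parish
population or area where those figures exist; the equal split is only the
simplest possible placeholder.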
(2) If we add some basic co-ordinates for the objects in the thesaurus, we
can create simple point maps. In British history, this might well mean
adding National Grid co-ordinates for parishes, and I know of at least
three projects which have done this independently for England, based on the
locations of churches (to either 1 km. or 100 m. accuracy).
(3) If we need AREAS, so we can create choropleth maps -- and many
historical projects want to do this -- we can synthesise polygons
surrounding each point by computing Thiessen Polygons, constructed from
sets of lines equi-distant from each pair of adjacent points (e.g. parish
churches), and at right angles to the lines linking them (see, for example,
Haggett et al, _Locational Models_ (London, 1977), pp.436-9). If we
generate a set of "parish boundaries" by constructing Thiessen polygons
around the churches, the results will obviously be very approximate indeed
for an individual parish. However, if we then use our thesaurus to combine
the estimated parishes into, say, counties, the approximation will be a
good deal better. Equally, if we plot parish-level information onto a
choropleth map of England and Wales -- or the whole of Europe -- and print
it onto (say) A4-size paper, it will be hard to tell a set of Thiessen
polygons from the real thing (in fact, if you use ArcPlot to create the
final map, it will so generalise the real boundaries that the difference
will be barely apparent even under a magnifying glass!).
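The idea in (3) can be sketched on a raster rather than with exact
geometry: every grid cell is given to its nearest church, which
approximates the Thiessen polygons, and the parish-to-county lookup from
the thesaurus then merges cells into counties. The church co-ordinates,
names and grid size below are hypothetical, and a computational-geometry
library would be needed to obtain true polygon edges.

```python
def thiessen_raster(seeds, width, height):
    """Label each unit cell of a width x height grid with its nearest seed."""
    labels = {}
    for y in range(height):
        for x in range(width):
            cx, cy = x + 0.5, y + 0.5  # cell centre
            labels[(x, y)] = min(
                seeds,
                key=lambda name: (cx - seeds[name][0]) ** 2
                + (cy - seeds[name][1]) ** 2,
            )
    return labels


# Hypothetical church co-ordinates and parish -> county lookup.
CHURCHES = {"Aston": (2.0, 2.0), "Burton": (7.0, 2.0), "Carlton": (5.0, 8.0)}
COUNTY_OF = {"Aston": "Westshire", "Burton": "Westshire", "Carlton": "Eastshire"}

parish_cells = thiessen_raster(CHURCHES, 10, 10)
# Combining estimated parishes into counties is a simple relabelling.
county_cells = {cell: COUNTY_OF[parish] for cell, parish in parish_cells.items()}
```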
THE ISSUE IS HOW CAN WE DO BETTER THAN THIS IF WE HAVE _SOME_ MORE SPATIAL
DATA, BUT STILL NOT THE ACTUAL BOUNDARIES. NB THIS IS A KIND OF
REGION-BUILDING PROBLEM, BUT THE AIM IS NOT TO COMPUTE "OPTIMAL" BOUNDARIES
BUT TO APPROXIMATE THE BOUNDARIES THAT ACTUALLY EXISTED. WHILE THE GOAL IS
OBVIOUSLY TO CREATE ESTIMATED BOUNDARIES WHERE WE HAVE NO RECORD OF WHAT
ACTUALLY EXISTED, OR ONLY A PARTIAL RECORD, THERE IS PLENTY OF DATA ON REAL
BOUNDARIES, BOTH MODERN AND HISTORICAL, WHICH COULD BE USED TO ASSESS HOW
WELL OUR ESTIMATION PROCEDURES WORK.
(4) These problems often arise in working with historical census data, and
these often contain a figure for the area of each unit. There may well be
problems with the accuracy of these figures, but if we assume they are
correct can we use them to create better approximations than Thiessen
polygons? Assume we have both a single co-ordinate for each parish AND an
area. The co-ordinate is more likely to be central to the parish than near
an edge, but the only thing that is certain is that it lies somewhere
within the area of the parish. My guess is that estimating boundaries here
requires an iterative procedure, and I am unsure what the objective
function should be although my guess is that compactness matters as well
as, obviously, the complete partitioning of the total area. IT IS NOT TOO
HARD TO OUTLINE A METHODOLOGY HERE, BUT HAS ANYONE ACTUALLY DEVELOPED ONE?
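One possible shape for such an iterative procedure, offered only as a
sketch and not as an established answer to the question: treat the
partition as a weighted Voronoi ("power") diagram on a raster, where each
cell goes to the unit minimising squared distance minus a per-unit weight,
and nudge the weights until the cell counts match the recorded areas. The
choice of a power diagram, the update rule, the rate and the test data are
all assumptions. Compactness is not optimised explicitly here; it comes
only from the fact that power-diagram cells are convex.

```python
def assign(seeds, weights, width, height):
    """Give each grid cell to the seed with the smallest power distance d^2 - w."""
    labels = []
    for y in range(height):
        for x in range(width):
            cx, cy = x + 0.5, y + 0.5  # cell centre
            labels.append(min(
                range(len(seeds)),
                key=lambda i: (cx - seeds[i][0]) ** 2
                + (cy - seeds[i][1]) ** 2 - weights[i],
            ))
    return labels


def fit_areas(seeds, targets, width, height, rate=0.5, max_iter=200):
    """Adjust one weight per seed until cell counts approach the target areas."""
    weights = [0.0] * len(seeds)
    for _ in range(max_iter):
        labels = assign(seeds, weights, width, height)
        areas = [labels.count(i) for i in range(len(seeds))]
        errors = [t - a for t, a in zip(targets, areas)]
        if not any(errors):
            break
        # A unit that is too small gets a larger weight, so it gains cells.
        weights = [w + rate * e for w, e in zip(weights, errors)]
    return labels, areas


# Two hypothetical parishes on a 20 x 10 grid, recorded areas 140 and 60 cells.
labels, areas = fit_areas([(5.0, 5.0), (15.0, 5.0)], [140, 60], 20, 10)
```

Note that a weighted scheme of this kind does not by itself guarantee that
each co-ordinate stays inside its own estimated unit, so that condition
would need a separate check.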
(5) Boundaries in the real world have an affinity to certain physical
features, notably rivers and the ridges of hills. Can we create better
estimated boundary lines if we have a modern coverage of waterways and
Digital Elevation Modelling data? Ignore for now the question of whether
the physical geography has changed over time. The suggestion is that
boundaries that are initially estimated by some other procedure should
"snap to" natural physical boundaries if they get close enough -- but if
you have data on true polygon areas these would still have to be adhered to.
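The "snap to" step in (5) might look something like the following sketch,
which moves each vertex of an estimated boundary onto the nearest point of
a river line when it lies within a tolerance. The co-ordinates and the
tolerance are made up, and the sketch does nothing to restore known
polygon areas after snapping.

```python
import math


def nearest_on_segment(p, a, b):
    """Return the point on segment a-b closest to p."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment
    return (ax + t * dx, ay + t * dy)


def snap(boundary, feature, tolerance):
    """Move each boundary vertex onto the feature line if it is within tolerance."""
    snapped = []
    for p in boundary:
        candidates = [nearest_on_segment(p, a, b)
                      for a, b in zip(feature, feature[1:])]
        q = min(candidates, key=lambda c: math.dist(p, c))
        snapped.append(q if math.dist(p, q) <= tolerance else p)
    return snapped


# Hypothetical river along y = 0 and an estimated boundary nearby.
river = [(0.0, 0.0), (10.0, 0.0)]
estimated = [(1.0, 0.4), (5.0, 3.0), (9.0, -0.2)]
snapped = snap(estimated, river, tolerance=0.5)
```

Here the first and last vertices are pulled onto the river while the
middle one, three units away, is left where it was. The same routine would
serve for ridge lines extracted from elevation data.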
(6) Some durable man-made features, such as Roman Roads, may also "attract"
boundaries. Anyone interested in exploring this might find a recent book
useful: "The Parish, its bounds and its divisions", ch.3 in N.J.G.Pounds,
_A History of the English Parish_ (Cambridge, 2000), pp.67-112. Figure 3.1
shows boundaries in Cambridgeshire "snapping-to" Roman Roads and an old
dyke while figure 3.3 shows them being attracted by a point feature -- a
well in an arid area.
(7) Can any existing but partial Digitised Boundary Data (DBDs) help? For
example, we may want to create estimated boundaries for parishes or French
communes, and have available DBDs for counties or departements. The use of
these when computing Thiessen polygons is pretty obvious, but how might
they be used if we also had data on parish/commune areas? A modern
coastline could be similarly used to constrain estimates of the boundaries
of historic units.
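For the Thiessen case in (7), one reading of the "pretty obvious" use of
higher-level boundaries is to let each location choose only among the
lower-level units that the thesaurus places in the same higher-level unit.
A raster sketch, with a hypothetical county mask standing in for real DBDs
and invented parish names:

```python
def constrained_thiessen(seeds, county_of, county_mask):
    """Assign each cell to the nearest seed belonging to the cell's own county."""
    labels = {}
    for (x, y), county in county_mask.items():
        cx, cy = x + 0.5, y + 0.5  # cell centre
        candidates = [name for name in seeds if county_of[name] == county]
        labels[(x, y)] = min(
            candidates,
            key=lambda n: (cx - seeds[n][0]) ** 2 + (cy - seeds[n][1]) ** 2,
        )
    return labels


# Hypothetical county boundary along x = 6 on a 10 x 4 grid.
COUNTY_MASK = {(x, y): ("West" if x < 6 else "East")
               for x in range(10) for y in range(4)}
SEEDS = {"Aston": (1.0, 2.0), "Burton": (7.0, 2.0)}
COUNTY_OF = {"Aston": "West", "Burton": "East"}

labels = constrained_thiessen(SEEDS, COUNTY_OF, COUNTY_MASK)
```

Cell (5, 2) is closer to Burton's church but lies in the western county,
so it is given to Aston. A coastline can be handled the same way by simply
leaving sea cells out of the mask.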
(8) We may have detailed modern DBDs as well as a historical thesaurus.
While many boundaries have changed (there were over 20,000 boundary changes
affecting the boundaries of the Civil Parishes of England and Wales between
1876 and 1974), many have stayed the same. If we took, for example, the
1981 British ward boundary file that is available from UKBORDERS, or an
equivalent dataset for another country, and linked it to a historical
gazetteer/thesaurus, could a piece of software be written that would
include those modern boundaries which seemed compatible with what we knew
about the historical geography, and replace the remainder with synthesised
boundaries? This is trivial if our gazetteer/thesaurus includes a full
record of boundary changes, but assume it does not.
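A very rough sketch of the selection step in (8), under one assumed
definition of "compatible": a modern polygon is kept when a unit of the
same name exists in the thesaurus and the polygon's area is within a
tolerance of the area recorded historically. The names, areas and the 5%
tolerance are invented for illustration, and a real test would need more
evidence than name and area alone.

```python
def polygon_area(ring):
    """Shoelace formula for a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def select_boundaries(modern, historical_area, tolerance=0.05):
    """Split historical units into those keeping a modern boundary and the rest."""
    keep, synthesise = {}, []
    for name, recorded in historical_area.items():
        ring = modern.get(name)
        if ring is not None and abs(polygon_area(ring) - recorded) <= tolerance * recorded:
            keep[name] = ring
        else:
            synthesise.append(name)  # needs an estimated boundary instead
    return keep, synthesise


# Hypothetical modern polygons and historically recorded areas.
MODERN = {
    "Aston": [(0, 0), (4, 0), (4, 3), (0, 3)],
    "Burton": [(4, 0), (10, 0), (10, 3), (4, 3)],
}
HISTORICAL_AREA = {"Aston": 12.2, "Burton": 9.0, "Carlton": 9.0}

keep, synthesise = select_boundaries(MODERN, HISTORICAL_AREA)
```

Units returned in the second list would then be passed to whichever
estimation procedure is in use, with the kept polygons acting as fixed
constraints.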
(9) Lastly, and slightly facetiously, some research projects really need
accurate boundaries, but others do not; the problem is that many
traditional historians want boundary maps that LOOK accurate even when
neither we nor they have any idea where the real boundaries ran -- they
want lines that have a similar degree of complexity to those in the real
world. Could we use fractals to achieve this?
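As an illustration of what (9) might involve, random midpoint displacement
is one common way of turning a straight line into a fractal-looking one.
The sketch below is an assumption about how it could be applied to an
estimated boundary segment; the roughness value and recursion depth are
arbitrary and would need calibrating against real boundaries.

```python
import random


def roughen(points, depth, roughness=0.3, rng=None):
    """Repeatedly displace segment midpoints perpendicular to each segment."""
    rng = rng or random.Random(0)
    for _ in range(depth):
        refined = [points[0]]
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            dx, dy = x2 - x1, y2 - y1
            offset = rng.uniform(-roughness, roughness)
            # Midpoint shifted along the normal (-dy, dx), so the shift
            # scales with the segment length and shrinks at each level.
            mid = ((x1 + x2) / 2 - dy * offset, (y1 + y2) / 2 + dx * offset)
            refined.extend([mid, (x2, y2)])
        points = refined
    return points


# A straight estimated boundary segment, roughened over four levels.
line = roughen([(0.0, 0.0), (10.0, 0.0)], depth=4)
```

The end points stay fixed, so adjoining segments still meet, but nothing
here preserves polygon areas or prevents a heavily roughened line from
crossing itself.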
I am hoping that I will get some responses that point me to existing
research, but most of the literature I know about is concerned with
constructing optimal regions -- e.g. designing electoral districts -- not
approximating real ones. As I said at the beginning, if anyone wants to
explore these issues there is plenty of real-world data they can use. My
guess is that it will take some time to get responses, so please reply if
you receive this some time after it is first sent out. I will re-post
selected replies sent directly to me to the history-gis and hist-bound
lists (which I am the owner of).
Best wishes,
Humphrey Southall
========================================================
Dr. Humphrey Southall,
Reader in Geography,
Department of Geography,
University of Portsmouth,
Buckingham Building,
Lion Terrace,
PORTSMOUTH PO1 3HE, ENGLAND
Direct Line: (023) 92 842500
Dept. Fax: (023) 92 842512
Mobile: (0796) 808 5454
--- End Forwarded Message ---
__________________________________________________________________
Dr David Fletcher
Department of Politics and Modern History
London Guildhall University
Old Castle Street
London E1 7NT
United Kingdom E-mail: [log in to unmask]
Tel/ voicemail: +44 (0)20 7320 1025 Fax: +44 (0)20 7320 1157
__________________________________________________________________