Print

Print


THIS IS VERY MUCH A QUESTION ABOUT MAP LIBRARIANSHIP, NOT GIS, BUT AN ODD ONE ...

Previous mailings to this list have mentioned that the new version of the Vision of Britain site, being developed with JISC funding, will feature something that attempts to be an on-line map library, rather than just a collection of historic maps assembled into map mosaics and presented very much as "GIS" content.  The new system will include many maps that cannot sensibly be assembled into mosaics, but all maps in the collection will be geo-referenced in the sense that the metadata includes a fairly accurate bounding rectangle.

We have a prototype interface working, which allows users to define the area they are interested in via a zoomable map and then returns all maps whose bounding rectangles intersect with the bounding rectangle defined by the user, possibly filtered by date and/or map type.

However, this can return a lot of maps and we want to rank them by relevance, which in this context means how similar is the area covered by the map to the area defined by the user: (i) maps covering only a small part of the user's area, (ii) maps which cover the user's area but much else besides, and (iii) maps covering a similar sized area but with only a small overlap are all less relevant than maps that cover exactly the same area.  The user will not be able to change the shape of "their map", which has to be more or less square, but they can zoom in and out, and pan around.

HAS ANYONE DONE ANY EXPERIMENTS WITH ALGORITHMS THAT TRY TO CAPTURE THIS NOTION OF MAP SIMILARITY?

I am thinking of computing (a) the percentage of the area of the map in the result set that is not within the user's map and (b) the percentage of the area of the user's map that is not in the map in the result set, then multiplying these two numbers together.  The most relevant maps would be those with the lowest values of this number.

However, we have to worry about the time it takes to calculate this number across a lot of maps, and I am not sure well how this algorithm will deal with the three different kinds of poor matches mentioned above.  We would love to share notes with someone who has experimented with this.

Best wishes,

Humphrey Southall

Humphrey Southall,
Reader/Director GB Historical GIS,
Department of Geography,
Buckingham Building,
University of Portsmouth,
PORTSMOUTH PO1 3HE
Historical GIS team: 023 9284 2500
Mobile: 07595 600 331