The problem is that technologists always talk about fitting data into a common set of data fields. This is what most aggregators do, and of course important information is squeezed out in the process.
The challenge in harmonising data, even data from completely different specialisms, is not to fit it into a common set of data fields and/or limit the data that you can represent. It is to find a common set of generalisations about the data. These generalisations allow data to be represented so that completely different datasets can be meaningfully connected.
Once data is harmonised in these terms, specialisms can be surfaced, but you must make the right contextual connections first. This is what the CIDOC CRM does. For example, it is perfectly possible to harmonise a natural history collection with general collections of artificial objects because, although some of the data may be different, the generalisations that the CRM provides bring this data together, and there is a huge amount to learn from doing so.
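To make the point above a little more concrete, here is a minimal sketch in Python. It is not the BM mapping: the records, identifiers and helper functions are invented for illustration, the triples are plain tuples rather than real RDF, and the time-span is simplified to a year string. The only things borrowed from the CRM are a few class and property labels and the assumption that a production is a kind of activity and that "has produced" is a special case of "occurred in the presence of".

```python
# Hypothetical, simplified triples; all identifiers and records are invented.
# A tiny, flattened subset of the CRM hierarchy (specific -> general).
SUPER_CLASS = {"E12_Production": "E7_Activity"}
SUPER_PROPERTY = {"P108_has_produced": "P12_occurred_in_the_presence_of"}


def describe(obj, obj_class, event_class, link, actor, place, year):
    """Describe an object through the event that connects it to its context."""
    event = obj + "/event"
    return [
        (obj, "type", obj_class),
        (event, "type", event_class),
        (event, link, obj),
        (event, "P14_carried_out_by", actor),
        (event, "P7_took_place_at", place),
        (event, "P4_has_time-span", year),  # simplified: a year, not a time-span entity
    ]


def with_generalisations(triples):
    """Keep every specific statement and add its more general form alongside it."""
    inferred = []
    for s, p, o in triples:
        if p == "type" and o in SUPER_CLASS:
            inferred.append((s, p, SUPER_CLASS[o]))
        elif p in SUPER_PROPERTY:
            inferred.append((s, SUPER_PROPERTY[p], o))
    return triples + inferred


def things_present_at(triples, place):
    """Find everything linked, via the general property, to an event at a place."""
    events = {s for s, p, o in triples if p == "P7_took_place_at" and o == place}
    return sorted(o for s, p, o in triples
                  if s in events and p == "P12_occurred_in_the_presence_of")


graph = with_generalisations(
    # A made object, described through its production.
    describe("obj:figure-1", "E22_Man-Made_Object", "E12_Production",
             "P108_has_produced", "actor:carver-1", "place:island-x", "1820")
    # A natural history specimen, described through a collecting activity.
    + describe("spec:plant-9", "E20_Biological_Object", "E7_Activity",
               "P12_occurred_in_the_presence_of", "actor:collector-7",
               "place:island-x", "1769")
)

print(things_present_at(graph, "place:island-x"))
```

Neither record was squeezed into the other's fields, yet one question about a place reaches both. The specific statements (that the first event was a production, for instance) are still in the graph, so the specialism can be surfaced afterwards.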
Once you have contextual data you also start to solve problems of terminology co-referencing and of matching people, periods and places. Terms, people and places may differ from organisation to organisation, but if they are connected to objects, and these objects are described contextually and are harmonised (and can therefore be compared independently), then co-referencing becomes more than just a typographical exercise, and could also allow us to fill in our missing data.
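As a rough illustration of co-referencing by context rather than by spelling, the sketch below compares two organisations' name labels by the places and dates they are connected to. Everything in it is an assumption made for the example: the names, the contexts, the use of Jaccard overlap as a score, and the 0.5 threshold. A real approach would need far richer evidence.

```python
def jaccard(a, b):
    """Share of contexts two labels have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical data: each label maps to the (place, year) contexts of the
# harmonised objects it is connected to in that organisation's records.
org_a = {
    "Smith, Ann": {("place:island-x", "1769"), ("place:bay-y", "1770")},
}
org_b = {
    "A. Smith": {("place:island-x", "1769"), ("place:bay-y", "1770"),
                 ("place:coast-z", "1772")},
    "Ann Smyth": {("place:city-q", "1850")},
}


def candidate_matches(a, b, threshold=0.5):
    """Pair labels whose contexts overlap enough to be worth a human look."""
    pairs = []
    for name_a, ctx_a in a.items():
        for name_b, ctx_b in b.items():
            score = jaccard(ctx_a, ctx_b)
            if score >= threshold:
                pairs.append((name_a, name_b, round(score, 2)))
    return sorted(pairs)


def suggested_additions(ctx_a, ctx_b):
    """Contexts the second source knows about that the first is missing."""
    return sorted(ctx_b - ctx_a)


print(candidate_matches(org_a, org_b))
print(suggested_additions(org_a["Smith, Ann"], org_b["A. Smith"]))
```

Here "Ann Smyth" looks closer typographically to one label but shares no context with it, while "A. Smith" shares two of three contexts and is proposed as a candidate. The unmatched context is then a suggestion for missing data, to be checked by a person rather than copied automatically.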
There are always compromises when bringing data together. The question is how far you are willing to compromise and to what extent you want to limit the purposes for which your data can be used. We would rather keep compromise to a minimum and thereby open up our data to a wider range of activity, from quantitative and qualitative research to supporting initiatives like Europeana, all the way to creating more interesting e-cards. All within a cost-effective and sustainable COPE approach (important in these times).
CIDOC CRM is not complicated - our data is. Technologists tend to bypass the issue of difficult and complex data and homogenise instead. This might be OK for some activities but not for many others.
A detailed manual on how the BM has mapped its data to CRM will be available very soon.
Dominic
www.researchspace.org
@researchspace