Elinor,
I don't know about Microsoft SQL Server or SAS, but MySQL certainly has a
SOUNDEX function. Go to http://dev.mysql.com/doc/refman/5.0/en/index.html and
search for SOUNDEX; the second result explains its use.
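
To illustrate what a Soundex code looks like, here is a minimal Python sketch of the commonly described American Soundex rules (first letter plus three digits). It is an illustration only: the function name soundex and the sample surnames are assumptions made for this example, and MySQL's SOUNDEX() is documented as a slightly different variant that can return strings longer than four characters, so its output may not match this sketch exactly.

```python
# Minimal sketch of standard American Soundex (hypothetical helper,
# not MySQL's implementation).

# Map each consonant to its Soundex digit; vowels, H, W and Y have no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the four-character Soundex code for a surname."""
    # Keep only the ASCII letters A-Z, upper-cased.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W do not separate consonants that share a digit.
            continue
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        # A vowel resets prev, so a repeated digit after it is kept.
        prev = code

    # Pad with zeros and truncate to one letter plus three digits.
    return (result + "000")[:4]


if __name__ == "__main__":
    for surname in ("Smith", "Smyth", "Robert", "Rupert"):
        print(surname, soundex(surname))
```

Surnames that sound alike but are spelt differently, such as Smith and Smyth, receive the same code, so the code can stand in for the surname when joining the two datasets.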
Andrew
==============================
Andrew Fenyo, Senior Computing Officer
Personal Social Services Research Unit
University of Kent, Canterbury, CT2 7NF
Voice: 01227 827610 Fax: 01227 827038
PSSRU: http://www.pssru.ac.uk
-----Original Message-----
From: A UK-based worldwide e-mail broadcast system mailing list
[mailto:[log in to unmask]] On Behalf Of Elinor Curnow
Sent: 13 December 2005 17:01
To: [log in to unmask]
Subject: Matching datasets
Dear all,
I have a database of demographic and lifestyle data, bought in from a
third party, that I am trying to match to a consumer database. I am
currently using postcode, surname, initial and title (and subsets of
these) to match the 2 datasets. I use aggregated data for consumers that
can only be matched by postcode.
I would like to improve the number of 1-1 matches (no duplicates) I can
make between the 2 datasets and also decrease the number of matches made
only at postcode level.
I have full name and address available to me in both datasets, but not
gender or date of birth.
I have thought about Soundex codes for surnames to overcome data quality
issues.
Can anyone suggest a method/useful references that would achieve the
desired improvements, ideally using SAS or SQL?
Many thanks,
Elinor