At 9:48 AM -0400 2/4/15, Tamir Israel wrote:
> ... I'm wondering if anyone is aware of any gold standard (from a civil society perspective) road maps on implementing this kind of thing? This report points to Scotland/Wales as a model to be emulated.
The three references at the bottom are the bleeding obvious, but they haven't been mentioned in this thread yet, and perhaps they should be. We definitely need standards for health care data, but the problem is more general than health care.
In a recent series on 'Risk Management for Big Data Projects' presented to the 8 Branches of the Australian Computer Society, I used this slide:
Anonymisation to achieve Non-Reidentifiability
- Omission of specific rows and columns
- Suppression or Generalisation of particular values and value-ranges
- Data Falsification / 'Data Perturbation'
  - micro-aggregation
  - swapping
  - adding noise
  - randomisation
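The first two items on the slide (omission, and suppression or generalisation) can be sketched in a few lines. This is a minimal illustration only, not taken from the slide or from the references below: the record layout, the field names (name, age, postcode, condition), the 10-year age bands and the minimum-count threshold are all assumptions made for the example. Steps like these on their own do not make a rich data-set non-reidentifiable.

```python
from collections import Counter

# Hypothetical records; every field name and value here is an assumption.
records = [
    {"name": "A. Smith", "age": 34, "postcode": "2611", "condition": "asthma"},
    {"name": "B. Jones", "age": 37, "postcode": "2614", "condition": "asthma"},
    {"name": "C. Wu", "age": 52, "postcode": "2600", "condition": "haemophilia"},
]


def omit_columns(rows, columns):
    """Omission: drop the named columns from every row."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


def generalise_age(age, width=10):
    """Generalisation: replace an exact age with a value-range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def suppress_rare(rows, field, min_count=2, placeholder="*"):
    """Suppression: blank out values that occur fewer than min_count times."""
    counts = Counter(row[field] for row in rows)
    return [
        {**row, field: row[field] if counts[row[field]] >= min_count else placeholder}
        for row in rows
    ]


def anonymise(rows):
    """Apply omission, then generalisation, then suppression."""
    rows = omit_columns(rows, {"name"})
    rows = [
        {**r, "age": generalise_age(r["age"]), "postcode": r["postcode"][:2] + "**"}
        for r in rows
    ]
    return suppress_rare(rows, "condition")


if __name__ == "__main__":
    for row in anonymise(records):
        print(row)
```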
I deprecate the term 'data perturbation' because it's a harmful euphemism. Rich data-sets are re-identifiable, full stop. The data has to be falsified, in order to make it recognisably unusable for administrative purposes.
Use of the honest term 'falsification' also brings focus to bear on statistical techniques whereby the data-set as a whole can retain value for defined (analytical / population-oriented) purposes.
Such techniques already exist and more can be developed. (My mathematical statistics isn't good enough for me to understand them, but much the same applies to cryptography, and yet we use and recommend cryptographic techniques, don't we?)
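As a rough illustration of how falsified data can still retain value for population-level analysis, here is a minimal sketch of three of the techniques named on the slide: swapping, adding noise and micro-aggregation. It is not drawn from the references below; the function names, the "income" field and the sample values are assumptions made for the example. Each function changes individual records while leaving the overall mean of the field intact, exactly so for swapping and up to floating-point rounding for the other two.

```python
import random
import statistics


def swap_values(rows, field, rng):
    """Swapping: shuffle one field's values across records.

    The set of values is unchanged, so the field's mean and
    distribution survive, but values no longer belong to their rows.
    """
    values = [row[field] for row in rows]
    rng.shuffle(values)
    return [{**row, field: v} for row, v in zip(rows, values)]


def add_noise(rows, field, scale, rng):
    """Adding noise: perturb each value with Gaussian noise.

    The noise is re-centred so that it sums to zero, which keeps
    the field's mean the same (up to floating-point rounding).
    """
    noise = [rng.gauss(0, scale) for _ in rows]
    centre = statistics.fmean(noise)
    return [{**row, field: row[field] + n - centre} for row, n in zip(rows, noise)]


def micro_aggregate(rows, field, k):
    """Micro-aggregation: replace each value with the mean of its group.

    Records are sorted by the field and grouped k at a time; the
    last group may hold fewer than k records in this simple sketch.
    """
    order = sorted(range(len(rows)), key=lambda i: rows[i][field])
    out = [dict(row) for row in rows]
    for start in range(0, len(order), k):
        group = order[start:start + k]
        mean = statistics.fmean(rows[i][field] for i in group)
        for i in group:
            out[i][field] = mean
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Hypothetical sample data.
    sample = [
        {"id": 1, "income": 40.0},
        {"id": 2, "income": 10.0},
        {"id": 3, "income": 30.0},
        {"id": 4, "income": 20.0},
    ]
    for falsified in (
        swap_values(sample, "income", rng),
        add_noise(sample, "income", 5.0, rng),
        micro_aggregate(sample, "income", 2),
    ):
        print(statistics.fmean(r["income"] for r in falsified), falsified)
```

Preserving the mean is only the simplest case; which statistics need to survive depends on the analytical purpose that has been defined for the data-set.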
1. UKICO (2012) 'Anonymisation: managing data protection risk: code of practice' Information Commissioners Office, November 2012, at http://ico.org.uk/for_organisations/data_protection/topic_guides/~/media/documents/library/Data_Protection/Practical_application/anonymisation-codev2.pdf
in particular Annexes 1-3, incl:
Yang M., Sassone V. & O'Hara K. (2012) 'Practical examples of some anonymisation techniques' Annex 3, pp. 80-103
2. Slee T. (2011) 'Data Anonymization and Re-identification: Some Basics Of Data Privacy: Why Personally Identifiable Information is irrelevant' Whimsley, September 2011, at http://tomslee.net/2011/09/data-anonymization-and-re-identification-some-basics-of-data-privacy.html
3. (Less good, but Americans like to have American references):
DHHS (2012) 'Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule' Department of Health & Human Services, November 2012, at http://www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-identification/guidance.html
--
Roger Clarke http://www.rogerclarke.com/
Xamax Consultancy Pty Ltd 78 Sidaway St, Chapman ACT 2611 AUSTRALIA
Tel: +61 2 6288 6916 http://about.me/roger.clarke
http://www.xamax.com.au/
Visiting Professor in the Faculty of Law University of N.S.W.
Visiting Professor in Computer Science Australian National University
****************************************************
This is a message from the SURVEILLANCE listserv
for research and teaching in surveillance studies.
****************************************************