The British Standards Institution has produced a useful guide to the use of personal data in system testing, endorsed by TICO.

Nic

-----Original Message-----
From: Graham Hadfield [mailto:[log in to unmask]]
Sent: 13 July 2004 13:17
To: [log in to unmask]
Subject: Re: Data Matching

I've copied below a short paper I did recently on live data and testing which I hope will be of some use. I often say yes, but as far as using live data for testing goes I base my "No" on sound principles.

Regards,
Graham

Reasons not to use live data for testing.

General Testing Values.

Testing, by definition, must use specially constructed and documented data which exercises each condition within the system specification and which allows the outputs to be verified against results which can be reliably predicted because the inputs are known. The inputs contained within live data are not known, which means that the outputs cannot be predicted. It follows, therefore, that live data are not suitable for use for testing purposes.

"Anonymising" of live data.

It is sometimes suggested that a valid test dataset can be produced by removing from a live dataset all field values which would identify individuals. There are two fundamental problems with such a suggestion:

- the contents of the resulting dataset would not be known in detail, which means that the expected outputs cannot be predicted and, therefore, the dataset is not suitable for use for testing purposes (see above);
- there is no way of proving that the individual records making up the dataset have been fully anonymised without each record being inspected, which means that data may be exposed to staff in breach of the seventh Data Protection Principle (see below).

Parallel Running.

It is important to understand the difference between testing and parallel running. Testing is used to ensure that the code within the programs making up the system is correct and that the different parts of the system interact correctly (e.g.
that production running will not be adversely affected by badly designed database access routines; that totals carried forward from one run are correctly brought forward by the next one). Parallel running, on the other hand, is used to verify data integrity. Parallel running emphatically does not include repeated runs over a number of months in order to test code.

The only situation in which the use of a copy of a live database is valid, therefore, is as the final "parallel run" stage of implementation of a system amendment such as a change to the file/database structure (or a replacement of one system by another) where (say) a monthly payroll is rerun to ensure that the structure change has not resulted in data corruption and that the outputs are the same as from the actual live run which took place before the structure change. Any copies of live data used in such a parallel run must be ring-fenced so that they can be destroyed appropriately rather than becoming mixed up with the real live data.

Prior to any such parallel run taking place, the conversion (and any other) program(s) should have been tested as for any other code, i.e. using specially constructed and documented test data which tests each condition within the specification and which allows the outputs to be verified against results which can be reliably predicted because the inputs are known.

The Significance of the Data Protection Principles.

Data Protection Principle 1.

Personal data shall be processed fairly and lawfully and, in particular, shall not be processed unless:

- at least one of the conditions in Schedule 2 of the DPA is met, and
- in the case of sensitive personal data, at least one of the conditions in Schedule 3 of the DPA is also met.

The only conditions in Schedules 2 and 3 which could allow usage of personal data for testing are those which rely on the consent of the data subject.
As the consent of the data subject for this usage has not been sought in relation to any personal data held by the Council, to use personal data for testing would be a breach of Principle 1.

Data Protection Principle 2.

Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.

Usage for testing has not been specified as a purpose in any case where personal data has been obtained by the Council. Such usage would, therefore, be a breach of Principle 2.

Data Protection Principle 4.

Personal data shall be accurate and, where necessary, kept up to date.

If personal data were to be used for testing there would be a significant chance that inaccurate results could be obtained from the processing and that those results could be mixed up with real results from live running, which would be a breach of Principle 4.

Data Protection Principle 7.

Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.

Usage of personal data for testing would expose those data to access by staff who have no legitimate right to see them because their job would not involve doing so were it not for the testing. As in relation to Principle 4, there is a danger that inaccurate test results could be mixed up with real results from live running and that they could be used as if real.
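
As a rough illustration of the point made under General Testing Values above, the sketch below shows what "specially constructed and documented" test data might look like in practice. Everything in it is an assumption made for the example: the pay rule, the threshold, the function net_pay and the case identifiers are all hypothetical and do not come from any real system. Each case documents the condition it exercises and carries an expected output worked out by hand before the run, which is exactly what cannot be done with live data.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical specification, invented purely for illustration:
#   - gross pay up to and including 1000.00 is taxed at 10%
#   - gross pay above 1000.00 is taxed at 20% on the whole amount
#   - negative gross pay is rejected
THRESHOLD = Decimal("1000.00")


def net_pay(gross: Decimal) -> Decimal:
    """Hypothetical routine under test."""
    if gross < 0:
        raise ValueError("gross pay cannot be negative")
    rate = Decimal("0.10") if gross <= THRESHOLD else Decimal("0.20")
    return (gross - gross * rate).quantize(Decimal("0.01"))


@dataclass(frozen=True)
class ConstructedCase:
    case_id: str           # reference used in the test documentation
    condition: str         # which condition of the specification it exercises
    gross: Decimal         # known input
    expected_net: Decimal  # output predicted by hand before the run


# One documented case per condition in the (hypothetical) specification.
CONSTRUCTED_CASES = [
    ConstructedCase("T1", "zero pay", Decimal("0.00"), Decimal("0.00")),
    ConstructedCase("T2", "below threshold", Decimal("500.00"), Decimal("450.00")),
    ConstructedCase("T3", "exactly on threshold", Decimal("1000.00"), Decimal("900.00")),
    ConstructedCase("T4", "just above threshold", Decimal("1000.01"), Decimal("800.01")),
]


def run_constructed_tests(cases):
    """Return the ids of cases whose actual output differs from the prediction."""
    return [c.case_id for c in cases if net_pay(c.gross) != c.expected_net]


if __name__ == "__main__":
    failures = run_constructed_tests(CONSTRUCTED_CASES)
    print("failed cases:", failures)
```

Because every input is invented, the dataset contains no personal data at all, and a failure points directly at a named condition in the specification rather than at an unknown record.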
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All archives of messages are stored permanently and are available to the world wide web community at large at http://www.jiscmail.ac.uk/lists/data-protection.html
If you wish to leave this list please send the command leave data-protection to [log in to unmask]
All user commands can be found at : - http://www.jiscmail.ac.uk/help/commandref.htm (all commands go to [log in to unmask] not the list please)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^