The British Standards Institution has produced a useful guide to the use
of personal data in system testing, endorsed by TICO.

Nic

-----Original Message-----
From: Graham Hadfield [mailto:[log in to unmask]] 
Sent: 13 July 2004 13:17
To: [log in to unmask]
Subject: Re: Data Matching

I've copied below a short paper I did recently on live data and testing,
which I hope will be of some use. I often say yes, but as far as using
live data for testing goes, I base my "No" on sound principles.

Regards,
Graham


Reasons not to use live data for testing.


General Testing Values.


Testing, by definition, must use specially constructed and documented
data which looks at each condition within the system specification and
which allows the outputs to be verified against results which can be
reliably predicted because the inputs are known. The inputs contained
within live data are not known, which means that the outputs cannot be
predicted. It follows, therefore, that live data are not suitable for
use for testing purposes.
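
As a rough illustration of what "specially constructed and documented"
test data can look like, here is a minimal sketch in Python. The
overtime rule, the function name gross_pay and the figures are all
invented for the example and are not taken from any real system; the
point is only that each record exists to exercise one condition and
carries an expected output worked out by hand beforehand.

```python
from decimal import Decimal

# Hypothetical specification, invented for illustration only:
# hours up to 40 are paid at the basic rate, hours above 40 at 1.5x.
THRESHOLD = Decimal("40")
OVERTIME_FACTOR = Decimal("1.5")


def gross_pay(hours: Decimal, rate: Decimal) -> Decimal:
    """Code under test (a stand-in for a real program)."""
    basic_hours = min(hours, THRESHOLD)
    overtime_hours = max(hours - THRESHOLD, Decimal("0"))
    return basic_hours * rate + overtime_hours * rate * OVERTIME_FACTOR


# Constructed test data: one documented record per condition in the
# specification, each with an expected result predicted from the inputs.
TEST_CASES = [
    # (condition tested,       hours,         rate,          expected)
    ("zero hours",             Decimal("0"),  Decimal("10"), Decimal("0")),
    ("below threshold",        Decimal("30"), Decimal("10"), Decimal("300")),
    ("exactly at threshold",   Decimal("40"), Decimal("10"), Decimal("400")),
    ("above threshold",        Decimal("45"), Decimal("10"), Decimal("475")),
]


def run_tests():
    """Return a list of (condition, expected, actual) for every failure."""
    failures = []
    for condition, hours, rate, expected in TEST_CASES:
        actual = gross_pay(hours, rate)
        if actual != expected:
            failures.append((condition, expected, actual))
    return failures


if __name__ == "__main__":
    print(run_tests() or "all conditions verified")
```

With live data none of the "expected" values could be filled in in
advance, which is the difficulty described above.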


"Anonymising" of live data.


It is sometimes suggested that a valid test dataset can be produced by
removing from a live dataset all field values which would identify
individuals. There are two fundamental problems with such a suggestion:
   the contents of the resulting dataset would not be known in detail,
   which means that the expected outputs cannot be predicted and,
   therefore, the dataset is not suitable for use for testing purposes
   (see above);
   there is no way of proving that the individual records making up the
   dataset have been fully anonymised without each record being
   inspected, which means that data may be exposed to staff in breach
   of the seventh Data Protection Principle (see below).
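
The second problem can be sketched in Python as follows. The record,
the field names and the set IDENTIFYING_FIELDS are made up for
illustration (no real data is involved); the sketch assumes a simple
"drop the identifying fields" approach and shows how identifying
details can survive in a field nobody thought to list.

```python
# Hypothetical list of fields believed to identify an individual.
IDENTIFYING_FIELDS = {"name", "address", "reference_number"}


def strip_identifying_fields(record: dict) -> dict:
    """Naive anonymisation: drop the fields named in IDENTIFYING_FIELDS."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


# Invented record, not taken from any real dataset.
record = {
    "name": "A. N. Other",
    "address": "1 Example Street",
    "reference_number": "X000001",
    "band": "C",
    "notes": "A. N. Other phoned about the arrears at 1 Example Street",
}

stripped = strip_identifying_fields(record)

# The listed fields are gone, but the free-text field still identifies
# the individual. Only inspecting the record reveals this.
leaked = [v for v in (record["name"], record["address"])
          if v in stripped["notes"]]

if __name__ == "__main__":
    print(sorted(stripped))
    print(leaked)
```

A check like the one that builds "leaked" only works here because the
identifying values are already known to the example; on a real dataset
that knowledge comes from looking at the records.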


Parallel Running.


It is important to understand the difference between testing and
parallel running. Testing is used to ensure that the code within the
programs making up the system is correct and that the different parts
of the system interact correctly (e.g. that production running will not
be adversely affected by badly designed database access routines; that
totals carried forward from one run are correctly brought forward by
the next one). Parallel running, on the other hand, is used to verify
data integrity. Parallel running emphatically does not include repeated
runs over a number of months in order to test code.


The only situation in which the use of a copy of a live database is
valid, therefore, is as the final "parallel run" stage of implementation
of a system amendment such as a change to the file/database structure
(or a replacement of one system by another) where (say) a monthly
payroll is rerun to ensure that the structure change has not resulted
in data corruption and that the outputs are the same as from the actual
live run which took place before the structure change. Any copies of
live data used in such a parallel run must be ring-fenced so that they
can be destroyed appropriately rather than becoming mixed up with the
real live data.
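
The comparison at the heart of such a parallel run might be sketched
like this in Python. The function name compare_runs, the record keys
and the values are hypothetical; the sketch assumes the outputs of both
runs can be loaded as mappings from a record key to an output value,
and it says nothing about how the ring-fenced copy is held or
destroyed.

```python
def compare_runs(live_output: dict, parallel_output: dict) -> dict:
    """Compare the outputs of the actual live run with the parallel rerun.

    Returns the record keys that are missing from the rerun, that appear
    only in the rerun, or whose output values differ between the two.
    """
    live_keys = set(live_output)
    parallel_keys = set(parallel_output)
    return {
        "missing": sorted(live_keys - parallel_keys),
        "unexpected": sorted(parallel_keys - live_keys),
        "different": sorted(
            key for key in live_keys & parallel_keys
            if live_output[key] != parallel_output[key]
        ),
    }


# Made-up outputs keyed by an invented record reference.
live = {"REC001": "1250.00", "REC002": "980.50", "REC003": "1100.00"}
rerun = {"REC001": "1250.00", "REC002": "980.05", "REC004": "700.00"}

report = compare_runs(live, rerun)

if __name__ == "__main__":
    for kind, keys in report.items():
        print(kind, keys)
```

Any non-empty list in the report would point to data corruption or an
output difference introduced by the structure change.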


Prior to any such parallel run taking place the conversion (and any
other) program(s) should have been tested as for any other code - i.e.
using specially constructed and documented test data which tests each
condition within the specification and which allows the outputs to be
verified against results which can be reliably predicted because the
inputs are known.


The Significance of the Data Protection Principles.


Data Protection Principle 1. Personal data shall be processed fairly and
lawfully and, in particular, shall not be processed unless:
   at least one of the conditions in Schedule 2 of the DPA is met, and
   in the case of sensitive personal data, at least one of the
   conditions in Schedule 3 of the DPA is also met.
The only conditions in Schedules 2 and 3 which could allow usage of
personal data for testing are those which rely on the consent of the
data subject. As the consent of the data subject for this usage has not
been sought in relation to any personal data held by the Council then
to use personal data for testing would be a breach of Principle 1.


Data Protection Principle 2. Personal data shall be obtained only for
one or more specified and lawful purposes, and shall not be further
processed in any manner incompatible with that purpose or those
purposes.
Usage for testing has not been specified as a purpose in any case where
personal data has been obtained by the Council. Such usage would,
therefore, be a breach of Principle 2.


Data Protection Principle 4.  Personal data shall be accurate and, where
necessary, kept up to date.
If personal data were to be used for testing there would be a
significant chance that inaccurate results could be obtained from the
processing and that those results could be mixed up with real results
from live running, which would be a breach of Principle 4.


Data Protection Principle 7.  Appropriate technical and organisational
measures shall be taken against unauthorised or unlawful processing of
personal data and against accidental loss or destruction of, or damage
to, personal data.
Usage of personal data for testing would expose those data to access by
staff who have no legitimate right to see them because their job would
not involve doing so were it not for the testing. As in relation to
Principle 4, there is a danger that inaccurate test results could be
mixed up with real results from live running and that they could be
used as if real.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       All archives of messages are stored permanently and are
      available to the world wide web community at large at
      http://www.jiscmail.ac.uk/lists/data-protection.html
      If you wish to leave this list please send the command
       leave data-protection to [log in to unmask]
            All user commands can be found at : -
        http://www.jiscmail.ac.uk/help/commandref.htm
  (all commands go to [log in to unmask] not the list please)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
