ALLSTAT Archives

allstat@JISCMAIL.AC.UK


ALLSTAT  July 2009

Subject: Re: Typing error rates..... Summary of replies........
From: "Szydlo, Richard M" <[log in to unmask]>
Reply-To: Szydlo, Richard M
Date: Tue, 14 Jul 2009 16:53:26 +0100


 
Dear All,

Thanks very much to all who replied to my query. I have had numerous helpful suggestions, and also requests for further information. I have to confess to being somewhat naive in thinking that what we had done was 'straightforward'!!

Our audit consisted of 2 data managers looking through 5 patients' notes each, and filling out a questionnaire for each patient. They then checked the questionnaire data against that which was previously collected and entered into a database. So, I inadvertently misled everyone into thinking we had conducted an audit of 'typing-in' data! The reality is that we conducted an audit that combined data collecting accuracy with data entry accuracy - we found 3 'major' diagnostic errors, 4 'minor' errors, and 16 typos or inconsequential date / time errors (23 errors in all). As one responder pointed out, it is the 'clinical relevance' of the errors that is important. On that basis, we had 3/504 serious errors (0.6%). The point of the audit was to set a benchmark (against external sources if possible), and also to act as an internal control for the future. In fact the audit was very informative, for although we did "achieve" an overall error-rate of 4.6% (and hence have considerable room for improvement next time), it did highlight problems of 'trusting' clinicians' summaries and the need for using primary sources of information.
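
As a rough check on the figures above, here is a minimal Python sketch that recomputes the serious and overall error rates from the counts given (3 major, 4 minor and 16 trivial errors, against a denominator of 504). The 95% Wilson score interval is an assumption added for illustration only; the audit itself reported no interval, and the function names are hypothetical.

```python
import math


def error_rate(errors, items):
    """Proportion of checked items that were in error."""
    return errors / items


def wilson_interval(errors, items, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Assumption: items are treated as independent trials, which is
    unlikely to hold exactly for fields taken from the same patient.
    """
    p = errors / items
    denom = 1 + z ** 2 / items
    centre = (p + z ** 2 / (2 * items)) / denom
    half = z * math.sqrt(p * (1 - p) / items + z ** 2 / (4 * items ** 2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    items = 504                  # denominator quoted in the message
    serious = 3                  # 'major' diagnostic errors
    all_errors = 3 + 4 + 16      # major + minor + typos/date-time errors

    for label, count in (("serious", serious), ("overall", all_errors)):
        low, high = wilson_interval(count, items)
        print(f"{label}: {error_rate(count, items):.1%} "
              f"(approx. 95% interval {low:.1%} to {high:.1%})")
```

With only 504 items the interval around the serious-error rate is wide relative to the estimate, which is worth bearing in mind when comparing against external benchmarks.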

Thanks once again - I've learnt a lot!!

Richard


Below are a couple of good papers and other contributions to the discussion...

The recent study below goes into good detail regarding the different sorts of errors / accuracy etc. and provides a 'best' error-rate of 2.3%, "a number more consistent with previous literature reports."

J Am Board Fam Med. 2007 Mar-Apr;20(2):151-9.

The "Measuring Outcomes of Clinical Connectivity" (MOCC) trial: investigating data entry errors in the Electronic Primary Care Research Network (ePCRN).
Fontaine P, Mendenhall TJ, Peterson K, Speedie SM.
Department of Family Medicine and Community Health, University of Minnesota Medical School, 925 Delaware Street Southeast, Minneapolis

INTRODUCTION: The electronic Primary Care Research Network (ePCRN) enrolled PBRN researchers in a feasibility trial to test the functionality of the network's electronic architecture and investigate error rates associated with two data entry strategies used in clinical trials. METHODS: PBRN physicians and research assistants who registered with the ePCRN were eligible to participate. After online consent and randomization, participants viewed simulated patient records, presented as either abstracted data (short form) or progress notes (long form). Participants transcribed 50 data elements onto electronic case report forms (CRFs) without integrated field restrictions. Data errors were analyzed. RESULTS: Ten geographically dispersed PBRNs enrolled 100 members and completed the study in less than 7 weeks. The estimated overall error rate if field restrictions had been applied was 2.3%. Participants entering data from the short form had a higher rate of correctly entered data fields (94.5% vs 90.8%, P = .004) and significantly more error-free records (P = .003). CONCLUSIONS: Feasibility outcomes integral to completion of an Internet-based, multisite study were successfully achieved. Further development of programmable electronic safeguards is indicated. The error analysis conducted in this study will aid design of specific field restrictions for electronic CRFs, an important component of clinical trial management systems.


Whilst for 'real' typing errors the study below sets a good benchmark:

Is double data entry necessary? The CHART trials. D Gibson et al, MRC, Cambridge, England.
Abstract - There is some controversy over the need for double data entry in clinical trials. In particular, does the number and types of errors identified with this approach justify the extra effort involved? We report the results of a study carried out to address this question. Our main outcome measure was the frequency and types of errors involved in the entry of data for the CHART (continuous, hyperfractionated, accelerated radiotherapy) trials. Data were reentered for a sample of 44 patients by a data manager other than the one making the initial entry. The second entry was then compared with the first entry. The error rate for the two entries combined was 14 per 10,000 data items (fields) (95% confidence interval 10, 19). The error rate for the initial entry alone was 15 per 10,000 fields (95% confidence interval 9.5, 22), and the vital/important error rate (defined as any error on a principal outcome measure or a major error on any other endpoint or variable) was 2.5 per 10,000 fields (95% confidence interval 0.68, 6.4). On this evidence double data entry is not performed for the CHART trials.


Some other contributions....

You might find some useful information in the subject area of genetic linkage mapping. I did some work on the impact of errors on linkage maps a few years ago, and plenty of other people have worked on this too. As far as I remember, a good starting point is KH Buetow 1991 Influence of aberrant observations on high-resolution linkage analysis outcomes, Am J Hum Genet 49: 985-994.

 
A thought - have you considered the possibility of monitoring the error rate at regular intervals and plotting the data on a Shewhart control chart? Surely the important thing is to improve rather than compare your performance with that of others?
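
One way the control-chart suggestion could be sketched is a Shewhart p-chart, where each audit's error proportion is compared with 3-sigma limits around the pooled proportion. In the Python sketch below, only the first audit (23 errors in 504 items) comes from this thread; the other four audits are invented purely for illustration, and the function name is hypothetical.

```python
import math


def p_chart_limits(errors, sizes):
    """Return the pooled proportion and, per audit, a tuple of
    (proportion, lower limit, upper limit, within limits?).

    Limits are the usual p-chart limits:
    p_bar +/- 3 * sqrt(p_bar * (1 - p_bar) / n), clipped to [0, 1].
    """
    p_bar = sum(errors) / sum(sizes)
    rows = []
    for e, n in zip(errors, sizes):
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        lower = max(0.0, p_bar - 3 * sigma)
        upper = min(1.0, p_bar + 3 * sigma)
        p = e / n
        rows.append((p, lower, upper, lower <= p <= upper))
    return p_bar, rows


if __name__ == "__main__":
    # First audit is from the message; audits 2-5 are HYPOTHETICAL data.
    errors = [23, 18, 20, 41, 15]
    sizes = [504, 480, 510, 495, 500]

    p_bar, rows = p_chart_limits(errors, sizes)
    print(f"centre line: {p_bar:.2%}")
    for i, (p, lower, upper, ok) in enumerate(rows, start=1):
        flag = "" if ok else "  <-- outside limits, investigate"
        print(f"audit {i}: {p:.2%} (limits {lower:.2%} to {upper:.2%}){flag}")
```

A point outside the limits would suggest something changed in that audit period (new staff, a new form, a difficult batch of notes) rather than ordinary variation.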


I guess that your problem was due to the use of the keyword "typing error" (if you did that), which would lead to many articles in genetics. I instead searched the available titles in Scopus (www.scopus.com) for the words "data entry error" and ended up with the following references. I suppose you will get similar results searching on other engines.

Fontaine, P., Mendenhall, T.J., Peterson, K., Speedie, S.M. Erratum: The 'Measuring Outcomes of Clinical Connectivity' (MOCC) trial: Investigating data entry errors in the Electronic Primary Care Research Network (ePCRN) (2007) Journal of the American Board of Family Medicine, 20 (4), p. 426.

Kaneko, H., Fujiwara, E. A Class of M-Ary Asymmetric Symbol Error Correcting Codes for Data Entry Devices (2004) IEEE Transactions on Computers, 53 (2), pp. 159-167.

Kawado, M., Hinotsu, S., Matsuyama, Y., Yamaguchi, T., Hashimoto, S., Ohashi, Y. A comparison of error detection rates between the reading aloud method and the double data entry method (2003) Controlled Clinical Trials, 24 (5), pp. 560-569.

Kohler, H.-P., Rodgers, J.L. DF-analyses of heritability with double-entry twin data: Asymptotic standard errors and efficient estimation (2001) Behavior Genetics, 31 (2), pp. 179-191.

PDM lets equipment maker say good-bye to data-entry errors (2001) Machine Design, 73 (10), p. 38.


"Sorry, I don't have a reference, but as far as I know 5% is pretty standard."


If you type 'keystroke error rates' or 'data entry error rate' into Google, you'll find a fair bit.

One important thing to note is that, for fairly obvious reasons, what is normally measured is 'keystroke error' rate - whereas it sounds as if you may be talking about error rate in terms of entry fields (most of which probably consist of several characters/keystrokes).

Single-entry keystroke error-rates are usually in the range 2% - 5%, depending on the skill of the operators, the nature/quality/clarity of the data and to some extent the nature/quality of the user interface. If you are talking about keystroke error rate, then your 4.6% is therefore just about within the 'common range' - but if you are talking in terms of entry field errors, your keystroke error rate is probably considerably less than 4.6%, and hence in the lower part of the common range.

Reconciled double-entry error rates are often in the range 0.02% - 0.04% in terms of keystrokes, which usually equates to around 0.1% - 0.2% in terms of field errors (assuming an average of 5 keystrokes per field, with only one keystroke error per field). Those figures are obviously in roughly the right ball-park, given 2-5% error for single entry, but one needs to remember that first- and second-entry errors are obviously not going to be totally independent (in fact they may be highly correlated in some cases of unclear/ambiguous source data).
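
The conversion between keystroke and field error rates described above can be sketched in a few lines of Python. This assumes, as an idealisation, that keystroke errors are independent and that every field has exactly 5 keystrokes; the function names are hypothetical. For small rates the result is close to the "multiply by 5" rule of thumb used in the text.

```python
def field_error_rate(keystroke_rate, keys_per_field=5):
    """Chance that a field contains at least one keystroke error,
    assuming independent keystrokes (an idealisation)."""
    return 1 - (1 - keystroke_rate) ** keys_per_field


def keystroke_error_rate(field_rate, keys_per_field=5):
    """Inverse of field_error_rate under the same assumption."""
    return 1 - (1 - field_rate) ** (1 / keys_per_field)


def both_entries_wrong(single_entry_rate):
    """Chance that two independent entries both err on the same keystroke.

    Only a crude guide: it ignores that both operators would also have to
    type the SAME wrong value, and real entries are not independent.
    """
    return single_entry_rate ** 2


if __name__ == "__main__":
    for k in (0.0002, 0.0004):   # 0.02% and 0.04% per keystroke
        print(f"keystroke {k:.2%} -> field {field_error_rate(k):.2%}")

    # If the audit's 4.6% were a field-level rate:
    print(f"field 4.6% -> keystroke {keystroke_error_rate(0.046):.2%}")

    for p in (0.02, 0.05):       # single-entry range quoted above
        print(f"single entry {p:.0%} -> both wrong {both_entries_wrong(p):.2%}")
```

Under these assumptions a 4.6% field-level rate corresponds to a keystroke rate a little under 1%, which fits the remark that the keystroke rate would be considerably less than 4.6%.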

 

We tend to use a <1% level for acceptability; however, this is a cell-wide error rate rather than a keystroke rate (so saying someone is 22 rather than 11 is one error, not two). Our usual approach is to do a sample of double data entry and re-enter all data if the error rate is above 1%. However, sometimes it can be clearly identified that the errors are caused by one variable; in that case we would usually only resolve the problem with that variable. Afraid that I don't have any references, just thought I would share what we do!
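
A minimal Python sketch of this kind of check might look like the following. It compares two entries of the same sample cell by cell, reports the cell-wide discrepancy rate against a 1% threshold, and counts discrepancies per variable so that a single problem variable stands out. The record layout, field names and function name are all hypothetical, and a discrepancy only shows that the two entries differ; deciding which one is wrong still needs the source document.

```python
from collections import Counter


def compare_entries(first, second, threshold=0.01):
    """Compare two data-entry passes over the same records.

    first, second: lists of dicts with the same keys, in the same order.
    Returns (cell-wide discrepancy rate, per-variable counts, over threshold?).
    A whole cell counts once however many characters differ.
    """
    if len(first) != len(second):
        raise ValueError("both entries must cover the same records")

    cells = 0
    per_variable = Counter()
    for rec_a, rec_b in zip(first, second):
        for name, value in rec_a.items():
            cells += 1
            if value != rec_b.get(name):
                per_variable[name] += 1

    rate = sum(per_variable.values()) / cells if cells else 0.0
    return rate, per_variable, rate > threshold


if __name__ == "__main__":
    # HYPOTHETICAL sample: two records entered twice.
    entry_1 = [{"age": "11", "sex": "F"}, {"age": "34", "sex": "M"}]
    entry_2 = [{"age": "22", "sex": "F"}, {"age": "34", "sex": "M"}]

    rate, per_variable, over = compare_entries(entry_1, entry_2)
    print(f"cell-wide discrepancy rate: {rate:.1%}, re-enter all: {over}")
    print("by variable:", dict(per_variable))
```

If nearly all discrepancies fall under one variable, that would point to fixing that variable alone rather than re-entering everything.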



I just checked with the IS unit of the University where I used to work (Aga Khan University, Karachi) and there the error rate acceptable for analysis is 0.03%. This is after data cleaning. Maybe it is of help.
 


I don't know of any references but I work in randomised controlled trials in Africa and we regard an error rate of less than 0.5% of data values as acceptable. So, based on this, your entry clerks would need re-training and/or reassignment.
 
However, sometimes if there are lots of string variables or a data collection form is really complicated, error rates can end up being artificially high because caps lock wasn't on or something.
 


I'm not sure how much sense it makes to seek a single standard for data entry. Your figure would be appalling in a life-critical situation but compares very favourably with me typing in my PC password. You need to consider the importance of the task and the incentives.

If as I suspect you want to encourage your data managers to do a better job then I'd suggest this is a motivational issue not a statistical one. Try to involve them in the task, illustrate its importance, thank them when they do a good job. I have tried telling people that they are substandard, but found that on its own it is not a very constructive approach!



Typists and secretaries used to take RSA qualifications that specified maximum acceptable error rates for copy typing. I can't find this now on the RSA website but try your library - or ask a secretary if such now exist outside VC's offices.
