I don't know what the "industry standard" is, but I deal with large questionnaire datasets and I would not find this level acceptable - I want my data to be accurate! Obviously I don't know what data you're dealing with, but if you know that 5% is inaccurate then you can locate the discrepancies, and checking those against the paper copy to update only the inaccurate entries shouldn't take 2-3 weeks, should it? Furthermore, if it is a systematic or consistent error (e.g. failure to reverse a coding) then you can do a find and replace or write a macro to speed things up.
I recently had to recalculate a sub-scale as I realised I had done it wrong (a 2-item subscale, I accidentally made it a mean of A+C when it should've been A+B), and it changed the significance of the regressions I was running - so I would err on the side of caution, myself! There is already enough error in the data that we can't detect, so I wouldn't let something correctable go. Especially as I've done some reading on random vs non-random missing data and that put the fear into me about screwing up the basis for my stats.
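To make "locate the discrepancies" and "reverse a coding" concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the participant IDs, the item names `q1` to `q3`, the responses, and the 1-5 response scale are assumptions, not anything from the actual dataset. It only lists the cells where two entry passes disagree, so that just those need checking against the paper copy.

```python
# Hypothetical data: each entry pass maps participant ID -> {item: response}.
first_pass = {
    101: {"q1": 4, "q2": 2, "q3": 5},
    102: {"q1": 1, "q2": 3, "q3": 3},
    103: {"q1": 5, "q2": 5, "q3": 2},
}
second_pass = {
    101: {"q1": 4, "q2": 2, "q3": 5},
    102: {"q1": 1, "q2": 8, "q3": 3},  # q2 disagrees with the first pass
    103: {"q1": 5, "q2": 5, "q3": 1},  # q3 disagrees with the first pass
}


def find_discrepancies(a, b):
    """List (participant, item, value_in_a, value_in_b) wherever the passes differ."""
    diffs = []
    # Only compare participants and items present in both passes.
    for pid in sorted(a.keys() & b.keys()):
        for item in sorted(a[pid].keys() & b[pid].keys()):
            if a[pid][item] != b[pid][item]:
                diffs.append((pid, item, a[pid][item], b[pid][item]))
    return diffs


def reverse_code(value, low=1, high=5):
    """Reverse-score a response on an assumed low..high scale (1<->5, 2<->4)."""
    return low + high - value


discrepancies = find_discrepancies(first_pass, second_pass)
for pid, item, v1, v2 in discrepancies:
    print(f"participant {pid}, {item}: first={v1}, second={v2} -> check paper copy")
```

A disagreement only says that at least one of the two entries is wrong, not which one, so each flagged cell still has to be settled against the paper questionnaire.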
My very conscientious 2p.
Rebecca
Rebecca Graber
PhD Student
Social & Health Psychology
Institute of Psychological Sciences
Faculty of Health and Medicine
University of Leeds
[log in to unmask]
0113 343 9197
www.psyc.leeds.ac.uk/friendship
________________________________________
From: Research of postgraduate psychologists. [[log in to unmask]] On Behalf Of Mícheál [[log in to unmask]]
Sent: 09 December 2010 17:41
To: [log in to unmask]
Subject: double entry?
Hello all,
I have collected questionnaire data from about 500 people. For each person, there are about 400 data points. The whole data set has been entered into Excel once, and a subset of 10% has been entered a second time by a different person. I've just compared the data sets for accuracy and 95% of the time the two data sets agree.
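As a rough sense of scale, the figures above can be turned into counts. This is a back-of-the-envelope sketch, not a result: it reads "95% of the time" as 95% of individual cells, treats "about 500" and "about 400" as exact, and adds an assumption that is not in the message, namely that both typists err independently at the same rate and almost never make the same mistake, so the error rate of a single pass is roughly half the disagreement rate.

```python
participants = 500            # "about 500 people"
items_per_person = 400        # "about 400 data points"
double_entered_share = 0.10   # subset entered a second time
agreement = 0.95              # share of double-entered cells that match

total_cells = participants * items_per_person
double_entered_cells = round(total_cells * double_entered_share)
disagreeing_cells = round(double_entered_cells * (1 - agreement))

# Assumption (not from the message): errors are independent and equally
# likely in both passes, so one pass carries about half the disagreements.
per_pass_error_rate = (1 - agreement) / 2
expected_wrong_cells = round(total_cells * per_pass_error_rate)

print(f"total cells:                 {total_cells}")
print(f"double-entered cells:        {double_entered_cells}")
print(f"disagreeing cells in subset: {disagreeing_cells}")
print(f"expected wrong cells, whole first pass: {expected_wrong_cells}")
```

Under those assumptions, the cells already known to disagree are a small share of the full data set, while the expected number of wrong cells in the single-entered remainder is several times larger.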
So my question is, is this an acceptable level of accuracy?
I could enter the data a second time and correct discrepancies against the paper copies, but it's going to take 2-3 tedious weeks. Most of the items are part of longer scales, so the odd mistake is not going to change the final score much, but at the same time I don't want to miss anything interesting in my data!
Any advice welcome.
Micheal.