Today's AWAD (A-Word-A-Day) is Impute. As a connoisseur of irony I was delighted by the definition and etymology, which you can find here today:
http://www.wordsmith.org/words/today.html
(Tomorrow, you will need to use the link to "yesterday". After that, you can find it in the archives.)
The discussion around "Why did the CMAJ publish this paper?" seems to have given rise to some puzzlement about multiple imputation.
Multiple imputation is a smart way of filling in the blanks left by missing data. Statisticians and trialists love it, because it yields narrower confidence intervals. But, there is a big but (with one T). Give me a dataset and a modern statistical package, and I will use multiple imputation to calculate all the confidence intervals you like. However, without close attention to the assumptions, this would be like putting me in the driving seat of a Formula 1 car - I would go off the track at the first corner.
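By way of illustration only (this is not the Seely et al. analysis), the "filling in the blanks" step is usually followed by pooling results across the M imputed datasets using Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and the total variance adds a between-imputation term to the average within-imputation variance. The numbers below are made up for the sketch.

```python
import statistics

def pool_rubins_rules(estimates, variances):
    """Pool M point estimates and their within-imputation variances
    across M imputed datasets using Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.mean(estimates)       # pooled point estimate
    w = statistics.mean(variances)           # average within-imputation variance
    b = statistics.variance(estimates)       # between-imputation variance
    t = w + (1 + 1 / m) * b                  # total variance of the pooled estimate
    return q_bar, t

# Hypothetical: five imputed datasets give slightly different treatment effects.
est, total_var = pool_rubins_rules([1.2, 1.1, 1.3, 1.25, 1.15], [0.04] * 5)
```

Note where the danger hides: if the imputation model is wrong, the imputed values agree with each other too well, the between-imputation variance shrinks, and the confidence interval narrows for the wrong reason.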
After the financial crash of 2008, any commuter on the 685 bus would be wary about investing their pension in a CDO (collateralized debt obligation) where statistical wizardry has buried risk so deep that no clever economist, genius investment banker, or bright regulator could see it.
After the embarrassing failure of the first QRISK model to identify a link between cholesterol and cardiovascular risk - a missing link that clever epidemiologists, genius medical statisticians, bright peer reviewers, and sharp journal editors failed to see as a red flag for errors in analysis - and the subsequent tweaking of the multiple imputation method to provide a more convenient result, any investor in trial results that depend on statistical wizardry should be wary. Trust the wizardry only if it has been used with care and its methods and assumptions reported in detail.
http://www.bmj.com/rapid-response/2011/11/01/multiple-imputation-needs-be-used-care-and-reported-detail
There have been a number of calls in general medical journals for transparency in the reporting of multiple imputation:
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3319761&tool=pmcentrez&rendertype=abstract
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2714692&tool=pmcentrez&rendertype=abstract
http://www.ncbi.nlm.nih.gov/pubmed/20831627
http://archpedi.jamanetwork.com/article.aspx?articleid=1686979
The conclusion of these reviews is consistent:
Transparent reporting of methods is one essential.
The other is an exploration of the reasons for data going missing.
The naturopathic medicine trial reported by Seely et al in the CMAJ lacks both essentials, and this makes me wonder about the missingness (as Peter Cummings calls it).
http://archpedi.jamanetwork.com/article.aspx?articleid=1686979
How was it possible not to collect 100% of the baseline data for measurements and tests done during the participants' visits? E.g. only 221 of the 246 participants had their waist circumference measured, and only 206 had LDL cholesterol measured (see Table 1).
Blood tests were analyzed using a point-of-care device. If there was too little blood in a sample to do all the tests, couldn't the postal workers have had another finger stabbed? I am under the impression that Canadians are particularly tough and stoical, and that naturopaths are particularly persuasive.
Kev, thanks for your private peer review.
My remarks about the proportion of missing data were unclear and somewhat garbled.
I got 30% from Figure 1, which shows that there were data from (124 + 122) participants at the start and (82 + 87) at the end. Note that "with data" seems to mean "with some data", not "with all data".
The ITT analysis was carried out on (106 + 101) participants, which means that week 52 data would have had to be imputed for at least (106 + 101) - (82 + 87) = 38 participants, i.e. missingness rate > 18%.
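For anyone checking the back-of-envelope arithmetic, it can be reproduced directly from the counts quoted above (taken from Figure 1 as described in the text):

```python
# Counts as read from Figure 1 of the trial report.
at_start = 124 + 122        # participants with some data at the start
at_week52 = 82 + 87         # participants with some data at week 52
itt = 106 + 101             # participants in the ITT analysis

overall_loss = 1 - at_week52 / at_start   # fraction lost by week 52 (~0.31)
imputed_min = itt - at_week52             # at least this many imputed at week 52
imputed_rate = imputed_min / itt          # minimum missingness in the ITT (~0.18)
```

So roughly 30% of participants lacked week 52 data, and at least 38 of the 207 in the ITT analysis (just over 18%) must have had their week 52 outcomes imputed.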
Michael