Dear Claudia

I agree with you that there is an issue about over-investigation, over-treatment and ‘over-diagnosis’.  Traditionally, doctors have been more worried about the opposite: under-diagnosis, under-investigation or under-treatment due to ignorance or laziness.  Currently there seems to be more concern about over-enthusiasm, often due to ignorance or motivated by greed to sell tests or treatments that the patient does not need.

I think that the problem is due to a failure to use symptoms, physical signs, test results and patient preferences in a logical and systematic way, based on appropriate research, to optimise the selection of treatments that suit patients best.  In order to understand where things go wrong, we should consider the various steps:

  1.   (a) To identify the symptoms, and the severities, speeds of onset and durations, that need to be investigated and those that do not

 (b) To identify which patients with a screening test result need to be investigated as in (1a) (in this sense screening tests are like symptoms)

  2.  To identify the possible differential diagnoses for symptoms, signs and test results (which I will call ‘findings’) and the frequency with which diagnoses occur in such findings in different settings, including benign, self-limiting conditions that should not be ‘over-treated’

  3.  To identify findings that occur commonly in association with diagnoses in such lists and rarely or never in others in the list (i.e. ratios of ‘sensitivities’, not ‘likelihood ratios’)

  4.  To use the information in (2) and (3) in a logical way, by a process of careful probabilistic elimination, to estimate accurately the probabilities of a diagnosis or diagnoses for that patient, as in the sketch shown after this list (I do not think that applying Bayes rule to pre-test probabilities, sensitivities, specificities and likelihood ratios can do this properly)

  5.  To identify which findings should be used as ‘sufficient’ and ‘necessary’ diagnostic criteria to decide who does have and who does not have a ‘diagnosis’ (e.g. diabetes mellitus).  This will depend on which findings best explain what has happened to a patient so far and which best predict what will happen to a range of outcomes with and without interventions (e.g. the various complications of diabetes mellitus)

  6.  To identify findings that refine, sub-divide or ‘stratify’ diagnostic criteria, including those that assess the severity of conditions, to improve the accuracy of individual predictions about outcome with or without intervention (e.g. who will develop diabetic nephropathy with and without ACE inhibitors or ARBs, whose outcomes are better on insulin, etc.)

  7.  To share with patients the different possible outcomes at each step of test and treatment decisions and to explore how these outcomes would affect the individual patient's overall well-being (this can be modelled by Decision Analysis if desired, but this can be very time-consuming and taxing for the patient)

  8.  To identify markers of progress that predict or reflect patients’ well-being, in order to facilitate repeated review of diagnoses or treatments by a process of feedback and so optimise patients’ benefit, thus ‘personalising’ their care (e.g. HbA1c monitoring of diabetes)
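
As a rough illustration of step 4, here is a minimal sketch (in Python) of how the information from steps 2 and 3 might be combined across a whole list of differential diagnoses using the extended form of Bayes rule.  The diagnoses, prior frequencies and sensitivities below are invented purely for illustration and are not real clinical data:

# Sketch of step 4: probabilistic elimination across a differential diagnosis
# list using the extended form of Bayes rule.  All figures are invented for
# illustration only.

# Step 2: how often each diagnosis occurs among patients presenting with the
# lead finding (e.g. acute abdominal pain), including benign self-limiting causes.
priors = {
    "appendicitis": 0.25,
    "non-specific abdominal pain": 0.45,
    "cholecystitis": 0.15,
    "salpingitis": 0.10,
    "mesenteric adenitis": 0.05,
}

# Step 3: frequency (sensitivity) of a further finding, e.g. guarding,
# within each diagnosis.
sensitivities = {
    "appendicitis": 0.70,
    "non-specific abdominal pain": 0.15,
    "cholecystitis": 0.30,
    "salpingitis": 0.25,
    "mesenteric adenitis": 0.20,
}

# Extended form of Bayes rule: P(diagnosis | finding) is proportional to
# P(diagnosis) x P(finding | diagnosis), normalised over the whole list
# rather than over 'disease' versus 'no disease'.
weights = {d: priors[d] * sensitivities[d] for d in priors}
total = sum(weights.values())
posteriors = {d: w / total for d, w in weights.items()}

for diagnosis, probability in sorted(posteriors.items(), key=lambda item: -item[1]):
    print(f"{diagnosis}: {probability:.2f}")

A diagnosis is progressively ‘eliminated’ as further findings arrive that occur rarely in it but commonly in its rivals; it is the ratios of the sensitivities across the list, not likelihood ratios against ‘no disease’, that move the probabilities.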

I explain much of this in the Oxford Handbook of Clinical Diagnosis, the latest (3rd) edition of which was published in September 2014.

In terms of your example, cardiac catheterisation may be performed unnecessarily because steps 2, 3 and 4 above have not been done appropriately.  This could happen because the benign causes of the ST segment changes (e.g. incorrect positioning of the ECG chest leads) were not considered appropriately, and this ‘diagnosis’ was not shown to be very probable or confirmed using appropriate diagnostic criteria (e.g. by moving the leads to the correct position and observing the ST segment configuration return to normal).  If this were done properly, then it would be clear that cardiac catheterisation or some other investigation was not indicated.  However, I would not expect a competent and experienced cardiologist to make such a mistake.  Nevertheless, the data required for steps 1 to 7 are not available for most, if not all, situations, and we have to make decisions based on guessing what the data might be if we did the appropriate research.

The main problems of over-diagnosis and over-treatment seem to occur because of inappropriate interpretation of screening test results in asymptomatic patients.  For example, this might mean that many of the ‘cancers’ diagnosed are not cancers at all, or that they are very early and will regress without treatment, or that they are benign versions that would never cause the patient harm (e.g. prostate cancer in older men).  This is due to a failure to consider an appropriate differential diagnosis, following steps 1 to 6 above.

With best wishes

Huw Llewelyn MD FRCP, Consultant Physician in endocrinology, internal and acute medicine, Honorary Fellow in Mathematics, Aberystwyth University, UK

________________________________

From: Claudia Bugeja <[log in to unmask]>
Sent: 20 February 2016 13:21
To: Huw Llewelyn [hul2]
Subject: Re: Pre-test probability

Thanks for your quick reply.

As we are all aware, technology can sometimes be misleading for the HCP as well as beneficial.

Currently, I am tackling a study unit about the main issues that health care professionals face with regard to ICT and medical devices. I once read that numerous false alarms occur that must be evaluated by healthcare professionals so that over-treatment of patients does not occur. Examples of over-treatment are reported in the literature and include unnecessary cardiac catheterisation because of false ST-segment monitor alarms, as well as unnecessary electrophysiology testing and device implantation because of muscle artefact simulating ventricular tachycardia.

Do you agree with this point, and have you ever experienced such difficulty in your clinical setting?

Moreover, another student commented on my post, saying that he disagrees with me, and gave his own point of view:
“I don’t agree with you as regards unnecessary cardiac catheterisation procedures due to erroneous data produced when patients attach ECG monitoring devices wrongly. According to the American Association of Critical-Care Nurses (2009), the cardiologist will not intervene invasively just on the basis of an ECG monitoring device report. ECG monitoring devices are used to capture any cardiac arrhythmia event, so cardiologists aren’t concerned if they find ST segment changes, since waveforms can easily differ when electrodes are relocated away from their original location. Ironically, HCPs encourage patients to move the electrodes slightly when these are changed daily to prevent skin irritation.
ECG information obtained from electrodes located close to the heart (precordial leads) is especially prone to waveform changes when the electrodes are relocated as little as 1 cm away from the original locations.”

I know that he is right to an extent, but I would like to justify my argument. Any help on what to include?


On 20 February 2016 at 14:16, Huw Llewelyn <[log in to unmask]> wrote:
Yes of course.

Please ask your question(s)

Huw Llewelyn
________________________________
From: Claudia Bugeja <[log in to unmask]>
Date: Sat, 20 Feb 2016 11:00:26 +0100
To: Huw Llewelyn [hul2] <[log in to unmask]>
Subject: Re: Pre-test probability

Hi,

I am a medical student, and would like to ask you something about ECG monitoring and unnecessary interventions.

Do you think you can help me out?

Thanks
Claudia

On 19 February 2016 at 23:46, Huw Llewelyn [hul2] <[log in to unmask]> wrote:

Thank you for raising these interesting points about the problems associated with estimating post-test probabilities from pre-test probabilities.  Instead of reasoning using simple Bayes rule with likelihood ratios based on those ‘with a diagnosis’ and ‘without a diagnosis’, physicians like me reason with lists of differential diagnoses based on the extended form of Bayes rule.  For example, instead of ‘appendicitis’ or ‘no appendicitis’, we consider appendicitis or cholecystitis, salpingitis, mesenteric adenitis, ‘non-specific abdominal pain’ etc. and use ratios of their ‘sensitivities’ (e.g. “guarding is common in appendicitis but less common in NSAP”) to estimate probabilities.  I explain this in the Oxford Handbook of Clinical Diagnosis.
In contrast, Bayes rule with pre-test probabilities multiplied several times by likelihood ratios often gives wildly over-confident probabilities (e.g. 0.999 when only 75% are correct).  Perhaps the real answer is that it is Bayes rule with the independence assumption that is no good at estimating disease or event probabilities (not physicians)!  The mistake may be assuming that the calculated probabilities are 'correct' and that any probabilities that differ from these are 'incorrect'.  I would be grateful therefore if you could point out in your references a comparison of the calibration curves of probabilities generated from pre-test probabilities and multiple products of likelihood ratios with the calibration curves of physicians’ estimates of probabilities.
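To make the comparison concrete, the kind of calibration check I have in mind could be sketched roughly as follows (in Python); the predicted probabilities and outcomes are placeholders rather than data from any study:

# Sketch of a calibration check: group predicted probabilities into bins and
# compare the mean prediction in each bin with the observed frequency of the
# confirmed diagnosis.  The two lists below are placeholders, not real data.

predicted = [0.99, 0.95, 0.90, 0.99, 0.85, 0.60, 0.40, 0.97, 0.70, 0.92]
confirmed = [1,    1,    0,    1,    1,    1,    0,    0,    0,    1]  # 1 = diagnosis verified

bins = [(0.0, 0.5), (0.5, 0.8), (0.8, 1.01)]
for lower, upper in bins:
    in_bin = [i for i, p in enumerate(predicted) if lower <= p < upper]
    if not in_bin:
        continue
    mean_predicted = sum(predicted[i] for i in in_bin) / len(in_bin)
    observed = sum(confirmed[i] for i in in_bin) / len(in_bin)
    print(f"{lower:.1f}-{upper:.1f}: mean predicted {mean_predicted:.2f}, "
          f"observed frequency {observed:.2f}, n = {len(in_bin)}")

If the mean predicted probability in the top bin is around 0.95 but only about 70-75% of those diagnoses are confirmed, the method that produced the predictions is over-confident, whichever method it is.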
Huw Llewelyn MD FRCP
Consultant Physician in endocrinology, acute and internal medicine
Honorary Fellow in Mathematics, Aberystwyth University



________________________________
From: Huw Llewelyn [hul2]
Sent: 19 February 2016 21:35
To: [log in to unmask]; Poses, Roy
Subject: Re: Pre-test probability


Thank you for raising these interesting points about the problems associated with estimating post-test probabilities from pre-test probabilities.  Instead of reasoning using simple Bayes rule with likelihood ratios based on those ‘with a diagnosis’ and ‘without a diagnosis’, physicians reason with lists of differential diagnoses based on the extended form of Bayes rule.  For example, instead of ‘appendicitis’ or ‘no appendicitis’, they consider appendicitis or cholecystitis, salpingitis, mesenteric adenitis, ‘non-specific abdominal pain’ etc. and use ratios of their ‘sensitivities’ (e.g. “guarding is common in appendicitis but less common in NSAP”) to estimate probabilities.  I explain this in the Oxford Handbook of Clinical Diagnosis.  In contrast, Bayes rule with pre-test probabilities multiplied several times by likelihood ratios often gives wildly over-confident probabilities (e.g. 0.999 when only 75% are correct).  Perhaps the real answer is that it is Bayes rule with the independence assumption that is no good at estimating disease or event probabilities (not physicians)!  The mistake may be assuming that the calculated probabilities are 'correct' and that any probabilities that differ from these are 'incorrect'.  I would be grateful therefore if you could point out in your references a comparison of the calibration curves of probabilities generated from pre-test probabilities and multiple products of likelihood ratios with the calibration curves of physicians’ estimates of probabilities.

Huw Llewelyn MD FRCP

Consultant Physician in endocrinology, acute and internal medicine

Honorary Fellow in Mathematics, Aberystwyth University

________________________________

From: Evidence based health (EBH) <[log in to unmask]> on behalf of Poses, Roy <[log in to unmask]>
Sent: 19 February 2016 18:19
To: [log in to unmask]
Subject: Re: Pre-test probability

This is a fairly good bibliography, but it's from 2009...

Cognitive Barriers to Evidence-Based Practice

Judgment and Decision Making
Bushyhead JB, Christensen-Szalanski JJ. Feedback and the illusion of validity in a medical clinic. Med Decis Making 1981; 1: 115-123.
Coughlan R, Connolly T. Predicting affective responses to unexpected outcomes. Org Behav Human Decis Proc 2001; 85: 211-225.
Dawes RM, Faust D, Meehl PE.  Clinical versus actuarial judgment.  Science 1989;243:1668-74.
Dawson NV, Arkes HR.  Systematic errors in medical decision making: judgment limitations. J Gen Intern Med 1987;2:183-7.
Hammond KR, Hamm RM, Grassia J, Pearson T. Direct comparison of the efficacy of intuitive and analytical cognition in expert judgment. IEEE Trans Systems Man Cybernetics 1987; SMC-17: 753-770.
Kern L, Doherty ME. "Pseudo-diagnosticity" in an idealized medical problem-solving environment. J Med Educ 1982; 57: 100-104.
Lyman GH, Balducci L. Overestimation of test effects in clinical judgment. J Cancer Educ 1993; 8: 297-307.
MacKillop WJ, Quirt CF. Measuring the accuracy of prognostic judgments in oncology. J Clin Epidemiol 1997; 50: 21-29
Mitchell TR, Beach LR. "Do I love thee? let me count ..." toward an understanding of intuitive and automatic decision making. Org Behav Human Decis Proc 1990; 47: 1-20.
Payne JW, Johnson EJ, Bettman JR, Coupey E. Understanding contingent choice: a computer simulation approach. IEEE Trans Systems Man Cybernetics 1990; 20: 296-309.
Poses RM, Cebul RD, Collins M, Fager SS.  The accuracy of experienced physicians' probability estimates for patients with sore throats.  JAMA 1985; 254:925-929.
Poses RM, Anthony M. Availability, wishful thinking, and physicians' diagnostic judgments for patients with suspected bacteremia. Med Decis Making 1991;11:159-168.
Poses RM, Bekes C, Copare F, Scott WE.  The answer to "what are my chances, doctor?"  depends on whom is asked: prognostic disagreement and inaccuracy for critically ill patients.  Crit Care Med 1989; 17: 827-833.
Poses RM, McClish DK, Bekes C, Scott WE, Morley JN. Ego bias, reverse ego bias, and physicians' prognostic judgments for critically ill patients. Crit Care Med 1991; 19: 1533-1539.
Reyna VF, Brainerd CJ. Fuzzy-trace theory and framing effects in choice: gist extraction, truncation, and conversion. J Behav Decis Making 1991; 4: 249-262.
Schulman KA, Escarce JE, Eisenberg JM, Hershey JC, Young MJ, McCarthy DM, Williams SV.  Assessing physicians' estimates of the probability of coronary artery disease: the influence of patient characteristics.  Med Decis Making 1992;12:109-14.
Tetlock PE, Kristel OV, Elson SB, Green MC, Lerner JS. The psychology of the unthinkable: taboo trade-offs, forbidden base rates, and heretical counterfactuals. J Pers Social Psych 2000; 78: 853-870
Wallsten TS. Physician and medical student bias in evaluating diagnostic information. Med Decis Making 1981; 1: 145-164.

Stress
Ben Zur H, Breznitz SJ. The effect of time pressure on choice behavior. Acta Psychol 1981; 47: 89-104.
Harrison Y, Horne JA. One night of sleep loss impairs innovative thinking and flexible decision making. Org Behav Human Decis Proc 1999; 78: 128-145.
Keinan G. Decision making under stress: scanning of alternatives under controllable and uncontrollable threats. J Pers Social Psych 1987; 52: 639-644.
Koehler JJ, Gershoff AD. Betrayal aversion: when agents of protection become agents of harm. Org Behav Human Decis Proc 2003; 90: 244-261.
Zakay D, Wooler S. Time pressure, training and decision effectiveness. Ergonomics 1984; 27: 273-284.

Improving Judgments and Decisions
Arkes HR. Impediments to accurate clinical judgment and possible ways to minimize their impact. In Arkes HR, Hammond KR, editors. Judgment and Decision Making: An Interdisciplinary Reader. Cambridge: Cambridge University Press, 1986. pp. 582-592.
Arkes HR, Christensen C, Lai C, Blumer C. Two methods of reducing overconfidence. Org Behav Human Decis Proc 1987; 39: 133-144.
Clemen RT. Combining forecasts: a review and annotated bibliography. Int J Forecast 1989; 5: 559-583.
Coomarasamy A, Khan KS. What is the evidence that postgraduate teaching in evidence based medicine changes anything?: a systematic review. Brit Med J 2004; 329: 1017-1019.
Corey GA, Merenstein JH. Applying the acute ischemic heart disease predictive instrument. J Fam Pract 1987; 25: 127-133.
Davidoff F, Goodspeed R, Clive J. Changing test ordering behavior: a randomized controlled trial comparing probabilistic reasoning with cost-containment education. Med Care 1989; 27: 45-58.
de Dombal FT, Leaper DJ, Horrocks JC, Staniland JR, McCann AP. Human and computer-aided diagnosis of abdominal pain: further report with emphasis on performance of clinicians. Brit Med J 1974; 1: 376-380.
Doherty ME, Balzer WK. Cognitive feedback.  In Brehmer B, Joyce CRB, editors. Human Judgment: the SJT View. Amsterdam: Elsevier Science Publishers, 1988.  pp. 163-197.
Fryback DG, Thornbury JR. Informal use of decision theory to improve radiological patient management. Radiology 1978; 129: 385-388.
Gigerenzer G. How to improve Bayesian reasoning without instruction: frequency formats. Psychol Rev 1995; 102: 684-704.
Gigerenzer G. The psychology of good judgment: frequency formats and simple algorithms. Med Decis Making 1996; 16: 273-280.
Green ML. Evidence-based medicine training in internal medicine residency programs: a national survey. J Gen Intern Med 2000; 15: 129-133.
Hansen DE, Helgeson JG. The effects of statistical training on choice heuristics under uncertainty. J Behav Decis Making 1996; 9: 41-57.
Koriat A, Lichtenstein S, Fischhoff B.  Reasons for confidence.  J Exp Psychol Human Learn Memory 1980;6:107-118.
Kray LJ, Galinsky AD. The debiasing effect of counterfactual mind-sets: increasing the search for disconfirmatory information in group decisions. Org Behav Human Decis Proc 2003; 91: 69-81.
Lloyd FJ, Reyna VF. A web exercise in evidence-based medicine using cognitive theory. J Gen Intern Med 2001; 16: 94-99.
Nisbett R, editor. Rules for Reasoning.  Hillsdale, NJ: Lawrence Erlbaum Associates, 1993.
Poses RM, Bekes C, Winkler RL, Scott WE, Copare FJ. Are two (inexperienced) heads better than one (experienced) head? - averaging house officers' prognostic judgments for critically ill patients.  Arch Intern Med 1990; 150: 1874-1878.
Poses RM, Cebul RD, Wigton RS, Centor RM, Collins M, Fleischli G. A controlled trial of a method to improve physicians' diagnostic judgments: an application to pharyngitis. Acad Med 1992; 67: 345-347.
Schulz-Hardt S, Jochims M, Frey D. Productive conflict in group decision making: genuine and contrived dissent as strategies to counteract biased information seeking. Org Behav Human Decis Proc 2002; 88: 563-586.
Selker HP, Beshansky JR, Griffith JL, Aufderheide TP, Ballin DS, Bernard SA, et al.  Use of the acute cardiac ischemia time-insensitive predictive instrument (ACI-TIPI) to assist with triage of patients with chest pain or other symptoms suggestive of acute cardiac ischemia: a multicenter, controlled clinical trial.  Ann Intern Med 1998;129:845-855.
Siegel-Jacobs K, Yates JF. Effects of procedural and outcome accountability on judgment quality. Org Behav Human Decis Proc 1996; 65: 1-17.
Spiegel CT, Kemp BA, Newman MA, Birnbaum PS, Alter Cl. Modification of decision-making behavior of third-year medical students.  J Med Educ 1982; 57: 769-777.
Stewart TR, Heideman KF, Moninger WR, Reagan-Cirincione P. Effects of improved information on the components of skill in weather forecasting. Org Behav Human Decis Proc 1992; 53: 107-134.
Todd P, Benbasat I. Inducing compensatory information processing through decision aids that facilitate effort reduction: an experimental assessment. J Behav Decis Making 2000; 13: 91-106.
Tape TG, Kripal J, Wigton RS. Comparing methods of learning clinical prediction from case simulations. Med Decis Making 1992; 12: 213-221.

Useful Texts on Judgment and Decision Psychology
Cooksey RW. Judgment Analysis: Theory, Methods, and Applications.  San Diego: Academic Press, 1996.
Hogarth RM. Judgment and Choice: The Psychology of Decisions, 2nd edition.  New York, John Wiley and Sons: 1988.  pp. 62-86.
Kahneman D, Slovic P, Tversky A. Judgment Under Uncertainty: Heuristics and Biases. Cambridge, UK: Cambridge University Press, 1982.
Wright G, Ayton P. Subjective Probability. Chichester, UK: John Wiley and Sons, 1994.
Yates JF. Judgment and Decision Making. Englewood Cliffs, NJ: Prentice Hall, 1990.


On Fri, Feb 19, 2016 at 10:44 AM, Cristian Baicus <[log in to unmask]> wrote:
Thank you very much, Roy, for your excellent comment!

Yes, I'm interested in a few references!

Best wishes,
Cristian.

dr. Cristian Baicus
www.baicus.ro

 from my iPad

On 19 Feb 2016, at 5:36 p.m., Poses, Roy <[log in to unmask]> wrote:

The simple answer is that physicians are not good at estimating disease or event probabilities.  There is a large literature on this, going back to the 1970s.

This is the Achilles heel of the attempt to promote rational decision making based on simple mathematical models.  It is not that there is doubt about Bayes Theorem.  There should be lots of doubt about the data plugged into it, though.
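
For instance, here is a small sketch (in Python, with invented numbers) of how much the post-test probability moves when the same likelihood ratio is combined with different, but individually defensible, pre-test estimates:

# Sketch: the same likelihood ratio applied to different pre-test estimates
# gives very different post-test probabilities.  All numbers are invented
# for illustration only.

def post_test_probability(pre_test, likelihood_ratio):
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

likelihood_ratio = 10.0
estimates = [
    ("population prevalence", 0.02),
    ("one clinician's estimate for this presentation", 0.10),
    ("another clinician's estimate for this presentation", 0.30),
]
for label, pre_test in estimates:
    post = post_test_probability(pre_test, likelihood_ratio)
    print(f"pre-test {pre_test:.2f} ({label}) -> post-test {post:.2f}")

The arithmetic is trivial; nearly all of the uncertainty lies in the estimates fed into it.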

Cognitive psychologists have been studying human limitations in making judgments such as probability estimates for even longer, and most of what they have found probably applies to physicians.

It is not clear how physicians actually make such estimates in particular cases.  It could be anything from pure intuition, to pattern recognition, to multivariate processes (one point for this, two for that, etc.), to formal Bayesian calculation, to use of prediction/diagnostic rules, etc.  (But keep in mind that many such rules do not perform well when applied to new populations.)

There are quite a few studies, some of which I did a long time ago, showing that physicians' probabilistic diagnostic or prognostic judgments are not very accurate, and physicians have been shown to be subject to judgment biases, to misuse judgment heuristics, and to rely on non-diagnostic or non-predictive variables and/or fail to take into account predictive or diagnostic variables in specific cases.

If anyone is really interested, I could drag out a host of references, many not so new.

On Fri, Feb 19, 2016 at 10:02 AM, Brown Michael <[log in to unmask]> wrote:
Whether physicians are aware of it or not, they use a Bayesian approach in their daily practice when they estimate the patient's probability of having condition X based on elements of the history and physical (i.e., pretest probability) before ordering any diagnostic tests. If available for condition X, a clinical prediction rule may be used. Although this process is very far from an exact science, it is often good enough to move the clinician's suspicion above the treatment threshold or below the diagnostic threshold (alternative diagnoses considered). Although most of us would like to see things fit a more exact mathematical formula, it is rare (at least in emergency medicine) to be able to make very precise probability estimates at the individual patient level.
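
A minimal sketch of that threshold logic might look like this in Python; the pre-test probability, likelihood ratio and thresholds are made up for illustration, and in practice the thresholds depend on the harms and benefits of the particular test and treatment:

# Sketch of test/treatment threshold reasoning with a likelihood ratio.
# The pre-test probability, likelihood ratio and thresholds are illustrative only.

def post_test_probability(pre_test, likelihood_ratio):
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

TEST_THRESHOLD = 0.05       # below this, stop pursuing condition X
TREATMENT_THRESHOLD = 0.85  # above this, treat without further testing

p = post_test_probability(pre_test=0.30, likelihood_ratio=8.0)

if p >= TREATMENT_THRESHOLD:
    decision = "treat for condition X"
elif p <= TEST_THRESHOLD:
    decision = "condition X effectively ruled out; reconsider the differential"
else:
    decision = "order a further test for condition X"

print(f"post-test probability {p:.2f} -> {decision}")

With these made-up numbers the post-test probability lands between the two thresholds, so the sketch recommends further testing.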

Mike

Michael Brown, MD, MSc
Professor and Chair, Emergency Medicine
Michigan State University College of Human Medicine

[log in to unmask]
cell: 616-490-0920




On Feb 19, 2016, at 5:47 AM, Kevin Galbraith <[log in to unmask]> wrote:

> Hi there
>
> Can anyone advise: when calculating post-test probability of a diagnosis using the likelihood ratio for a diagnostic test, how do we make our best estimate of pre-test probability?
>
> I understand that prevalence is often taken as a pragmatic estimate of pre-test probability. But I assume a patient who presents with symptoms of the condition has, by definition, a pre-test probability that is greater than the prevalence in the wider (or preferably age/sex specific) population.
>
> To estimate pre-test probability, are we reliant on finding an estimate from an epidemiological study whose subjects most closely reflect the characteristics of our individual patient? This would seem a serious limitation to the utility of the Bayesian approach.
>
> Thanks
>
> Kevin Galbraith






--
Roy M. Poses MD FACP
President
Foundation for Integrity and Responsibility in Medicine (FIRM)
[log in to unmask]
Clinical Associate Professor of Medicine
Alpert Medical School, Brown University
[log in to unmask]

"He knew right then he was too far from home." - Bob Seger