Here is an explanation of CIs in terms of the cat's eye picture, given by Geoff in the comment section of the article:

In my book < www.thenewstatistics.com > I discuss 6 ways to think about CIs. My favourite is based on 'the cat's eye picture' of a CI. This is the likelihood function placed on the interval, the fattest part of the bulge being at the centre of the CI, and the likelihood decreasing smoothly towards (and beyond) the two limits of the CI. It's in Chap 4 in the book, and Figure 2 in this recently released article: http://tiny.cc/tnswhyhow

The cat's eye pic is, imho, the beautiful inner shape of a CI! It tells us that our best bets for the true value of the parameter lie around the centre of the CI, and the bets steadily get worse the further we are from the centre of the CI.

I'm not saying we should use that pic for every CI we publish in reports of our research, but it can be helpful to bear it in mind when thinking about any CI.

Yours in estimation,
Geoff
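
A rough way to see the shape Geoff describes, for anyone without the book's software: a minimal sketch assuming a normal likelihood, with an invented point estimate and standard error.

```python
# Minimal sketch of the cat's eye: the normal likelihood drawn over a 95% CI.
# The point estimate and standard error below are invented for illustration.
import numpy as np
from scipy.stats import norm

mean, se = 10.0, 2.0                              # illustrative estimate and its SE
lo, hi = mean - 1.96 * se, mean + 1.96 * se       # 95% CI limits

xs = np.linspace(lo, hi, 9)                       # a few points across the interval
heights = norm.pdf(xs, mean, se) / norm.pdf(mean, mean, se)  # height relative to the peak

for x, h in zip(xs, heights):
    print(f"{x:6.2f}  {'#' * int(40 * h)}")
# The bulge is fattest at the point estimate and falls to about 15% of its peak
# height at either limit of the 95% CI.
```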



On Thu, Nov 14, 2013 at 11:14 AM, Huw Llewelyn [hul2] <[log in to unmask]> wrote:
The probability of replicating a study result can be arrived at by showing that the probability of non-replication due to each of the various causes of non-replication (eg chance due to the number of observations made) is low. See the 4th paragraph of my Oxford University Press blog:

http://blog.oup.com/2013/09/medical-diagnosis-reasoning-probable-elimination/

From: Richard Hockey <[log in to unmask]>
Sender: "Evidence based health (EBH)" <[log in to unmask]>
Date: Thu, 14 Nov 2013 00:37:18 +0000
To: <[log in to unmask]>
ReplyTo: Richard Hockey <[log in to unmask]>
Subject: Re: Interesting reading from Nature

More on this here:

http://theconversation.com/give-p-a-chance-significance-testing-is-misunderstood-20207

R

 

From: Evidence based health (EBH) [mailto:[log in to unmask]] On Behalf Of Anoop B
Sent: Thursday, 14 November 2013 9:31 AM
To: [log in to unmask]
Subject: Re: Interesting reading from Nature

 

Hi Teresa,

 

Here is an article by Geoff Cumming: https://theconversation.com/the-problem-with-p-values-how-significant-are-they-really-20029. It has a nice video about the dance of p-values. I have asked him to give his feedback here.

 

The cat's eye view is one way Geoff Cumming interprets CIs in his book: http://www.amazon.com/Understanding-The-New-Statistics-Meta-Analysis/dp/041587968X. It is the best book I have come across, and it comes with free Excel software that shows you everything he talks about in the book. The cat's eye view shows how plausibility varies across the CI.

 

On Wed, Nov 13, 2013 at 5:31 PM, Benson, Teresa <[log in to unmask]> wrote:

So what we really should use to display confidence intervals is not just a straight line, but a bell-shaped curve showing the likelihood that any given value is the Truth. Any of you Knowledge Transfer experts know how to demonstrate this concept to a group of statistics-phobic nurses? :)

 

Teresa Benson, MA, LP

Clinical Lead, Evidence-Based Medicine

McKesson Health Solutions
18211 Yorkshire Ave

Prior Lake, MN  55372

[log in to unmask]
Phone: 1-952-226-4033

 


 

 

From: Evidence based health (EBH) [mailto:[log in to unmask]] On Behalf Of Anoop B
Sent: Wednesday, November 13, 2013 5:16 PM


To: [log in to unmask]
Subject: Re: Interesting reading from Nature

 

Excellent article James!

 

I was wondering why there is a mention of the p-value when we talk about confidence intervals. By mentioning statistical significance if the CI crosses 1, we are going back to the same problem we had with p-values. I have noticed this even in the evidence-based workbook by Paul Glasziou.

 

One thing I have read is that the replication of p-values is very poor. That means if we try to replicate a study, the p-value could be any number! But when we use CIs, the replicability is about 83%. Also, what people don't mention much is that the likelihood of the true mean is much higher at the point estimate of the CI (about 7 times) than at the ends. So even if the lower end of the CI is close to no difference, the likelihood that the mean will fall there is about 7 times lower than at the point estimate.
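
Both of those figures can be checked with a quick calculation under the usual normal assumptions; this is only a sketch of where the numbers come from, not Cumming's own derivation.

```python
# Quick check of the two figures above, assuming normally distributed estimates.
from math import sqrt
from scipy.stats import norm

# 1) Chance that a replication's point estimate lands inside the original 95% CI.
#    The difference of two independent estimates has SD sqrt(2) times the SE,
#    so P(|difference| < 1.96*SE) = P(|Z| < 1.96/sqrt(2)).
capture = 2 * norm.cdf(1.96 / sqrt(2)) - 1
print(f"capture percentage ~ {capture:.1%}")              # ~83.4%

# 2) Likelihood at the point estimate relative to likelihood at a 95% CI limit.
ratio = norm.pdf(0) / norm.pdf(1.96)                      # = exp(1.96**2 / 2)
print(f"centre-to-limit likelihood ratio ~ {ratio:.1f}")  # ~6.8, i.e. about 7
```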

 

On Wed, Nov 13, 2013 at 4:51 PM, McCormack, James <[log in to unmask]> wrote:

Hi Brian - thanks for the comments on our paper. If it was me and I was the ruler of the world I would get people to state the following (very rough and just off the top of my head). 

 

In my opinion, any time one puts in subjective adjectives like unimportant/important/borderline/imprecise, the message gets out of control. I would also avoid the term statistical significance, because it implies that 0.05 is the threshold we should use.

 

MY ATTEMPT

 

A) Based on INSERT ONE OF THE FOLLOWING - 

1) the best available evidence (if it was a "well done" systematic review)

2) a "well done" meta-analysis of “x” studies

3) a "well done" RCT

 

B) Done in - Describe the population in one sentence

 

C) Taking/using INSERT ONE OF THE FOLLOWING

1) this medication

2) treatment

3) etc

 

Likely impacts 

 

D) Death, CVD or whatever endpoint - Let’s assume the CI is 0.80-1.05

 

somewhere between a 20% decrease and a 5% increase

 

E) Time period

Add in the time frame - 6 months, 5 years etc

 

F) Add side effects and cost if available

 

In other words, basically stick to what was found. This could be done in 2-3 sentences for most things, I think.

 

Hope this helps.

 

 

On Nov 13, 2013, at 1:12 PM, Brian Alper MD <[log in to unmask]> wrote:

 

James

 

What a great paper. 3 examples of the p = 0.049 vs. p = 0.051 problem, with seriously different conclusions about life-and-death outcomes for 3 different common drugs/conditions. I have been teaching and showing this concept for a long time (and sometimes it is less impactful with made-up examples), but these are 3 serious real-world examples that show it so clearly.

 

A definitive yes/no is not a correct conclusion for any of these examples, and I agree with the conclusions in your paper that the better representation of the data is something like “drug may reduce mortality”, which could be used whichever side of the magic p = 0.05 the data fall on. When representing the confidence in this conclusion (quality of evidence), it could be downgraded using the GRADE approach (based on imprecision) or using other approaches acknowledging limited statistical confidence in our conclusion.

 

When analyzing thousands of studies and writing structured or standardized reports for quick interpretation and clinical application with a large team of people, you have to balance critical thinking and individualization for each study against standard approaches and expectations for process and phrasing.

 

I wonder which of these is the optimal balance of accuracy and readability for representing this concept in words (perhaps you have other ideas):

 

- Drug may reduce mortality
  - Based on study with borderline statistical significance

- Drug may reduce mortality
  - Based on study with imprecise results

- Drug may reduce mortality
  - Based on study with confidence intervals including both clinically important and clinically unimportant differences

- Drug may reduce mortality
  - Based on study with confidence intervals including clinically unimportant differences

- Drug may reduce mortality
  - Based on study with confidence intervals including results of questionable clinical importance

 

 

Brian

 

Brian S. Alper, MD, MSPH, FAAFP

Founder of DynaMed
Vice President of EBM Research and Development, Quality & Standards

 

 

From: McCormack, James [mailto:[log in to unmask]]
Sent: Tuesday, November 12, 2013 11:21 AM
To: Brian Alper MD
Cc: <[log in to unmask]> (EBH)
Subject: Re: Interesting reading from Nature

 

Hi Brian - is it possible that there is no answer because the question of a study being able to give a yes or no answer is flawed in the first place? As an example of how authors get their “knickers in a knot” about this, we just published an article entitled "How confidence intervals become confusion intervals”. I think this shows how some authors try to give yes/no answers and sometimes go to extremes of number manipulation because of the “sacred p < 0.05” just to get that yes/no answer. Studies (certainly single studies) can’t answer things yes and no, in my opinion. Hope you find the article interesting.

 

 

On Nov 12, 2013, at 2:48 AM, Brian Alper MD <[log in to unmask]> wrote:

 

I don’t have a good answer (and have not found one when asking this question in the past), but I’ll try to ask my question here again in relatively simple terms, because the implications are so important for drawing conclusions that affect clinical decisions from the science we have in the form of published reports. The question is how we determine the likelihood that a study result is true, and what threshold we use if presenting a yes/no approach to that determination.

 

For this discussion let’s assume many things:

 

1) Assume all other factors (methodologic quality, directness of factors evaluated, consistency across related outcomes or related studies, absence of publication bias, and so forth) withstand critical appraisal, and that we are focusing the question on the statistical test result reported and determining whether we conclude yes or no for whether an important difference is found in the study. If yes, let's call that a positive study result. If not, let's call that a negative study result.

2) Assume further that our threshold for a positive/negative study result (often a difference > 0 in conventional reports) and our threshold for a minimally important difference are the same. If not, there are other issues that need to be considered for interpretation of study results.

3) Assume further that “study” in this context can be the outcome from a single study or the outcome from a meta-analysis. I’m trying to focus the question on the interpretation of statistical significance, not which data to report.

 

With all those assumptions, there is an inappropriate approach used across the medical community equating p < 0.05 with a positive study result and p > 0.05 with a negative study result. This is applied so frequently, without questioning the threshold on a case-by-case basis, that it has become a norm leading to different conclusions when p = 0.049 and p = 0.051.

 

If we consider the study as a test to determine the truth, we can understand this in the context of interpreting diagnostic tests.

 

- Specificity = likelihood of a negative study result if the truth is no difference. This is 1 minus the p-value threshold: a threshold of p < 0.05 means we set specificity at 95%.

- Sensitivity = likelihood of a positive study result if the truth is a true difference. This is often represented by the power of the study to detect a difference. However, the power calculation reported in a study as an a priori consideration is usually for a different threshold than the threshold being used for the p value.

- Positive predictive value - If we could determine this, it would be the likelihood that the study result is true. But in evaluating diagnostic tests you cannot determine the positive predictive value and negative predictive value without including the prevalence of the condition. The p value approach does not provide the “likelihood of truth before the study” - this is something the Bayesian approach tries to account for. (A rough numeric sketch of this analogy follows the list.)

- Positive likelihood ratio - You can report a likelihood ratio without knowing the prevalence, but only if you know both the sensitivity and specificity. In the traditional approach, the power reported does not use the same threshold as the p value, so you cannot simply use these two variables to predict likelihood ratios. The statistics reported in the paper below are beyond me, but perhaps this paper can be interpreted as saying that Bayes factors represent likelihood ratios for the study.
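
A rough numeric sketch of that analogy (not from the paper): treating a "significant" result as a positive test, with an alpha of 0.05, 80% power, and an assumed pre-study probability of 10% that the effect is real.

```python
# Diagnostic-test analogy for a "positive" (p < 0.05) study result.
# alpha, power, and the pre-study probability below are illustrative assumptions.
alpha = 0.05   # false-positive rate = 1 - specificity
power = 0.80   # sensitivity: chance of a positive result when the effect is real
prior = 0.10   # assumed pre-study probability that a real effect exists

pos_lr = power / alpha                                         # positive likelihood ratio
ppv = (power * prior) / (power * prior + alpha * (1 - prior))  # positive predictive value

print(f"positive likelihood ratio ~ {pos_lr:.0f}")                # 16
print(f"post-study probability the effect is real ~ {ppv:.0%}")   # ~64%
```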

 

 

So what approach do we use on a large scale for interpretation of clinical research reports?

1. Use the conventional p < 0.05 because we do not have a practical alternative.

2. Use the 95% confidence interval - this is commonly suggested as an alternative to the p value. This approach is preferred because (1) it is more understandable, since it uses measures of effect size that we can relate to more readily than the abstract p value, and (2) it allows us to see when assumption #2 above is violated (helping us recognize when the threshold for p value reporting and the threshold for a minimally important difference are not the same). But for a simple yes/no interpretation with the assumptions above, the edge of the 95% confidence interval is essentially the p value, and we are still faced with the arbitrariness of p < 0.05 (see the small check after this list).

3. Use a smaller p value for our yes/no cutoff. We would need to agree on a different cutoff and be consistent across many people and efforts to eventually change our current approach. This is moving the threshold without changing the underlying approach to defining the threshold.

4. Use Bayes factors derived from the data. I suspect this cannot be done from study reports and would require statistical efforts with the original data.

5. Use Bayes factors and estimates of pre-study probabilities. That's a lot to expect.

6. Do not report results with a yes/no framework. But there is such an expectation for it.
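
As a small check of the point in item 2: under the usual normal approximation, a 95% CI whose limit just touches the null value corresponds to a two-sided p of about 0.05. The estimate and standard error below are arbitrary.

```python
# Check that the edge of a 95% CI and p = 0.05 are two views of the same cutoff,
# assuming a normal approximation. The estimate and SE are arbitrary examples.
from scipy.stats import norm

estimate, se, null = 0.98, 0.50, 0.0    # e.g. a log hazard ratio and its SE

ci_low, ci_high = estimate - 1.96 * se, estimate + 1.96 * se
z = (estimate - null) / se
p_two_sided = 2 * norm.sf(abs(z))

print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")    # lower limit sits right at the null
print(f"two-sided p = {p_two_sided:.3f}")          # ~0.05
```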

 

In practical terms, we are considering a critical appraisal criterion (what does it take to get level 1 evidence/high-quality evidence, assuming all other criteria are met) of “Confidence intervals do not include both the presence and absence of clinically meaningful differences.” This is like approach #2 above. It does not seem conservative enough (risk of non-reproducibility as reported in the paper below), but setting a different threshold (approach #3) or using different statistical approaches would require more input and buy-in from many across the critical appraisal community AND practical methods for application and explanation to the larger clinical community.

 

Your thoughts are appreciated.

 

Brian S. Alper, MD, MSPH, FAAFP

Founder of DynaMed
Vice President of EBM Research and Development, Quality & Standards

 

From: Evidence based health (EBH) [mailto:[log in to unmask]] On Behalf Of Andrzej Glowinski
Sent: Tuesday, November 12, 2013 3:31 AM
To: [log in to unmask]

Subject: Re: Interesting reading from Nature

 

Unfortunately, the link given in the article to the original paper is wrong, so here is one that works:

 

 

I'll admit I find the details of the stats hard to follow these days so I'll leave it to the professional statisticians.

 

However, if I remember correctly (from a decision-making perspective), Bayesian calculations require the probability of a state given a finding (vs. the probability of a finding in a given state), e.g. the incidence of bronchial carcinoma in a population who have a cough vs. the incidence of cough in a population with bronchial carcinoma. The latter kind of data are usually more feasible to gather and/or estimate reliably, and this value has been substituted in practice in the past.

 

I may be getting this completely wrong, as the paper covers a different area to the one I used to work in, but I wonder if like is being compared to like here -- my recollection is that, in practice, you need more/different data to use Bayes well, and if these are not available then the certainty attributable to Bayes is reduced.
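
A tiny worked example of that distinction, with invented numbers: even when the probability of the finding given the state is high, the probability of the state given the finding can be very small once the base rate is taken into account.

```python
# P(cough | carcinoma) is not P(carcinoma | cough); the base rate matters.
# All three input probabilities are invented for illustration.
p_ca = 0.001                 # assumed prevalence of carcinoma in the population seen
p_cough_given_ca = 0.65      # assumed probability of cough given carcinoma
p_cough_given_no_ca = 0.20   # assumed probability of cough without carcinoma

p_cough = p_cough_given_ca * p_ca + p_cough_given_no_ca * (1 - p_ca)
p_ca_given_cough = p_cough_given_ca * p_ca / p_cough

print(f"P(cough | carcinoma) = {p_cough_given_ca:.2f}")
print(f"P(carcinoma | cough) = {p_ca_given_cough:.4f}")   # ~0.003: a very different number
```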

 

Andrzej Glowinski

 

 

On 12 Nov 2013, at 00:50, Carlos A. Cuello Garcia wrote:

 

For your daily reading, might be of interest

 

 

--

Carlos A. Cuello-García, MD, PhD(c)

HRM·Clinical Epidemiology and Biostatistics

McMaster University Health Sciences Centre Room 2C14

1280 Main Street West

Hamilton, ON  L8S 4K1 Canada

+1 (905) 525-914 x22332

Skype: dr.carlos.cuello