I think the Bayesian interpretation, namely that there is a 95% probability that the true value is within the 95% confidence range, is adequate for most purposes if one wants to be pragmatic and doesn't care too much about strict mathematical correctness.

One can come up with some crazy examples where it is strongly misleading. For example, if we observe a single child born in the village and notice that it is a boy, few people would (hopefully) buy the inference that there is 95% chance that the sex ratio is no more than 20/1 in favour of girls. But in practice it is ok for most purposes. 
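
To put a number on the village example, here is a minimal sketch, assuming the interval in question is the one-sided exact (Clopper-Pearson) bound for a binomial proportion. When all n observed births are boys, the 95% lower bound on P(boy) is the p that solves p**n = 0.05. The function name is just an illustrative choice.

```python
def lower_bound_all_successes(n: int, alpha: float = 0.05) -> float:
    """One-sided exact lower bound for p when all n trials are successes.

    The bound is the value of p that solves p**n = alpha.
    """
    return alpha ** (1.0 / n)


p_boy_min = lower_bound_all_successes(n=1)   # one birth observed, and it was a boy
max_girl_odds = (1 - p_boy_min) / p_boy_min  # largest girls:boys ratio still inside the bound

print(f"P(boy) >= {p_boy_min:.2f}, so girls:boys is at most about {max_girl_odds:.0f}:1")
```

With a single birth the bound is P(boy) >= 0.05, i.e. roughly 19 to 1 in favour of girls, which is the kind of statement nobody would take seriously as a 95% probability.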

I think that the interpretation of the p-value is more of a hot potato. You don't want to teach the students to commit the prosecutor's fallacy, i.e. thinking that a p-value of 0.03 for the hypothesis that a person is innocent means that we can be 97% sure that he is guilty! So there is a danger in presenting frequentist inference as if it were Bayesian.
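
A rough sketch of why the 97% reading fails, using Bayes' rule. The assumptions here are mine and purely illustrative: the evidence is treated as having probability 0.03 if the person is innocent and probability 1 if guilty, and the prior probabilities of guilt are made-up numbers.

```python
def posterior_guilt(prior_guilt: float,
                    p_evidence_if_innocent: float,
                    p_evidence_if_guilty: float = 1.0) -> float:
    """Posterior probability of guilt given the evidence, by Bayes' rule."""
    numerator = prior_guilt * p_evidence_if_guilty
    denominator = numerator + (1 - prior_guilt) * p_evidence_if_innocent
    return numerator / denominator


# Hypothetical priors: an even bet, 1 in 100, and 1 in 1000.
for prior in (0.5, 0.01, 0.001):
    print(f"prior {prior:>6}: posterior guilt = {posterior_guilt(prior, 0.03):.3f}")
```

Under these assumptions the posterior is close to 97% only when the prior was 50/50. With a prior of 1 in 1000 it is only about 3%, so the p-value on its own says very little about the probability of guilt.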

You could say that we have 95% "confidence" in order to avoid the word "probability", but I am not sure if it is helpful. It is basically saying the same but just in a more sluggish way in order to avoid being accused of saying something inaccurate. Better is, IMHO, to give the Bayesian interpretation, but say that it is only strictly correct under certain conditions, in particular that we believed a priori that all possible values of theta were equally likely.
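
As a small numerical check of the flat-prior remark, here is a sketch for the simplest case I can think of: a normal mean with known standard error. The estimate and standard error are hypothetical, and the posterior is approximated on a grid rather than derived analytically.

```python
import math
from statistics import NormalDist


def flat_prior_credible_interval(xbar, se, level=0.95, half_width=10.0, steps=20001):
    """Central credible interval for a normal mean under a flat prior (grid approximation)."""
    start = xbar - half_width * se
    step = 2 * half_width * se / (steps - 1)
    thetas = [start + i * step for i in range(steps)]
    # With a flat prior the posterior is proportional to the likelihood alone.
    weights = [math.exp(-0.5 * ((t - xbar) / se) ** 2) for t in thetas]
    total = sum(weights)
    tail = (1 - level) / 2
    cumulative, lower, upper = 0.0, None, None
    for theta, weight in zip(thetas, weights):
        cumulative += weight / total
        if lower is None and cumulative >= tail:
            lower = theta
        if upper is None and cumulative >= 1 - tail:
            upper = theta
    return lower, upper


xbar, se = 10.0, 2.0  # hypothetical estimate and standard error
z = NormalDist().inv_cdf(0.975)
confidence_interval = (xbar - z * se, xbar + z * se)
credible_interval = flat_prior_credible_interval(xbar, se)

print("frequentist 95% CI:      ", confidence_interval)
print("flat-prior 95% credible: ", credible_interval)
```

In this special case the two intervals agree up to the grid resolution. With an informative prior, or in less tidy models, they generally would not.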

 

On Sat, Oct 31, 2015 at 4:28 AM, Daniel Molinari <[log in to unmask]> wrote:
Fred,

I think that the way to report the results depends heavily on the students.  Are they able to understand the frequentist and Bayesian approaches to inference?  Do they understand what a random variable is?  I suspect you're dealing with students who are not majoring in maths or statistics.

If this is the case, I would suggest wording the report as: (-5.153, -0.801) is "a" 90% CI for E(X)-E(Y), where
E(X) = mean absorption capacity of store brand paper towels
E(Y) = mean absorption capacity of name brand paper towels

implying that there is a significant difference at the 90% level in the absorption capacity of the towels i.e. store brand paper towels seem to have less absorption capacity than name brand paper towels.
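
A small sketch of what can be read off the reported limits alone, assuming the interval is symmetric about the point estimate (as a standard t or z interval would be). Only the two limits come from the example; the variable names are mine.

```python
lower, upper = -5.153, -0.801  # reported 90% CI for E(X) - E(Y), in milliliters

estimate = (lower + upper) / 2   # implied point estimate of the difference
margin = (upper - lower) / 2     # implied margin of error
excludes_zero = not (lower <= 0.0 <= upper)

print(f"estimated difference: {estimate:.3f} mL, margin of error: {margin:.3f} mL")
print("interval excludes zero:", excludes_zero)
```

The implied estimate is about -2.98 mL with a margin of about 2.18 mL. Because the whole interval lies below zero, the data point towards store brand towels absorbing less.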

In your exposition of the topic you should underline that "a 90% CI" stands for "if we were to repeat the experiment many times and compute an interval each time, we should expect roughly 90% of those intervals to contain the true value of the difference in absorption capacity", but of course you wouldn't ask your students to include such a statement in every exercise.

All the best,
Daniel

On Fri, Oct 30, 2015 at 9:52 PM, Fred Weigel <[log in to unmask]> wrote:

Daniel,

 

I appreciate your response, thank you.

 

Based on the frequentist approach, how should I “report” the results?  In other words, what do I tell the students to report after they’ve done their calculations?

 

For example, assume you have the following CI values based on your comparison of the absorption capacity of store brand and name brand paper towels (a CI of the difference between population means, estimated by xbar(store brand) – xbar(name brand)):

Upper limit:  -0.801

Lower limit:  -5.153

 

What I’ve taught is that the interpretation is as follows (but I’ve never been completely comfortable with it):

Interpretation:

With 90% confidence, we estimate that the mean difference in liquid absorption capacity between store brand paper towels and name brand paper towels is between -5.153 and -0.801 milliliters.

 

What would be a better way to report the interpretation?  This goes back to John’s original request.

 

Thank you,

Fred

 

From: Daniel Molinari [mailto:[log in to unmask]]
Sent: Friday, October 30, 2015 6:06 PM
To: Fred Weigel
Cc: [log in to unmask]
Subject: Re: 95% confidence interval: best description.

 

Hi Fred,

 

If you have the whole population (which may be feasible only if it is finite), then you do not even need to sample:  you can compute the parameter directly.

 

The meaning of CIs under the frequentist approach is that if you could draw ALL the possible samples of a given size from the corresponding population (again, only possible if it is finite) and compute an interval from each, then exactly 95% of those intervals would contain the true value of the parameter.

 

But because the whole population is generally unavailable, the interpretation is that if you draw a LARGE number of samples (always with the same probability for each sample) then, IN THE LONG RUN, roughly 95% of the intervals computed from them will contain the fixed but unknown true value of the parameter.

 

In short, you're replacing the potentially infinite number of samples that you might observe with a large number of them.
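
The long-run reading can be sketched with a small simulation. Everything here is an assumption made for illustration: a normal population with a known standard deviation, a z-based interval, and arbitrary values for the true mean, sample size and number of repetitions.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean=50.0, sigma=10.0, n=25, reps=10_000, level=0.95, seed=1):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / n ** 0.5  # sigma is treated as known
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        if xbar - margin <= true_mean <= xbar + margin:
            hits += 1
    return hits / reps


rate = coverage()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The share should come out close to 0.95 but not exactly 0.95, which is the "roughly, in the long run" part of the statement. The true mean stays fixed throughout; it is the intervals that move from sample to sample.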

 

A particular example I always use is that of a political election:  before the election itself, polling companies try to predict the percentage of votes a candidate will receive on the basis of sampling.  In this case, the prediction is subject to a certain margin of error because it is based on a subset of the whole population.  Obviously, in order to have a good prediction, you need a sample that is large enough and not biased.
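
For the polling example, the usual textbook margin of error can be sketched as follows, assuming a simple random sample and the normal approximation to the binomial. The sample sizes and the 50% vote share are hypothetical.

```python
from statistics import NormalDist


def poll_margin(p_hat: float, n: int, level: float = 0.95) -> float:
    """Approximate margin of error for a sample proportion (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * (p_hat * (1 - p_hat) / n) ** 0.5


for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: margin of error = +/- {100 * poll_margin(0.5, n):.1f} percentage points")
```

A larger sample shrinks the margin of error, but it does nothing about bias: a biased sample of any size is still biased.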

 

Only when the census of the whole population is carried out, i.e. after the election, will you get the EXACT value of the parameter.

 

Hope it helps,

 

All the  best,

Daniel

 

On Fri, Oct 30, 2015 at 7:20 PM, Fred Weigel <[log in to unmask]> wrote:

When stating that “95 out of 100 times…,” wouldn’t the only way we could be completely accurate be to include all possible combinations of sample size X?

 

Although I use a shortened version for regular use—similar to “with 95% confidence, we believe the parameter is between LL and UL”—when I explain the theory to students initially (and repeatedly), I explain it the following way:

- If we collected every possible combination of sample size X and created a confidence interval for each sample, 95% of the time, the population parameter would be between the lower and upper limits.  The other 5% of the time, the confidence interval would not capture the population parameter.

 

Obviously, if we had every possible combination of sample size X, we would have the population data and wouldn’t need to do any inferential statistics.  But, for teaching purposes, I think the way I state it is accurate.
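
For a toy finite population, the "every possible sample" idea can be carried out literally. This is only a sketch: the twelve population values are made up, the sample size of 4 is arbitrary, and the interval uses the population standard deviation (unknown in practice) with the finite-population correction.

```python
from itertools import combinations
from statistics import NormalDist, fmean, pstdev

population = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10, 18, 13]  # hypothetical values
n = 4
N = len(population)
mu = fmean(population)     # the population parameter, known here only because this is a toy
sigma = pstdev(population)

# Standard error of the sample mean when sampling without replacement.
se = sigma / n ** 0.5 * ((N - n) / (N - 1)) ** 0.5
margin = NormalDist().inv_cdf(0.975) * se

samples = list(combinations(population, n))  # every possible sample of size n
hits = sum(1 for s in samples if abs(fmean(s) - mu) <= margin)
share = hits / len(samples)

print(f"{len(samples)} possible samples, {share:.1%} of the intervals contain the population mean")
```

In a setup like this the share need not land exactly on 95%, since the normal-based interval is only approximate for a small, non-normal population. It does show that the percentage describes the collection of intervals, not any single one.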

 

What do you think?

 

Fred

 

From: A UK-based worldwide e-mail broadcast system mailing list [mailto:[log in to unmask]] On Behalf Of Nicola Novielli
Sent: Friday, October 30, 2015 12:39 PM
To: [log in to unmask]
Subject: Re: [allstat] 95% confidence interval: best description.

 

I mainly agree with Martin. However, I think both definitions are incomplete and miss some points.

 

I ask whether the definition should describe or interpret a confidence interval: the first case looks like an extremely simplified attempt to describe a CI (regardless of whether one uses a frequentist, classical or whatever other theoretical framework); the second looks like an attempt to give an interpretation to a CI.

 

In both cases some points are missing:

- the parameter is a sample mean (not a population mean)?

- don't forget the role of power (calculation of a CI is functional to hypothesis testing, that is, a diagnosis with 2 types of errors)?

- what about the distribution of the population? And the sample size?

... 

 

This is to say that attempts to simplify complex things may produce misleading (or incomplete) definitions. On the other hand, even complex stuff must be shared with professionals from other disciplines (who may be more interested in interpretation than definition).

 

I wonder whether it is more interesting to know that "Pinocchio is a wood-carved puppet", or to listen to some of his adventures.

 

Hope I didn't go off topic! 

Have a nice weekend.

Nicola Novielli

 




On 30 Oct 2015, at 16:39, Martin Bland <[log in to unmask]> wrote:

I don't like 1) because it simply repeats the word confidence without explaining it.  I don't like 2) because if the experiment were carried out 100 times we might get 95 intervals which include the population value, we might get 94, 96, etc.  
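
The objection to 2) can be quantified with a short sketch, assuming the 100 repetitions are independent and each interval covers the population value with probability 0.95, so that the number of covering intervals is binomial.

```python
from math import comb


def prob_k_cover(k: int, reps: int = 100, level: float = 0.95) -> float:
    """Binomial probability that exactly k of reps independent intervals cover the true value."""
    return comb(reps, k) * level ** k * (1 - level) ** (reps - k)


for k in range(91, 100):
    print(f"P(exactly {k} of 100 intervals cover) = {prob_k_cover(k):.3f}")
```

Under these assumptions, exactly 95 out of 100 happens only about 18% of the time, so "95% of the time" cannot be read literally for any finite number of repetitions.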

 

I would say that a 95% confidence interval is a range of possible values which we estimate to contain the required quantity, calculated so that if we were to repeat the sampling many times, 95% of intervals thus calculated would include the required quantity.  That is strictly a frequentist view, but non-frequentists calculate credible intervals instead and we should keep the two things clear.

 

Martin

 

 

On 30 October 2015 at 12:29, John Sorkin <[log in to unmask]> wrote:

I would appreciate thoughts about the following two descriptions of a 95% CI:

 

For a given parameter X, a 95% CI round X is:

1) A range of values which we can say with 95% confidence contains the true value of the parameter.

2) A range of values constructed such that if an experiment is conducted 100 times, 95% of the time X will lie within the range.

 

I would welcome comments on the above descriptions, and any better descriptions that you might have.

 

Thank you,

John

 

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

 

Confidentiality Statement:

This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.

You may leave the list at any time by sending the command

SIGNOFF allstat

to [log in to unmask], leaving the subject line blank.



 

--

***************************************************
J. Martin Bland
Prof. of Health Statistics Emeritus
Dept. of Health Sciences
Seebohm Rowntree Building
University of York
Heslington
York YO10 5DD

Email: [log in to unmask]
Phone: 01904 321334     Fax: 01904 321382
Web site: http://martinbland.co.uk/

Statement by the University of York:
This email and its attachments may be confidential and are intended solely for the use of the intended recipient. If you are not the intended recipient of this email and its attachments, you must take no action based upon them, nor must you copy or show them to anyone. Please contact the sender if you believe you have received this email in error. Any views or opinions expressed are solely those of the author and do not necessarily represent those of The University of York.
***************************************************



Confidentiality notice: This message, together with any attachment(s), contains information to be considered strictly confidential and is addressed exclusively to the recipient(s) indicated above, who alone is/are authorized to use it, copy it and, under their own responsibility, disseminate it where appropriate. Anyone who receives this message in error, or otherwise reads it without being entitled to do so, is warned that retaining, copying, disclosing or distributing it to persons other than the recipient is strictly prohibited, and is asked to return it immediately to the sender and destroy the original.
Federazione delle Banche di Credito Cooperativo di Puglia e Basilicata Switchboard 080 2205211 Fax 080 2205214