Jean-Michel,

I have been thinking about your question for some time.

If the data are log-normally distributed, and you wish to give some summary
statistics for the distribution, I would usually prefer to quote the median
and quartiles to give a summary of the shape of the distribution.
If the analysis (e.g. a comparison of treatments) was based on comparing the
means of the log-transformed values, then I would quote the geometric mean
rather than the median.

There are circumstances, however, where you need to estimate the arithmetic
mean.  In this case, what is the best estimator of the mean of a log-normal
distribution if you have a sample of n values?
Suppose y ~ N(mu, sigma**2) and z = exp(y).

I do not know what is the 'best' but two obvious estimators are:

(i) the arithmetic mean of the sample:  Sum(z)/n
and
(ii) the maximum likelihood estimate exp(muhat + 0.5*sigmahat**2)
where muhat and sigmahat are the maximum likelihood estimates of the mean and
std of the log-transformed values.
muhat = Sum(y)/n
sigmahat**2 = Sum((y-muhat)**2)/n
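As a concrete illustration of the two estimators defined above, here is a minimal Python sketch. The function name lognormal_mean_estimates is only an illustrative choice, and the input z is assumed to be a non-empty list of strictly positive values.

```python
import math

def lognormal_mean_estimates(z):
    """Return (arithmetic mean, ml estimate) of the mean of z.

    Assumes z is a non-empty sequence of strictly positive numbers.
    """
    n = len(z)
    y = [math.log(v) for v in z]          # log-transformed values
    muhat = sum(y) / n                    # ml estimate of mu
    sigmahat_sq = sum((v - muhat) ** 2 for v in y) / n  # ml estimate of sigma**2 (divisor n)
    arithmetic = sum(z) / n               # estimator (i)
    ml = math.exp(muhat + 0.5 * sigmahat_sq)  # estimator (ii)
    return arithmetic, ml
```

For a sample such as z = [1.0, exp(2)], the logs are 0 and 2, so muhat = 1 and sigmahat**2 = 1, and the ml estimate is exp(1.5).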

Which of these is best?

I have found a formula for the mean-square-error (MSE) of these two estimators.
The ml estimator has smaller MSE for sufficiently large n.
The MSE of the ml estimator is infinite if n < 2*sigma**2.
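The MSE statements above can be checked numerically. The sketch below is not necessarily the formula referred to in this message; it is one derivation from standard normal theory, using the facts that muhat ~ N(mu, sigma**2/n), that n*sigmahat**2/sigma**2 is chi-square with n-1 degrees of freedom and independent of muhat, and that the chi-square moment generating function is (1-2t)**(-k/2) for t < 1/2. All function names are illustrative.

```python
import math

def lognormal_mean(mu, sigma):
    """True mean of z = exp(y), y ~ N(mu, sigma**2)."""
    return math.exp(mu + 0.5 * sigma**2)

def mse_arithmetic(mu, sigma, n):
    """MSE of Sum(z)/n: it is unbiased, so the MSE is Var(z)/n."""
    return (math.exp(sigma**2) - 1.0) * math.exp(2.0 * mu + sigma**2) / n

def ml_moment(mu, sigma, n, k):
    """E[T**k] for T = exp(muhat + 0.5*sigmahat**2), divisor n.

    Returns inf when the chi-square moment generating function diverges.
    """
    shrink = 1.0 - k * sigma**2 / n
    if shrink <= 0.0:
        return math.inf
    normal_part = math.exp(k * mu + k**2 * sigma**2 / (2.0 * n))
    return normal_part * shrink ** (-(n - 1) / 2.0)

def mse_ml(mu, sigma, n):
    """MSE of the ml estimator; infinite when n <= 2*sigma**2."""
    second = ml_moment(mu, sigma, n, 2)
    if math.isinf(second):
        return math.inf
    theta = lognormal_mean(mu, sigma)
    return second - 2.0 * theta * ml_moment(mu, sigma, n, 1) + theta**2
```

Under these assumptions, with mu = 0 and sigma = 1.5, mse_ml is below mse_arithmetic at n = 100, above it at n = 5, and infinite at n = 4.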

The ml estimator is slightly biassed for finite n.
If the usual 'unbiassed' estimate of sigma**2, i.e. Sum((y-muhat)**2)/(n-1), is
used then the bias is worse.
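The effect of the divisor on the bias can be sketched in the same way. The code below assumes normal theory for the log-values (muhat independent of Sum((y-muhat)**2), which is sigma**2 times a chi-square variable with n-1 degrees of freedom); the function names and the divisor argument are illustrative only.

```python
import math

def expected_plugin(mu, sigma, n, divisor):
    """E[exp(muhat + 0.5*S/divisor)] where S = Sum((y-muhat)**2).

    divisor = n gives the ml estimator, divisor = n-1 the 'unbiassed'
    variance version. Returns inf when the expectation diverges.
    """
    shrink = 1.0 - sigma**2 / divisor
    if shrink <= 0.0:
        return math.inf
    return math.exp(mu + sigma**2 / (2.0 * n)) * shrink ** (-(n - 1) / 2.0)

def relative_bias(mu, sigma, n, divisor):
    """Bias of the plug-in estimator as a fraction of the true mean."""
    theta = math.exp(mu + 0.5 * sigma**2)
    return expected_plugin(mu, sigma, n, divisor) / theta - 1.0
```

For example, comparing relative_bias(0.0, 1.0, 20, 20) with relative_bias(0.0, 1.0, 20, 19) shows both are positive, and the n-1 divisor gives the larger value.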

This suggests some more theoretical questions that the allstatters may be
able to shed some light on:
(i) Is there a 'best' estimator for the arithmetic mean?
(ii) Are maximum likelihood estimators always at least as good as other
estimators for sufficiently large n?
(iii) How large does n have to be?

Best wishes

Tim Auton

--
T R Auton PhD MSc C.Math
Head of Biomedical Statistics
Proteus Molecular Design Ltd
Beechfield House
Lyme Green Business Park
Macclesfield
Cheshire SK11 0JL
UK
email: [log in to unmask]

Jean-Michel Lemieux wrote:

> Hello,
>         I have a log-normal distribution and I would like to know which is
> the best estimator of the mean. Is it the arithmetic mean (I don't think
> so), the median, or the mean of the log-transformed data?
>
> Thank you very much
>
> Jean-Michel
> __________________________________________________________________
> Jean-Michel Lemieux
> [log in to unmask]
> *       *       *       *       *       *
> Département de géologie et génie géologique
> Universite Laval
> Québec, Canada
> G1K 7P4






