DIVISION OF STATISTICS AND OPERATIONAL RESEARCH
DEPARTMENT OF MATHEMATICAL SCIENCES
UNIVERSITY OF LIVERPOOL
All are very welcome to attend. Please contact Paula Williamson at this address
for further information.
********************************************************************************
Date: Wednesday 22 March
Time: 2 pm (tea 3-3:30)
Venue: Penthouse, Top Floor, Mathematics and Oceanography Building
Speakers: Murray Aitkin and Irit Aitkin,
Department of Statistics, University of Newcastle UK
Title: Nonparametric maximum likelihood for randomly missing data in regression
Abstract:
Methods for dealing with randomly missing data on explanatory variables in
regression models are now well established. Restricting the analysis to
complete cases is well known to be inefficient and may also give biased
estimates. Schafer (1997) gives an up-to-date discussion.
There are two general non-Bayesian approaches. The first is maximum
likelihood fitting under a specific distribution assumed for all the
variables, response and explanatory. The second is multiple imputation,
which produces multiple "completed" data sets; these are analysed
separately and the results are then combined to give efficient parameter
estimates and correct standard errors. Markov chain Monte Carlo (MC^2)
methods are also possible in a Bayesian approach.
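As an illustration of the combining step in multiple imputation, the
sketch below applies the standard pooling rules (Rubin's rules) to one
parameter. It is a minimal example, not taken from the talk: the function
name pool_estimates and its arguments are hypothetical, and it assumes
each completed data set has already been analysed to give an estimate and
its squared standard error.

```python
from math import sqrt
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool one parameter across m completed-data analyses.

    estimates: the m point estimates, one per completed data set.
    variances: the m squared standard errors from those analyses.
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two analyses of equal count")
    pooled = mean(estimates)          # average of the m estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


if __name__ == "__main__":
    est, se = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(est, se)
```

The between-imputation term is what carries the extra uncertainty due to
the missing values; dropping it would understate the standard error.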
While these methods are well established in theory, they are so far little
used in practice; they present considerable difficulties, both
computational and conceptual.
In this talk we present first a large data set on the continuation of
breast feeding from a study at Curtin University School of Public Health.
Complete case analysis requires the omission of a smoking status variable
which is missing for many of the participants. Maximum likelihood
imputation allows the inclusion of this variable, which has a substantial
effect on the conclusions.
Difficulties with the imputation approach become very severe if there are
many categorical covariates. It is sometimes suggested that these be
treated as normally distributed as an approximation; multivariate normal
(MVN) imputation is much faster.
An alternative to this parametric modelling approach is nonparametric
modelling of the covariate distribution. We describe the EM algorithm for
this approach and compare its performance with complete case and
parametric EM imputation in some simulated one- and two-variable
regression models.
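To make the idea concrete, here is a minimal sketch of one way such an EM
algorithm could look for a single covariate. It is not the speakers'
implementation. It assumes a normal linear regression of y on x, values
of x missing at random and marked by NaN, and a covariate distribution
represented by probability masses on the distinct observed x values. The
function name npml_em and its arguments are hypothetical.

```python
import numpy as np


def npml_em(y, x, n_iter=500, tol=1e-8):
    """EM sketch for y = a + b*x + N(0, s2) noise with some x missing.

    Assumes at least two distinct observed x values and a complete-case
    fit that is not exact (so the starting s2 is positive).
    Returns (a, b, s2, support, pi).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    seen = ~np.isnan(x)
    y_obs, x_obs, y_mis = y[seen], x[seen], y[~seen]

    # Mass points: the distinct observed x values and their frequencies.
    support, counts = np.unique(x_obs, return_counts=True)
    pi = counts / counts.sum()

    # Starting values from the complete-case least squares fit.
    b, a = np.polyfit(x_obs, y_obs, 1)
    s2 = np.mean((y_obs - a - b * x_obs) ** 2)

    # Expanded data: each incomplete case is paired with every mass point.
    xx = np.concatenate([x_obs, np.tile(support, len(y_mis))])
    yy = np.concatenate([y_obs, np.repeat(y_mis, len(support))])

    for _ in range(n_iter):
        # E-step: posterior probability of each mass point for each
        # incomplete case, computed on the log scale for stability.
        resid = y_mis[:, None] - a - b * support[None, :]
        logw = np.log(pi)[None, :] - 0.5 * resid ** 2 / s2
        logw -= logw.max(axis=1, keepdims=True)
        w = np.exp(logw)
        w /= w.sum(axis=1, keepdims=True)

        # M-step for the masses: observed counts plus expected counts.
        pi = (counts + w.sum(axis=0)) / len(y)

        # M-step for the regression: weighted least squares on the
        # expanded data (weight 1 for complete cases).
        ww = np.concatenate([np.ones(len(y_obs)), w.ravel()])
        x_bar = np.sum(ww * xx) / ww.sum()
        y_bar = np.sum(ww * yy) / ww.sum()
        b_new = (np.sum(ww * (xx - x_bar) * (yy - y_bar))
                 / np.sum(ww * (xx - x_bar) ** 2))
        a_new = y_bar - b_new * x_bar
        s2_new = np.sum(ww * (yy - a_new - b_new * xx) ** 2) / ww.sum()

        change = max(abs(a_new - a), abs(b_new - b), abs(s2_new - s2))
        a, b, s2 = a_new, b_new, s2_new
        if change < tol:
            break

    return a, b, s2, support, pi
```

With no missing values the incomplete-case block is empty and the fit
reduces to ordinary least squares, which gives a simple check on the code.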
Reference: J.L. Schafer (1997) Analysis of Incomplete Multivariate Data,
Chapman and Hall.
********************************************************************************