UNIVERSITY OF ST ANDREWS
Statistics Seminars - Candlemas Semester (Term 3) 1999
____________________________________________________________
MONDAY 26 APRIL at 4 p.m.
Professor Frank BALL (University of Nottingham)
"MCMC for hidden continuous time Markov chains"
MONDAY 10 MAY at 4 p.m.
Mr Ian MATÉ (General Register Office, Edinburgh)
"The One Number Census: practical survey and theoretical problems"
WEDNESDAY 12 MAY at 3.15 p.m.
Dr Allan GORDON (University of St Andrews)
"Data mining: new name for old activity"
WEDNESDAY 12 MAY at 4.45 p.m.
Professor Fionn MURTAGH (Queen's University of Belfast)
"Constant-Time Clustering for High-Dimensional Data"
____________________________________________________________
All the seminars will be held in Lecture Theatre B of the Mathematical
Institute. Tea will be available from 3.40 p.m. on April 26 and May 10,
and between the talks on May 12. Visitors will be very welcome.
Further information from:
Dr I B J Goudie email: [log in to unmask]
____________________________________________________________
THE MEETING ON 12 MAY IS A JOINT MEETING WITH THE HIGHLANDS GROUP OF THE
ROYAL STATISTICAL SOCIETY.
The meeting will be followed by a meal. The arrangements for the meal are
available from Ian Goudie (e-mail address as above), whom you should notify
by Friday 7 May if you wish to attend.
____________________________________________________________
SEMINAR ABSTRACTS
Professor Frank BALL (University of Nottingham)*
"MCMC for hidden continuous time Markov chains"
Hidden Markov models have proved to be a very flexible class of
models, with many and diverse applications. Recently Markov chain Monte
Carlo (MCMC) techniques have provided powerful computational tools to make
inferences about the parameters of hidden Markov models, and about the
unobserved Markov chain, when the chain is in discrete time. In this talk,
I present a general algorithm, based on reversible jump MCMC, for inference
in hidden Markov models where the unobserved chain runs in continuous time.
The method is illustrated with two examples. One is a relatively simple
application to Markov modulated Poisson processes. The second is a more
complex problem of inference from single ion channel data, and serves to
demonstrate the power and flexibility of the algorithm.
* Based on joint work with Tony O'Hagan, Yuzhi Cai and Jay Kadane.
Mr Ian MATÉ (General Register Office, Edinburgh)
"The One Number Census: practical survey and theoretical problems"
In 1991, it was estimated that the UK Census missed about 1.2
million people. Of particular concern was the fact that undercoverage was
not evenly distributed - in some inner city areas it has been estimated
that up to 20% of the young adult male population was missed. A follow-up
survey was designed to measure the quality of Census answers and
undercoverage. However, it did not find the missing million. Therefore, a
specific coverage survey is planned for 2001 to measure undercoverage and
will involve the re-enumeration of everybody in about 2,500 postcodes in
Scotland. The results from the survey will be used to impute missing people
into households, to impute whole households, and to place these missed people
in specific postcodes. These people and households will be added to the
Census database to produce a One Number Census (ONC).
The seminar will look at the problems of sampling and survey
practicalities, and follow the path from data capture to imputing missing
people in postcodes not surveyed.
Dr Allan GORDON (University of St Andrews)
"Data mining: new name for old activity"
With the growth in the size of data sets that are recorded and
stored electronically, there is a need to extract and summarize information
from large and complex multivariate data sets. Various names have been used
to describe a miscellaneous collection of procedures that address this
problem, a recent one being 'data mining'. Similar concerns have motivated
earlier work in 'classification', where this term has been taken to refer
both to the construction of classes of similar 'objects' and to the
assignment of objects to existing classes. Some relevant methodology is
described.
Professor Fionn MURTAGH (Queen's University of Belfast)**
"Constant-Time Clustering for High-Dimensional Data"
We extend recent results on constant-time clustering algorithms to
a new problem area, that of clustering data in high-dimensional data
spaces. We overcome the "curse of dimensionality" in such problems by (i)
using some canonical ordering of observation and variable (document and
term) dimensions in our data, (ii) applying a wavelet transform to such
canonically ordered data, (iii) modeling the noise in wavelet space, (iv)
defining significant component parts of the data as opposed to
insignificant or noisy component parts, and (v) reading off the resultant
clusters. The overall complexity of this innovative approach is linear in
the data dimensionality. We describe a number of examples and test cases,
including the clustering of high-dimensional hypertext data.
** Based on joint work with Jean-Luc Starck and Michael W. Berry.