UNIVERSITY OF GLASGOW
STATISTICS SEMINAR PROGRAMME
Wednesday, 20th January, 3pm
Bayesian analysis of variance with finite mixtures
Agostino NOBILE (University of Glasgow)
Wednesday, 10th February, 3pm
Fitting intractable generative models
Geoffrey HINTON (University College London)
Wednesday, 3rd March, 3pm
Probabilistic methods in search design
Anatoly ZHIGLJAVSKY (Cardiff University)
Wednesday, 17th March, 3pm
A Robust Normalizing Transformation for Interrater Reliability
with Applications regarding the Disambiguation of Literary Texts
by Humans
Kevin J. KEEN (University of Manitoba, Canada)
Seminars take place in Room 1f(203), Mathematics Building,
University of Glasgow
For further information please contact the seminar organiser:
Ilya Molchanov
University of Glasgow : e-mail: [log in to unmask]
Department of Statistics : Ph.: + 44 141 330 5141
Glasgow G12 8QW : Fax: + 44 141 330 4814
Scotland, U.K. : http://www.stats.gla.ac.uk/~ilya/
ABSTRACTS
BAYESIAN ANALYSIS OF VARIANCE WITH FINITE MIXTURES
This talk will describe an approach to Bayesian analysis of variance
which uses finite mixture distributions to model the main effects and
interactions. This allows both estimation and an analogue of hypothesis
testing in a posterior analysis using a single prior specification.
A detailed formulation will be provided for the case of the two-way model
with replications, allowing interactions.
Issues in the formulation of the prior distribution will be examined and
some examples will illustrate implementation, presentation of posterior
distributions, sensitivity, and performance of the MCMC methods used.
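The abstract does not give the talk's exact prior specification; as a hedged illustration, the hypothesis-testing analogue can be pictured with a two-component mixture prior on an effect: a narrow component concentrated near zero plays the role of the null, a diffuse component plays the alternative, and the posterior weight on the narrow component measures support for a null effect. A minimal sketch (all component parameters and function names are hypothetical choices, not from the talk):

```python
import math
import random

def mixture_prior_draw(w_null=0.5, sd_null=0.01, sd_alt=1.0):
    """Draw an effect from a two-component normal mixture prior:
    with probability w_null from a narrow, near-zero ('null')
    component, otherwise from a diffuse ('alternative') one.
    Illustrative only; the talk's specification may differ."""
    sd = sd_null if random.random() < w_null else sd_alt
    return random.gauss(0.0, sd)

def mixture_prior_density(x, w_null=0.5, sd_null=0.01, sd_alt=1.0):
    """Prior density of the same two-component mixture at x."""
    def normal_pdf(x, sd):
        return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
    return w_null * normal_pdf(x, sd_null) + (1 - w_null) * normal_pdf(x, sd_alt)
```

With a prior of this shape, MCMC output for each effect's component indicator gives a posterior probability that the effect is practically null, which is the sense in which a single prior specification supports both estimation and an analogue of hypothesis testing.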
FITTING INTRACTABLE GENERATIVE MODELS
It is relatively easy to fit a generative model to observed data when
the model is so simple that it is easy to take into consideration all
the possible explanations of each observed data vector. Factor
analysis, linear dynamical systems, mixtures of Gaussians, and Hidden
Markov Models are all examples of tractable generative models.
Unfortunately, the models that are biologically relevant have multiple
simultaneously active non-linear hidden neurons, and for models of
this type it is intractable to compute the posterior probability
distribution over all possible explanations when given an observed
data vector. As a result it is impossible to adjust the connection
strengths in a way that is guaranteed to increase the likelihood of
generating the observed data.
Fortunately, there is a more subtle objective function for learning
and this new objective function can be improved even when the true
posterior distribution over explanations cannot be computed. The new
objective function is the log likelihood of generating the observed
data penalized by the divergence between the true posterior
distribution over explanations and a tractable approximation to this
distribution. Although both terms in this objective function are
intractable their difference is tractable.
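The identity behind this claim can be checked exactly in a toy model small enough to enumerate: the variational objective equals log p(x) minus the divergence from the approximation to the true posterior, so it lower-bounds the log likelihood and is tight when the approximation is exact. A minimal sketch with hypothetical toy numbers (not from the talk):

```python
import math

# Toy generative model: one binary hidden cause h, one binary observation x.
p_h = [0.6, 0.4]              # prior over hidden causes (hypothetical numbers)
p_x_given_h = [[0.8, 0.2],    # p(x | h = 0)
               [0.3, 0.7]]    # p(x | h = 1)

def log_evidence(x):
    """Exact log p(x), tractable here because h takes only two values."""
    return math.log(sum(p_h[h] * p_x_given_h[h][x] for h in (0, 1)))

def variational_objective(q, x):
    """E_q[log p(x, h)] + H(q), which equals log p(x) - KL(q || p(h|x))."""
    return sum(q[h] * (math.log(p_h[h] * p_x_given_h[h][x]) - math.log(q[h]))
               for h in (0, 1) if q[h] > 0)

# True posterior over h given x = 1, by Bayes' rule.
x = 1
joint = [p_h[h] * p_x_given_h[h][x] for h in (0, 1)]
posterior = [j / sum(joint) for j in joint]
```

Evaluating `variational_objective(posterior, 1)` reproduces `log_evidence(1)` exactly, while any other q (for instance the uniform distribution) gives a strictly smaller value. In the intractable neural models above, neither term of the penalized likelihood can be computed on its own, but the objective itself can still be improved.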
PROBABILISTIC METHODS IN SEARCH DESIGN
A general search problem can be thought of as the problem of finding
or approximating a target, that is, a functional of unknown
parameters, on the basis of certain error-free observations at a number
of design points. The design problem lies in a clever selection of the
test/observation points.
Using a number of examples, we demonstrate the usefulness of the
probabilistic methodology in (i) the construction of efficient designs and
(ii) establishing the existence of optimal designs. We often concentrate
on the methodology based on computing entropies of the partitions
generated by the search designs. We argue that the use of certain
Renyi entropies may be preferable to the use of the famous Shannon
entropy.
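For concreteness, the two entropies compared above can be computed from the cell probabilities of a partition as follows (a sketch with hypothetical function names; the talk's partition constructions are not reproduced here):

```python
import math

def shannon_entropy(p):
    """Shannon entropy of a probability vector, e.g. the probability
    masses of the cells of a partition generated by a search design."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1); it
    tends to the Shannon entropy as alpha -> 1."""
    return math.log(sum(pi ** alpha for pi in p)) / (1 - alpha)
```

Both entropies are maximised, at log n, by the uniform partition into n cells; they differ in how strongly they penalise unequal cells, which is why a Renyi order other than 1 can behave differently as a design criterion.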
The first part of the talk deals with nonsequential designs. We
illustrate how the probabilistic method works by establishing upper
bounds on the length of optimal combinatorial group testing designs.
We also consider the problem of optimal design for the estimation of
integrals in certain parametric classes of functions. The second part
of the talk is devoted to sequential algorithms. It is shown that many
converging deterministic algorithms exhibit chaotic behaviour once
a suitable renormalization is applied at every iteration. The
probabilistic criteria can then be used to characterise the rate of
convergence and other asymptotic properties of the algorithms.
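The probabilistic upper bounds from the talk are not reproduced here, but the counting argument they are measured against is short: each yes/no group test yields at most one bit of information, so the number of tests must be at least the base-2 logarithm of the number of candidate defective sets. A sketch (hypothetical function name):

```python
import math

def group_testing_lower_bound(n, d):
    """Information-theoretic lower bound on the number of yes/no group
    tests needed to identify up to d defectives among n items: the
    design must distinguish every candidate defective set, and each
    test outcome carries at most one bit."""
    candidates = sum(math.comb(n, i) for i in range(d + 1))
    return math.ceil(math.log2(candidates))
```

For example, locating at most one defective among 100 items needs at least 7 tests, since log2(101) is just under 7; upper bounds like those discussed in the talk bracket the optimal design length from the other side.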
The major part of the talk is based on joint work with
H. P. Wynn and L. Pronzato.
A ROBUST NORMALIZING TRANSFORMATION FOR INTERRATER RELIABILITY WITH
APPLICATIONS REGARDING THE DISAMBIGUATION OF LITERARY TEXTS BY HUMANS
A robust variance-stabilizing transformation of the intraclass
correlation estimator of interrater reliability is derived in the case
of either continuous or dichotomous measures. The mathematical
relationship between this variance-stabilizing transformation and
Cronbach's alpha is established. Monte Carlo simulations are reported
that show this transformation also to be normalizing in small samples
for either continuous or dichotomous measures. The proposed
methodology is illustrated in the context of disambiguation of
literary texts. Computer string searches can produce lists of words
potentially related to a chosen literary theme, or so-called semantic
field. But the fact that words have multiple significations means
that potential allusions and real ones need not coincide.
The only practical approach is the disambiguation of allusion by human
informants. But one can be legitimately concerned about the
reliability of such a process. The semantic field of solitude or
loneliness is important from both sociological and psychological
perspectives. Nine French novels from the twentieth century written
in the first person have been chosen for an analysis of the
reliability of disambiguation of allusions to human solitude.
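The robust transformation derived in the talk is not reproduced in this abstract; as a hedged orientation, the sketch below computes the classical one-way intraclass correlation from a subjects-by-raters table and applies Fisher's z, the textbook variance-stabilizing transformation for a correlation coefficient (both are stand-ins, with hypothetical function names):

```python
import math

def icc_oneway(ratings):
    """Classical one-way random-effects intraclass correlation, ICC(1),
    from a table ratings[subject][rater] (not the talk's robust
    estimator):  ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - row_means[i]) ** 2
              for i, row in enumerate(ratings) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

def fisher_z(r):
    """Fisher's z transformation, 0.5 * log((1 + r) / (1 - r)):
    the classical variance-stabilizing map for a correlation,
    standing in for the robust transformation of the talk."""
    return 0.5 * math.log((1 + r) / (1 - r))
```

Raters in perfect agreement give ICC(1) = 1, while systematic disagreement drives it towards negative values; working on a transformed scale of this kind is what makes approximate normality attainable in small samples, which is the kind of property the talk's Monte Carlo study examines for its robust version.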