Applications are invited for a studentship for an EngD in
Genetic Programming Evolutionary Data Visualisation Tools in
association with GlaxoSmithKline (Harlow) concerned with the analysis
and visualisation of the evolution of genetic programs and genetic
programming populations. Applicants must have at least an upper
second class degree in Computer Science or a related discipline,
good programming and data visualisation skills and an enthusiasm
for evolutionary computing.
For further details please contact Dr. W. B. Langdon.
To apply, send your CV to Dr. W. B. Langdon or
email a PDF or PostScript file to [log in to unmask]
W. B. Langdon, Phone +44 20 7679 4436
Computer Science, Fax +44 20 7387 1397
University College, London,
Gower Street,
London, WC1E 6BT, UK
http://www.cs.ucl.ac.uk/staff/W.Langdon
EuroGP Ireland 3-5 April 2002 http://evonet.dcs.napier.ac.uk/eurogp2002
GECCO New York 9-13 July 2002 http://www.isgec.org/GECCO-2002
GP+EM Journal http://www.wkap.nl/journalhome.htm/1389-2576
----------------------------------------------------------------------
Summary of Project
Visualisation tools are needed to supplement formal analysis of
genetic programming, to aid the understanding of evolutionary
techniques in particular and data analysis in general, with a view to
building better algorithms, aiding data analysis and easing the
introduction of humans "in the loop" to interactively guide
GP. This would allow principled "steering" of evolutionary algorithms.
The project will demonstrate tools to study the object being evolved.
In the case of genetic programming this is the formula or data model.
Typically in GP this is represented as a tree. Therefore existing graphical
tools to display trees will be enhanced to display:
0. The syntax of the model
1. The operational semantics of the model
This will include a "fly through" of the model, with different compounds
being modelled, using existing virtual reality tools, e.g. based upon VRML.
Various flight paths will be offered, including following the data flow
through the model for a) a user specified chemical, b) average
behaviours across the training (or other) dataset.
In some applications it may be advantageous to also be able to display
c) the variability of the model across the dataset.
Stacking the model on top of itself, using colour,
intensity or depth cues, will be investigated for iterative models and
for display of data flow across datasets.
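
As a rough illustration of items a) to c) above, the following Python sketch
records the value produced at every node of a tree-shaped model for each
fitness case, then summarises each node by its mean and standard deviation
across the dataset. The tuple-based tree encoding, the operator table and the
names `evaluate` and `node_statistics` are assumptions made for this example
only; they are not part of any existing GP tool mentioned in this
announcement.

```python
import statistics

# Hypothetical tuple-based GP tree: (operator, child, child) or a leaf.
# A leaf is either a variable name (str) or a numeric constant.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def evaluate(node, case, trace, path=()):
    """Evaluate `node` on one fitness case, recording each node's output.

    `trace` maps a node's path (tuple of child indices from the root)
    to the list of values that node produced, one per fitness case.
    """
    if isinstance(node, tuple):
        op, *children = node
        args = [evaluate(child, case, trace, path + (i,))
                for i, child in enumerate(children)]
        value = OPS[op](*args)
    elif isinstance(node, str):
        value = case[node]      # look up an input variable
    else:
        value = node            # numeric constant
    trace.setdefault(path, []).append(value)
    return value

def node_statistics(tree, dataset):
    """Return {path: (mean, population std dev)} over the dataset."""
    trace = {}
    for case in dataset:
        evaluate(tree, case, trace)
    return {path: (statistics.fmean(values), statistics.pstdev(values))
            for path, values in trace.items()}

# Toy model x*y + 1 and a two-case dataset (both invented for illustration).
tree = ("+", ("*", "x", "y"), 1)
dataset = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]
stats = node_statistics(tree, dataset)
for path, (mean, spread) in sorted(stats.items()):
    print(path, mean, spread)
```

A display tool could then map each node's mean to, say, colour and its
standard deviation to intensity; that mapping is left open here.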
2. Experiments will be made using different mechanisms to highlight
bottlenecks and regions of good or poor performance, e.g. using colour,
size or texture of various icons. Sound offers another
dimension. The goal is not only to build tools: by concentrating early on
upon the human factors involved, we will ensure the tools can be
readily used by non-experts.
3. These represent investigations of data and models at one instant
in time during the evolutionary algorithm. They
and additional mechanisms will be used as aids to the analysis of the
evolutionary processes (i.e. improvement of the model) leading to
improved models. Of particular interest are the mechanisms leading to
the evolution of improvements, overfitting and redundancy within
models. Such an analysis tool may lead to better understanding and so
to improved evolutionary methods.
4. A common feature of evolutionary techniques is their use of a
population of models. Various untested assumptions about GA/GP
populations are common. Our techniques to display variation of model
behaviour across datasets and across time will be readily adapted to
display variation (and similarity) across the population. This will
cover the size of the models, their syntax and operational semantics,
degree of population convergence and variability. It will also cover
tracking and displaying genetic inheritance, particularly following the
inheritance of genes, their fitness, and their aggregations and
disassociations by genetic operations (crossover and mutation). This will
be related to modern schema fitness and effective fitness metrics
[Langdon and Poli, 2001], genic variance and covariance [Price, 1970] and
testing assumptions of population variability and convergence.
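
To make the covariance idea concrete, here is a minimal Python sketch of the
selection term of Price's equation, Cov(z, q) / mean(z), where z is the
number of offspring of an individual and q is some measured quantity, such as
the frequency of a given primitive in that individual. It assumes offspring
copy their parent's q exactly, so crossover and mutation effects are ignored.
The function names and the toy population are hypothetical and chosen only
for illustration.

```python
from statistics import fmean

def price_selection_term(offspring, trait):
    """Cov(z, q) / mean(z): the selection term of Price's equation.

    offspring[i] is the number of children of individual i (z_i) and
    trait[i] is the measured quantity for that individual (q_i).
    """
    z_bar = fmean(offspring)
    q_bar = fmean(trait)
    cov = fmean([(z - z_bar) * (q - q_bar)
                 for z, q in zip(offspring, trait)])
    return cov / z_bar

def observed_change(offspring, trait):
    """Change in mean trait if every child copies its parent's trait."""
    child_mean = (sum(z * q for z, q in zip(offspring, trait))
                  / sum(offspring))
    return child_mean - fmean(trait)

# Toy population of four individuals (invented for illustration).
trait = [0.0, 0.5, 1.0, 0.5]   # q_i, e.g. frequency of one primitive
offspring = [0, 1, 2, 1]       # z_i, number of children

predicted = price_selection_term(offspring, trait)
observed = observed_change(offspring, trait)
print(predicted, observed)
```

Under the faithful-copy assumption the two numbers agree; a display of the
gap between them in a real run would indicate how much the genetic operators
themselves are changing the quantity being tracked.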