RSS Merseyside Local Group Meeting – Machine Learning
The next RSS Merseyside Local Group meeting will take place on Wednesday 17th October at 14.00 in Room 107, Brodie Tower, University of Liverpool Campus (building 233, grid square C8, see https://www.liverpool.ac.uk/media/livacuk/english/liverpool-university-campus-map.pdf).
We welcome Prof. Chris Williams from the University of Edinburgh, and Juhi Gupta and Prof. Simon Maskell from the University of Liverpool to talk about Machine Learning.
14.00-14.45 – Professor Chris Williams (University of Edinburgh): "Machine Learning, and Vision as Inverse Graphics"
I will start with a brief discussion of types of machine learning (ML) problems, and similarities & differences between ML and statistics. I will then discuss our work on Vision-as-Inverse-Graphics: Obtaining a Rich 3D Explanation of a Scene from a Single Image (Romaszko, Williams, Moreno and Kohli, 2017). This includes the formulation of a vision-as-inverse-graphics problem, the use of neural network models to predict the object and global scene variables, and the accuracy of the resulting predictions. If there is time I will also describe our work on a Hierarchical Switching Linear Dynamical System Applied to the Detection of Sepsis in Neonatal Condition Monitoring (Stanculescu, Williams and Freer, UAI 2014).
14.45-15.15 – Juhi Gupta (University of Liverpool): "Using Machine Learning techniques in Preterm Birth Prediction"
Introduction: Preterm birth (PTB), birth <37 weeks, is a multi-factorial condition and one of the biggest causes of infant mortality. Surviving infants are often affected by severe conditions such as motor impairment or neurological disorders. This study aims to identify early biomarkers for prediction of spontaneous preterm birth (sPTB) and preterm prelabour rupture of the membranes (PPROM, breaking of the amniotic membrane) using machine learning techniques. Methods: Pregnant patients were recruited at 16 and 20 weeks of gestation at the PTB Prevention clinic, Liverpool Women’s Hospital. RNA samples were run on the Clariom D Array (ThermoFisher). Patients were categorised into phenotypes: sPTB, PPROM and term deliveries (>37 weeks).
After QC of the array data, Random Forest analysis was carried out with 10,000 trees. For gene-set enrichment analysis, 173 genes were uploaded to Functional Mapping and Annotation (FUMA) for pathway identification. The Randomised Dependence Coefficient (RDC) algorithm was implemented to generate hierarchical clustering of the identified transcripts. Results: Enrichment analysis of the genes identified the selenoamino acid metabolism pathway with 3 significant genes (CTH, LCMT1, TRMT11) as differentially expressed across the different clinical groups, p = 1.5e-3 (adjusted for multiple testing). Hierarchical clustering also showed the grouping between the different clinical groups. Discussion: Selenium may play a role in initiating early labour. Studies have previously reported a role of selenium as a genetic cause of PTB. Further studies will integrate these findings with other omics data, including genomics and metabolomics. These methods could be applied to other complex disorders.
15.15-15.30 – Refreshments
15.30-16.15 – Professor Simon Maskell (University of Liverpool): "Big Hypotheses: a generic tool for fast Bayesian Machine Learning"
There are many machine learning tasks that would ideally involve global optimisation across some parameter space. Statisticians often pose such problems in terms of sampling from the distribution and favour Markov Chain Monte Carlo (MCMC) or its derivatives (e.g., Gibbs sampling, Hamiltonian Monte Carlo (HMC) and simulated annealing). While these techniques can offer good results, they are slow. We describe an alternative numerical Bayesian algorithm, the Sequential Monte Carlo (SMC) sampler. SMC samplers are closely related to particle filters and are reminiscent of genetic algorithms. More specifically, an SMC sampler replaces the single Markov chain considered by MCMC with a population of samples. The inherent parallelism makes the SMC sampler a promising starting point for developing a scalable Bayesian global optimiser, e.g., one that runs 86,400 times faster than MCMC and could potentially be 86,400 times more computationally efficient. The University of Liverpool and STFC’s Hartree Centre have recently started working on a £2.5M EPSRC-funded project (with significant support from IBM, NVIDIA, Intel and Atos) to develop SMC samplers into a general-purpose scalable numerical Bayesian optimiser and embody them as a back-end in the software package Stan. This talk will summarise recent developments, initial results (in a subset of problems posed by AstraZeneca, AWE, Dstl, Unilever, physicists, chemists, biologists and psychologists) and planned work over the next 5 years towards developing a high-performance parallel Bayesian inference implementation that can be used for a wide range of problems relevant to statisticians working in a range of application domains.
The event is open to all, but as ever, please register in advance so that we can organise refreshments. For more information, and for the link to register, see https://sites.google.com/site/rssmerseyside/research-meetings/machine-learning
Many thanks
Maria Sudell
Secretary, RSS Merseyside Local Group