Statistics, Applied Probability & Operational Research Seminars
Programme: Hilary Term 2012
Venue: Lecture room, Department of Statistics, 1 South Parks Road, Oxford OX1 3TG - Thursdays, 2.15 pm. Each seminar will be followed by the Graduate Lecture series starting at 3.45 pm.
Tea, coffee and biscuits are served after the seminar in the Statistics Common Room.
For more information on the Department seminar series please contact Professor Arnaud Doucet <mailto:[log in to unmask]>
ALL WELCOME
________________________________
Week 1 - January 19th
Speaker: Professor Jean-Philippe Vert (Mines ParisTech and Institut Curie)<http://cbio.ensmp.fr/~jvert/>
Title: Group lasso for genomic data
Abstract: The group lasso is an extension of the popular lasso regression method which allows predefined groups of features to be selected jointly in the context of regression or supervised classification. I will discuss two extensions of the group lasso, motivated by applications in genomic data analysis. First, I will present a new fast method for multiple change-point detection in multidimensional signals, which boils down to a group lasso regression problem and allows the detection of frequent breakpoint locations in DNA copy number profiles with millions of probes. Second, I will discuss the latent group lasso, an extension of the group lasso to overlapping groups, which enjoys interesting consistency properties and can be helpful for structured feature selection in high-dimensional gene expression data analysis for cancer prognosis. (Joint work with Kevin Bleakley, Guillaume Obozinski and Laurent Jacob)
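For readers unfamiliar with the method, here is a minimal sketch of group lasso estimation by proximal gradient descent; the penalty level, step size rule and group structure are illustrative assumptions, not taken from the talk.

    import numpy as np

    def block_soft_threshold(w, groups, t):
        # Proximal operator of the group lasso penalty: shrink each group's
        # coefficient vector towards zero; groups with norm below t vanish.
        w = w.copy()
        for g in groups:
            norm = np.linalg.norm(w[g])
            w[g] = w[g] * max(0.0, 1.0 - t / norm) if norm > 0 else 0.0
        return w

    def group_lasso(X, y, groups, lam=1.0, n_iter=500):
        # Proximal gradient descent on 0.5 * ||y - X w||^2 + lam * sum_g ||w_g||_2.
        w = np.zeros(X.shape[1])
        step = 1.0 / np.linalg.norm(X, ord=2) ** 2  # inverse Lipschitz constant
        for _ in range(n_iter):
            w = block_soft_threshold(w - step * (X.T @ (X @ w - y)), groups, lam * step)
        return w

With groups such as [np.arange(0, 5), np.arange(5, 10)], each block of coefficients is selected or discarded as a unit, which is the joint selection behaviour the abstract describes.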
Week 2 - January 26th
Speaker: Dr. Richard Wilkinson (University of Nottingham) <http://www.nottingham.ac.uk/mathematics/people/r.d.wilkinson>
Title: An alternative approach to approximate Bayesian computation (ABC): what and why.
Abstract: One of the most pressing challenges for statistics is how to perform statistical inference when your model is a complex simulator which can be simulated from, but for which inference is mathematically intractable. Key challenges include estimating unknown parameter values from data, and dealing with the fact that your simulator inevitably contains errors (all models are wrong, etc.).
Approximate Bayesian computation (ABC) methods have been proposed to deal with the first problem, particularly when the simulator is cheap to evaluate. In this talk I will describe how we can modify ABC algorithms to accommodate beliefs about the second problem, and argue that this viewpoint gives a more appropriate way of thinking about ABC algorithms and can help us interpret the results from ABC. I shall also draw links to the recent RSS read paper by Fearnhead and Prangle.
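As background, here is a minimal rejection-ABC sketch in Python; the Gaussian toy simulator, uniform prior and tolerance are illustrative assumptions, not taken from the talk.

    import numpy as np

    def abc_rejection(s_obs, simulate, sample_prior, distance, eps, n_draws=100000):
        # Keep the parameter draws whose simulated summary statistic
        # lands within eps of the observed summary.
        accepted = []
        for _ in range(n_draws):
            theta = sample_prior()
            if distance(simulate(theta), s_obs) < eps:
                accepted.append(theta)
        return np.array(accepted)

    rng = np.random.default_rng(0)
    posterior_draws = abc_rejection(
        s_obs=2.0,  # observed sample mean (toy value)
        simulate=lambda th: rng.normal(th, 1.0, size=100).mean(),
        sample_prior=lambda: rng.uniform(-5.0, 5.0),
        distance=lambda a, b: abs(a - b),
        eps=0.1,
    )

In this notation, the talk's viewpoint amounts to reading the tolerance eps not merely as a numerical approximation, but as part of an explicit model of simulator error.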
Week 3 - February 2nd
Speaker: Dr. Yee Whye Teh (UCL)<http://www.gatsby.ucl.ac.uk/~ywteh/>
Title: Hierarchical Bayesian Models of Sequential Data
Abstract: In this talk I will present a novel approach to modelling sequence data called the sequence memoizer. As opposed to most other sequence models, our model does not make any Markovian assumptions. Instead, we use a hierarchical Bayesian approach which allows effective sharing of statistical strength across the different parts of the model. To make computations with the model efficient, and to better model the power-law statistics often observed in sequence data arising from data-driven linguistics applications, we use Bayesian nonparametric priors called Pitman-Yor processes as building blocks in the hierarchical model. We show state-of-the-art results on language modelling and text compression.
This is joint work with Frank Wood, Jan Gasthaus, Cedric Archambeau and Lancelot James, and is based on work most recently reported in the Communications of the ACM (Feb 2011 issue).
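By way of illustration, here is a minimal sketch of sampling cluster sizes from a Pitman-Yor process via its Chinese restaurant representation; the discount and concentration values are illustrative assumptions. With discount d > 0, the table occupancies exhibit the power-law behaviour the abstract refers to.

    import numpy as np

    def pitman_yor_crp(n, d=0.5, alpha=1.0, seed=0):
        # Seat n customers one by one: an occupied table with c_k customers
        # attracts the next customer with probability proportional to (c_k - d),
        # and a new table opens with probability proportional to (alpha + d * K),
        # where K is the current number of tables.
        rng = np.random.default_rng(seed)
        counts = []
        for i in range(n):
            p = np.array([c - d for c in counts] + [alpha + d * len(counts)])
            k = rng.choice(len(p), p=p / (i + alpha))
            if k == len(counts):
                counts.append(1)   # open a new table
            else:
                counts[k] += 1
        return counts  # table sizes; heavy-tailed when d > 0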
Week 4 - February 9th: seminar cancelled
Week 5 - February 16th
Speaker: Dr. Arthur Gretton (UCL) <http://www.gatsby.ucl.ac.uk/~gretton/>
Title: Hypothesis Testing and Bayesian Inference: New Applications of Kernel Methods
Abstract: In the early days of kernel machines research, the "kernel trick" was considered a useful way of constructing nonlinear learning algorithms from linear ones, by applying the linear algorithms to feature space mappings of the original data. More recently, it has become clear that a potentially more far-reaching use of kernels is as a linear way of dealing with higher-order statistics, by mapping probabilities to a suitable reproducing kernel Hilbert space (i.e., the feature space is an RKHS).
I will describe how probabilities can be mapped to kernel feature spaces, and how to compute distances between these mappings. A measure of the strength of dependence between two random variables follows naturally from this distance. Applications that make use of kernel probability embeddings include the following (a code sketch of the underlying two-sample statistic appears after the list):
* Nonparametric two-sample testing and independence testing in complex (high dimensional) domains. In the latter case, we test whether text in English is translated from the French, as opposed to being random extracts on the same topic.
* Inference on graphical models, in cases where the variable interactions are modelled nonparametrically (i.e., when parametric models are impractical or unknown).
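Here is a minimal sketch of the distance between kernel mean embeddings described above, using a Gaussian RBF kernel; the kernel choice and bandwidth are illustrative assumptions.

    import numpy as np

    def rbf_gram(X, Y, sigma=1.0):
        # Gaussian RBF kernel matrix between the rows of X and Y.
        sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq / (2.0 * sigma ** 2))

    def mmd2(X, Y, sigma=1.0):
        # Biased estimate of the squared maximum mean discrepancy (MMD):
        # the RKHS distance between the empirical kernel mean embeddings
        # of the two samples.
        return (rbf_gram(X, X, sigma).mean()
                + rbf_gram(Y, Y, sigma).mean()
                - 2.0 * rbf_gram(X, Y, sigma).mean())

A permutation test on mmd2 under shuffled sample labels yields the nonparametric two-sample test; the dependence measure arises from the same distance, applied between a joint distribution and the product of its marginals.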
Week 6 - February 23rd
Speaker: Professor Christian Robert (Université Paris Dauphine & ENSAE) <http://www.ceremade.dauphine.fr/~xian/>
Title: Approximate Bayesian Computation for model selection
Abstract: Approximate Bayesian computation (ABC), also known as likelihood-free methods, has become a standard tool for the analysis of complex models, primarily in population genetics but also for complex financial models.
The development of new ABC methodology has grown rapidly in the past few years, as shown by multiple publications, conferences and even software packages. While one valid interpretation of ABC-based estimation is connected with nonparametrics, the setting is quite different for model choice issues. We examined in Grelaud et al. (2009) the use of ABC for Bayesian model choice in the specific case of Gibbs random fields (GRF), relying on a sufficiency property to show that the approach was legitimate.
Despite having previously suggested the use of ABC for model choice in a wider range of models in the DIYABC software (Cornuet et al., 2008), we present in Robert et al. (PNAS, 2011) theoretical evidence that the general use of ABC for model choice is fraught with danger, in the sense that no amount of computation, however large, can guarantee a proper approximation of the posterior probabilities of the models under comparison. In a more recent work (Marin et al., 2011), we expand on this warning to derive necessary and sufficient conditions on the choice of summary statistics for ABC model choice to be asymptotically consistent.
(Joint work with J.-M. Cornuet, A. Grelaud, J.-M. Marin, N. Pillai and J. Rousseau)
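For concreteness, here is a minimal sketch of the standard ABC model-choice procedure whose validity the talk examines; the uniform prior over models and all function names are illustrative assumptions.

    import numpy as np

    def abc_model_choice(s_obs, simulators, priors, distance, eps, n_draws=100000):
        # Draw a model index uniformly, draw parameters from that model's prior,
        # simulate, and accept when the simulated summary statistic falls within
        # eps of the observed one. Acceptance frequencies among the models then
        # estimate the posterior model probabilities.
        rng = np.random.default_rng(0)
        counts = np.zeros(len(simulators))
        for _ in range(n_draws):
            m = rng.integers(len(simulators))
            if distance(simulators[m](priors[m]()), s_obs) < eps:
                counts[m] += 1
        return counts / max(counts.sum(), 1.0)  # guard against zero acceptances

The warning above applies exactly here: unless the summary statistic satisfies stringent conditions across the competing models, these acceptance frequencies need not converge to the true posterior model probabilities, however large n_draws is.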
Week 7 - March 1st: graduate students' poster presentations.
Week 8 - March 8th
Speaker: Professor Eric Gautier (ENSAE) <http://www.crest.fr/ses.php?user=2950>
Title: High dimensional instrumental variables regression and confidence sets
Abstract: We consider an instrumental variables method for estimation in linear models with endogenous regressors in the high-dimensional setting, where the sample size n can be smaller than the number of possible regressors K, with L >= K instruments. We allow for heteroscedasticity and do not need prior knowledge of the variances of the errors. Our Self Tuning Instrumental Variables (STIV) estimator is realised as the solution of a conic optimization program. We give upper bounds on the estimation error of the vector of coefficients in l_p norms for 1 <= p <= infinity that hold with probability close to 1, as well as the corresponding confidence intervals. All results are non-asymptotic. These confidence intervals are meaningful under the assumption that the true structural model is sparse. We also present error bounds under approximate sparsity. In our IV regression setting, the standard tools from the literature on sparsity, such as the restricted eigenvalue assumption, are inapplicable. We therefore develop a new approach based on data-driven sensitivity characteristics. We show that, under appropriate assumptions, a thresholded STIV estimator correctly selects the non-zero coefficients with probability close to 1. The price to pay for not knowing which coefficients are non-zero and which instruments to use is of the order sqrt(log L) in the rate of convergence. We extend the procedure to high-dimensional problems where some instruments can be non-valid. We obtain confidence intervals for non-validity indicators and suggest a procedure which correctly detects the non-valid instruments with probability close to 1.
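The STIV estimator itself is defined as the solution of a conic program; purely as a point of reference, here is the classical low-dimensional two-stage least squares estimator that it generalises, sketched under the assumptions n > K, L >= K and valid instruments (a textbook baseline, not the talk's method).

    import numpy as np

    def two_stage_least_squares(y, X, Z):
        # Stage 1: project the (possibly endogenous) regressors X onto the
        # column space of the instruments Z.
        X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
        # Stage 2: ordinary least squares of the outcome on the fitted regressors.
        return np.linalg.lstsq(X_hat, y, rcond=None)[0]

STIV, by contrast, exploits sparsity of the structural model to handle the K > n regime, as described in the abstract.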
------------
Christine Stone
Administrative and Personal Assistant
Department of Statistics, University of Oxford
1 South Parks Road
Oxford OX1 3TG
Tel: +44 (0)1865 272866/60 Fax: +44 (0)1865 272595
Please consider the environment - do you really need to print this email?