(Apologies for cross-posting)
Hi Everyone,
A post-ISBA conference workshop on Calibration and validation of complex computer models will be held at Macquarie University (near Sydney, Australia) on Sunday 27th and Monday 28th July. Leading and new researchers in the area will be sharing their research with us (see the program and abstracts below).
Please register now; the early bird rate finishes on 24th May (less than two weeks away)!
Registration includes workshop admission, morning/afternoon tea and GST. Lunch and the workshop dinner are not included.
                     Early bird   Late bird
ISBA/AMSI Members:      $220        $275
Non-members:            $330        $385
Full-time students:     $110        $165
To register, go to http://www.isba2008.sci.qut.edu.au/workshops2008.shtml#sydney and click on the PDF registration form.
Expressions of interest from people wishing to present posters are being considered now. Please contact Petra (details below).
Best wishes and hope to see you there,
Petra (on behalf of the organising committee).
Sunday 27th July
==================
09:30-10:00 Registration and Welcome
10:00-11:15 Tutorial 1: Dave Higdon
Bayesian approaches for combining experimental data and computer models
11:15-11:45 Break
11:45-13:15 Invited to Contribute Session 1
Jason Loeppky: Choosing the sample size of a computer experiment
James Gattiker: On design for parameter inference in emulators
13:15-14:30 Lunch
14:30-15:45 Tutorial 2: Jonathan Rougier
Bayes linear prediction with multiple treatments: application to avalanche modeling
15:45-16:00 Break
16:00-17:00 Young Investigators Session 1
Leanna House: Second order exchangeable emulators to assess initial condition uncertainty
Leonardo Soares Bastos: Diagnostics for Gaussian process emulators
19:00- Workshop dinner
Monday 28th July
================
09:30-10:30 Invited Presentation 1: Susie Bayarri
Assessing the risk of catastrophic events by combining statistical and computer models
10:30-11:30 Young Investigators Session 2
Tiangang Cui: Statistical inversion and Markov Chain Monte Carlo methods in geothermal model calibration
Richard Wilkinson: Calibrating computer models with high-dimensional output
11:30-12:00 Break
12:00-13:30 Invited to Contribute Session 2
Jeremy Oakley: Decision-theoretic sensitivity analysis for complex computer models
Colin Fox: TBA
13:30-14:30 Lunch
14:30-15:30 Invited Presentation 2: David van Dyk
Statistical analysis of stellar evolution
15:30-16:30 Panel Discussion: Moderator Jim Berger
16:30 Close
Titles and abstracts:
-------------------------------------
Susie Bayarri
=============
Title: Assessing the risk of catastrophic events by combining statistical and computer models
Authors: M.J. Bayarri, J. Berger, E. Calder, K. Dalbey, A. Patra, B. Pitman, E. Spiller, R. Wolpert
Abstract:
Risk assessment of rare natural hazards -- such as large volcanic pyroclastic flows -- is addressed. Assessment is approached through a combination of computer modeling, statistical modeling, and extreme-event probability computation. A computer model of the natural hazard is utilized to provide the needed extrapolation to unseen parts of the hazard space. Statistical modeling of the available data is needed to determine the initializing distribution for exercise of the computer model. In dealing with rare events, direct simulations involving the computer model are prohibitively expensive. Solution instead requires a combination of adaptive design of computer model approximations (emulators) and rare event simulation. The techniques that are developed for risk assessment are illustrated on a test-bed example involving volcanic flow.
Dave Higdon
===========
Title: Bayesian approaches for combining experimental data and
computer model simulations for statistical inference
Author: Dave Higdon, Los Alamos National Laboratory
Abstract:
By augmenting experiments with detailed simulation-based physical models one can greatly leverage the amount of information that even a limited set of experiments can provide. This tutorial describes Bayesian modeling and estimation techniques that may be used to combine these two sources of information. These methods include designing simulation campaigns, modeling simulation output, estimation - or calibration - of key simulation model parameters, and accounting for major sources of uncertainty. Various response surface models will be discussed, as will model formulations for combining the various sources of information.
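As a minimal illustration of the calibration step the tutorial covers (a toy of my own construction, not material from the talk): estimate a single simulator parameter from noisy field observations by random-walk Metropolis. The quadratic "simulator", the prior, and the noise level are all assumptions made for brevity; the simulator here is cheap enough to call directly.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulator(theta, x):
    # Hypothetical cheap simulator; a real application would use an emulator.
    return theta * x ** 2

# Synthetic "field" data generated at a known true parameter value.
x_field = np.linspace(0, 1, 15)
theta_true, sigma = 1.8, 0.05
y_field = simulator(theta_true, x_field) + rng.normal(0, sigma, 15)

def log_post(theta):
    # Gaussian likelihood plus a vague N(0, 10^2) prior on theta.
    resid = y_field - simulator(theta, x_field)
    return -0.5 * np.sum(resid ** 2) / sigma ** 2 - 0.5 * theta ** 2 / 100

# Random-walk Metropolis over the single calibration parameter.
theta, draws = 0.0, []
lp = log_post(theta)
for _ in range(5000):
    prop = theta + 0.1 * rng.normal()
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws.append(theta)

post = np.array(draws[1000:])          # discard burn-in
print(round(float(post.mean()), 2))    # posterior mean near the true value
```

In the tutorial's setting the simulator is expensive, so `log_post` would instead query a statistical emulator and fold the emulator's own uncertainty into the likelihood.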
Jonathan Rougier
================
Title: Bayes linear prediction with multiple treatments: application to avalanche modeling
Abstract:
We have steady-state snow velocity profiles from ten large-chute experiments, where each experiment takes place under different environmental conditions. Based on these we would like to predict the velocity profile across the full range of environmental conditions. This large number of observations and predictands poses challenges for fully-probabilistic methods, but can be easily handled within a Bayes linear approach. We show how multiple treatments can be incorporated into the 'standard' model-based inference, and illustrate a detailed elicitation for such an inference. This is joint work with Martin Kern at the Swiss Federal Institute for Snow and Avalanche Research, Davos.
David van Dyk
=============
Title: Statistical Analysis of Stellar Evolution
Authors: David A. van Dyk, Department of Statistics, University of California, Irvine,
Steven DeGennaro and Ted von Hippel, Department of Astronomy, University of Texas at Austin; William H. Jeffery, College of Engineering & Mathematical Sciences, University of Vermont; and Nathan Stein and Elizabeth Jeffery, Department of Astronomy, University of Texas at Austin
Abstract:
Color Magnitude Diagrams (CMDs) are plots that compare the magnitudes (luminosities) of stars in different wavelengths of light (colors). Highly non-linear correlations among the mass, color and surface temperature of newly formed stars induce a long narrow curved point cloud in a CMD known as the main sequence. Aging stars form new CMD groups of red giants and white dwarfs. The physical processes that govern this evolution can be described with mathematical models and explored using complex computer models. These calculations are designed to predict the plotted magnitudes as a function of the parameters of scientific interest such as stellar age, mass, and metallicity. Here, we describe how we use the computer models as complex likelihood functions in a Bayesian analysis that requires sophisticated computing, corrects for contamination of the data by field stars, accounts for complications caused by binary stars, and aims to compare competing physics-based computer models of stellar evolution.
Colin Fox
==========
TBA
James Gattiker
==============
Title: On Design for Parameter Inference in Emulators
Abstract:
In the study of computer models, statistical approximations of simulation responses over a parameter space allow analytical approaches that are otherwise out of reach when simulations are expensive and data are sparse. Design for constructing accurate emulators has several open questions; we examine the interplay of the choice of correlation function, the inference of correlation function parameters, and their effect on the predictive accuracy of Gaussian process emulators. Our approach to design is to examine a hybrid method of pseudorandom sequences and optimal design based on optimizing Fisher information for parameter inference. We present the results of simulation studies of parameter inference and design, and discuss the implications for the problem of climate modeling.
Jason Loeppky
=============
Title: Choosing the Sample Size of a Computer Experiment
Authors: Jason Loeppky, Jerome Sacks and William Welch
Abstract:
In recent years, virtual experiments implemented by a complex computer code or mathematical model have been supplementing or even replacing physical experiments. The computer code mathematically describes the relationship between several input variables and one or more output variables. The computer models in question are often computationally demanding, so direct evaluation of the code for optimization or validation is generally not possible. The usual strategy is to build a statistical model to act as a surrogate, or emulator, of the true code. A long-used rule of thumb takes a sample size of 10 times the number of active dimensions. In this talk we investigate this rule of thumb for a variety of problems encountered in practice. In some cases we show that increasing the sample size has a large effect on the prediction quality, while in other cases it has little to no effect. These issues are demonstrated using a model for polar ice caps and a model for the ligand activation of a G-Protein in yeast.
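A toy version of the sample-size question (my own sketch, not the talk's examples): fit a simple Gaussian-process emulator to a cheap two-dimensional test function with run sizes below, at, and above the 10-times-dimensions rule, and compare held-out prediction error. The test function, squared-exponential kernel, and fixed length-scale are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(x):                      # cheap stand-in for an expensive code
    return np.sin(3 * x[..., 0]) + x[..., 1] ** 2

def gp_predict(X, y, Xs, ls=0.4, nugget=1e-8):
    """Zero-mean GP predictive mean with a squared-exponential kernel
    (length-scale fixed rather than estimated, to keep the sketch short)."""
    def k(A, B):
        d2 = (((A[:, None, :] - B[None, :, :]) / ls) ** 2).sum(-1)
        return np.exp(-0.5 * d2)
    K = k(X, X) + nugget * np.eye(len(X))
    return k(Xs, X) @ np.linalg.solve(K, y)

d = 2
Xtest = rng.uniform(0, 1, (200, d))    # held-out points for accuracy checks
ytest = simulator(Xtest)
rmses = []
for n in (5 * d, 10 * d, 20 * d):      # below, at, and above the rule of thumb
    X = rng.uniform(0, 1, (n, d))
    rmse = float(np.sqrt(np.mean((gp_predict(X, simulator(X), Xtest) - ytest) ** 2)))
    rmses.append(rmse)
    print(n, round(rmse, 3))
```

For this smooth toy function the error keeps dropping past n = 10d; the talk's point is that on other problems the curve flattens much earlier.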
Jeremy Oakley
=============
Title: Decision-theoretic sensitivity analysis for complex computer models
Abstract:
We consider the use of computer models in decision-making, and use decision-theoretic arguments to conduct a sensitivity analysis based on the expected value of perfect information for quantifying the 'importance' of each uncertain input parameter in a model. Standard Gaussian process emulators are used for efficient computation, and we address the problem of quantifying uncertainty in the sensitivity analysis results due to the use of an emulator with limited model runs.
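A toy sketch of the value-of-information quantities involved (my own construction, not Oakley's example): a two-action decision whose payoff depends on two uncertain inputs. With a linear payoff the conditional expectations needed for partial EVPI are analytic, so no emulator is required here; in realistic models a Gaussian process emulator replaces the exact expectations.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 200_000
theta1 = rng.normal(size=N)          # uncertain input 1
theta2 = rng.normal(size=N)          # uncertain input 2

payoff_a = 2 * theta1 + theta2       # action A: payoff depends on both inputs
payoff_b = 0.5                       # action B: fixed, certain payoff

# Value of the best action under current uncertainty.
baseline = max(float(np.mean(payoff_a)), payoff_b)

# EVPI: learn both inputs perfectly before deciding.
evpi = float(np.mean(np.maximum(payoff_a, payoff_b))) - baseline

# Partial EVPI: learn one input perfectly, then decide using the
# conditional expected payoff (analytic here because the payoff is linear).
evppi1 = float(np.mean(np.maximum(2 * theta1, payoff_b))) - baseline
evppi2 = float(np.mean(np.maximum(theta2, payoff_b))) - baseline
print(round(evpi, 2), round(evppi1, 2), round(evppi2, 2))
```

The input with the larger partial EVPI (here `theta1`, which enters the payoff with twice the weight) is the more "important" one in the decision-theoretic sense the abstract describes.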
Tiangang Cui
=============
Title: Statistical Inversion and Markov Chain Monte Carlo Methods in Geothermal Model Calibration
Leanna House
=============
Title: Second Order Exchangeable Emulators to Assess Initial Condition Uncertainty
Authors: Leanna House and Michael Goldstein
Abstract:
We address the uncertainty of deterministic computer models that rely on both input parameters and initial conditions. We refer to such models as semi-deterministic. Purely deterministic computer models either do not have an initial condition or fix (without error bounds) the value for the initial condition, so that the same output will result from one set of input parameter values, even when the model is implemented multiple times. Semi-deterministic models, however, allow the condition to vary, and thus have the potential to produce more than one result per input. When multiple outcomes per input are present, current approaches rely primarily on summary statistics (e.g., mean and variance per input), and apply standard deterministic model uncertainty analysis approaches. However, inferences based solely on such statistics implicitly require strong assumptions which we are unwilling to make. Thus, we introduce the notion of latent computer model outcomes, which correspond to the results of the semi-deterministic model when using the appropriate, but unknown, initial condition for the physical system of interest. The goal of this paper is to make inferences about the latent model given a sequence of realised semi-deterministic model evaluations. We consider the sequence elements to be second order exchangeable and use Bayes linear methods to assess the posterior expectation and variance of the latent model given the realised evaluations. We demonstrate our methods using semi-deterministic results from a galaxy formation model called Galform that relies on initial specifications of dark matter.
Leonardo Soares Bastos
======================
Title: Diagnostics for Gaussian Process Emulators
Authors: Leonardo S. Bastos and Anthony O'Hagan
Abstract:
Mathematical models, usually implemented in computer programs known as simulators, are widely used in all areas of science and technology to represent complex real-world phenomena. Simulators are often sufficiently complex that they take appreciable amounts of computer time or other resources to run. In this context, a methodology has been developed based on building a statistical representation of the simulator, known as an emulator. The principal approach to building emulators uses Gaussian processes. This work presents some diagnostics to validate and assess the adequacy of a Gaussian process emulator as a surrogate for the simulator. These diagnostics are based on comparisons between simulator outputs and Gaussian process emulator outputs for some test data, known as validation data, defined by a sample of simulator runs not used to build the emulator. Our diagnostics take care to account for correlation among the validation data.
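One standard diagnostic of this kind is individual standardized prediction errors on held-out validation runs. A minimal sketch (my own toy, not the paper's): the "simulator" output is generated as a single draw from the assumed Gaussian process, so the emulator is correctly specified and the standardized errors should behave like standard normal variates.

```python
import numpy as np

rng = np.random.default_rng(7)

def sq_exp(a, b, ls=0.3):
    # Squared-exponential correlation function with a fixed length-scale.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

# Draw a single GP realisation jointly over training and validation inputs,
# so the "simulator" exactly matches the emulator's assumptions.
Xtr = np.linspace(0, 1, 10)              # training design
Xval = rng.uniform(0, 1, 8)              # validation runs, held out of fitting
Xall = np.concatenate([Xtr, Xval])
Kall = sq_exp(Xall, Xall) + 1e-6 * np.eye(18)
yall = np.linalg.cholesky(Kall) @ rng.normal(size=18)
ytr, yval = yall[:10], yall[10:]

# GP predictive mean and variance at the validation points.
K = Kall[:10, :10]
Ks = Kall[10:, :10]
mean = Ks @ np.linalg.solve(K, ytr)
var = Kall[10:, 10:].diagonal() - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))

std_err = (yval - mean) / np.sqrt(np.maximum(var, 1e-12))
print(np.round(std_err, 2))   # values far outside (-3, 3) would flag a problem
```

In the paper's setting the simulator is not a GP draw, so systematically large (or suspiciously small) standardized errors indicate emulator misspecification; the joint, correlation-aware version of this check uses the full predictive covariance rather than its diagonal.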
Richard Wilkinson
==================
Title: Calibrating computer models with high-dimensional output
Abstract:
I will consider the calibration of complex computer models which produce highly multivariate output, typically time-series or spatio-temporal fields. Directly emulating these models is a computationally demanding task, and may not be possible for models with very high dimensionality. An alternative approach is to reduce the number of dimensions using a basis representation, for example the principal components, and emulate the computer model output using this reduced latent space representation. However, the data reduction will not typically produce an accurate representation of the field data, and so it is necessary to perform any calibration on the data space rather than the latent space so that reconstruction error is accounted for in the model parameters. I will illustrate these ideas on the UVic Earth system climate model.
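The basis-reduction step can be sketched in a few lines (a toy of my own, not the UVic model): project time-series simulator output onto its leading principal components, reconstruct in the data space, and watch the reconstruction error that truncation leaves behind, which is exactly the error the abstract argues the calibration must account for.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 50)

def simulator(x):
    # Hypothetical toy model: each run returns a 50-point time series.
    return np.sin(2 * np.pi * x * t) + 0.3 * x * t

X = rng.uniform(0.5, 2.0, 40)               # 40 training inputs
Y = np.stack([simulator(x) for x in X])     # 40 x 50 output matrix

# Principal components of the centred output ensemble via SVD.
mean = Y.mean(0)
U, s, Vt = np.linalg.svd(Y - mean, full_matrices=False)

errs = []
for k in (1, 3, 5):                          # number of retained components
    scores = (Y - mean) @ Vt[:k].T           # latent-space representation
    recon = mean + scores @ Vt[:k]           # mapped back to the data space
    errs.append(float(np.sqrt(np.mean((recon - Y) ** 2))))
print([round(e, 4) for e in errs])
```

In the full approach one emulator is fitted per retained score as a function of the inputs; because `errs` is nonzero for any truncation, comparing emulator output to field data in the latent space would ignore this reconstruction error, which is why the calibration is performed in the data space.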
Dr Petra Graham
Department of Statistics
Division of Economic and Financial Studies
Macquarie University
Sydney NSW 2109
Australia
Ph: +61 2 9850 6138
Fax: +61 2 9850 7669