Design and Analysis of Computer Experiments: Large Datasets and Multi-Model Ensembles

Tuesday, July 19, 2022 - 11:00 to 12:00
Sonja Isberg Surjanovic, UBC Statistics PhD Student

To join via Zoom: Please request connection details from headsec [at]

Abstract: Computer models are used as replacements for physical experiments in a wide variety of applications. Nevertheless, direct use of the computer model for the ultimate scientific objective is often limited by the complexity and cost of the model. Historically, Gaussian process (GP) regression has proven to be the almost ubiquitous choice for a fast statistical emulator for such a computer model, due to its flexible form and analytical expressions for predictive uncertainty.
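For readers unfamiliar with the "analytical expressions" mentioned above, the standard GP predictive equations are sketched here. The notation is an assumption made for illustration and is not taken from the dissertation: a zero-mean GP with covariance function $k$, design points $x_1, \dots, x_n$, outputs $\mathbf{y}$, covariance matrix $K$ with entries $K_{ij} = k(x_i, x_j)$, and the vector $\mathbf{k}(x) = (k(x, x_1), \dots, k(x, x_n))^\top$.

```latex
\begin{align}
  \hat{y}(x) &= \mathbf{k}(x)^\top K^{-1} \mathbf{y}, \\
  s^2(x)     &= k(x, x) - \mathbf{k}(x)^\top K^{-1} \mathbf{k}(x).
\end{align}
```

Both the predictive mean $\hat{y}(x)$ and the predictive variance $s^2(x)$ are available in closed form once $K$ has been factorized.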

In the first part of this dissertation, we consider complications that arise when the design is moderate to large. Fitting a GP regression can be computationally intractable for even moderate designs, due to computing time increasing with the cube of the design size. We propose a new solution to this problem: adaptive design and analysis via partitioning trees (ADAPT). By taking a data-adaptive approach to the development of a design, and choosing to partition the space in the regions of highest variability, we obtain a higher density of points in these regions and hence accurate prediction for complex computer models.
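To illustrate where the cubic cost comes from, here is a minimal sketch of a GP emulator fitted through a Cholesky factorization. It is not the ADAPT method from the talk. It assumes a zero-mean GP with unit variance, a squared-exponential correlation with a fixed length scale, one-dimensional inputs, and a small nugget for numerical stability. All function names and parameter values are hypothetical.

```python
import math

def sq_exp_kernel(a, b, length_scale):
    """Squared-exponential correlation between two 1-D inputs (assumed form)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))

def cholesky(A):
    """Return lower-triangular L with L L^T = A.

    The loops over i and j plus the inner sum over k form three nested
    loops, which is why the work grows with the cube of the design size.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = [0.0] * len(b)
    for i in range(len(b)):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    return z

def solve_lower_transposed(L, b):
    """Back substitution: solve L^T z = b."""
    n = len(b)
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (b[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def fit_gp(xs, ys, length_scale=0.2, nugget=1e-8):
    """Factorize the n x n correlation matrix and compute alpha = K^{-1} y."""
    n = len(xs)
    K = [[sq_exp_kernel(xs[i], xs[j], length_scale) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, ys))
    return L, alpha

def predict_gp(xs, L, alpha, x_new, length_scale=0.2):
    """Predictive mean and variance at x_new under the assumptions above."""
    k = [sq_exp_kernel(x_new, xi, length_scale) for xi in xs]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_lower(L, k)
    var = max(1.0 - sum(vi * vi for vi in v), 0.0)  # k(x, x) = 1 here
    return mean, var

# Toy "computer model": a sine curve evaluated on a 10-point design.
xs = [i / 9 for i in range(10)]
ys = [math.sin(2 * math.pi * x) for x in xs]
L, alpha = fit_gp(xs, ys)
mean, var = predict_gp(xs, L, alpha, 0.3)
print(mean, var)
```

Doubling the design size multiplies the factorization work by roughly eight, which is the scaling that motivates partitioning the input space into smaller regions.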

Next, we consider the scenario where multiple computer models are available for predicting the same physical process, known as multi-model ensembles (MMEs). Such ensembles are common in many applications, such as climate modelling and weather prediction. We present a new statistical methodology for combining output from such models to best describe the underlying physical process, using field data to estimate the weights assigned to each model. The methodology allows us to make predictions with appropriate measures of uncertainty. Additionally, the weights are allowed to vary with the inputs and thus represent the changing relative importance between the computer models throughout the input space. The methodology is applied to ice sheet models for the deglaciation of North America. Finally, we address several considerations that arise when the MME field data are binary. A new MME model formulation is presented, and applied to ice absence/presence data in the deglaciation application.
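As a rough illustration of input-dependent weights, the sketch below combines two toy models with weights that change across the input space. The abstract does not specify the weight model, so everything here is an assumption: the softmax-of-linear-scores form, the constraint that weights sum to one, the coefficient values (which in practice would be estimated from field data), and all function names.

```python
import math

def softmax(scores):
    """Map real-valued scores to positive weights that sum to one."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weights_at(x, coefs):
    """Weights at input x; coefs holds one hypothetical (intercept, slope) per model."""
    return softmax([a + b * x for a, b in coefs])

def ensemble_predict(x, models, coefs):
    """Weighted combination of the model outputs at input x."""
    w = weights_at(x, coefs)
    return sum(wi * f(x) for wi, f in zip(w, models))

# Two toy "computer models" of the same process (illustrative only).
models = [lambda x: 2.0 * x, lambda x: x * x]
# Hypothetical coefficients: model 1 is favoured for small x, model 2 for large x.
coefs = [(0.0, -4.0), (0.0, 4.0)]

for x in (0.0, 0.5, 1.0):
    print(x, weights_at(x, coefs), ensemble_predict(x, models, coefs))
```

Because the weights depend on x, the combined prediction can lean on different models in different regions of the input space, which is the behaviour the abstract describes.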

In summary, this dissertation presents new methods for two scenarios prevalent in the design and analysis of computer experiments: large designs, and the presence of multiple computer models.