To join this seminar virtually, please request Zoom connection details from ea [at] stat.ubc.ca.
Abstract: Model selection in Gaussian processes scales prohibitively with the size of the training dataset, both in time and memory. While many approximations exist, all incur inevitable approximation error. Recent work accounts for this error in the form of computational uncertainty, which enables -- at the cost of quadratic complexity -- an explicit trade-off between computational efficiency and precision. Here we extend this development to model selection, which requires significant enhancements to the existing approach, including linear-time scaling in the size of the dataset. We propose a novel training loss for hyperparameter optimization and demonstrate empirically that the resulting method can outperform SGPR, CGGP, and SVGP, state-of-the-art methods for GP model selection, on medium- to large-scale datasets. Our experiments show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU. As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty -- a fundamental prerequisite for optimal decision-making.
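For readers unfamiliar with the setup: standard GP model selection maximizes the log marginal likelihood over kernel hyperparameters, which requires a solve with, and the log-determinant of, an n-by-n kernel matrix at each step, hence the prohibitive scaling the abstract refers to. The sketch below, written with the GPyTorch library on toy data, illustrates this exact (non-scalable) baseline; it is not the computation-aware method presented in the talk, and the data and model choices are illustrative assumptions.

```python
import torch
import gpytorch

# Toy 1D regression data; purely illustrative, not a benchmark from the talk.
train_x = torch.linspace(0, 1, 100)
train_y = torch.sin(2 * torch.pi * train_x) + 0.1 * torch.randn(100)

class ExactGPModel(gpytorch.models.ExactGP):
    """Exact GP with a constant mean and an RBF kernel (assumed choices)."""
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
model.train()
likelihood.train()

# Model selection = maximizing the exact log marginal likelihood, which costs
# O(n^3) time per step; this is the bottleneck that scalable approximations
# (and the computation-aware approach in the talk) aim to overcome.
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

for _ in range(50):
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)  # negative log marginal likelihood
    loss.backward()
    optimizer.step()
```

Scalable baselines such as SGPR and SVGP swap the exact marginal likelihood objective above for an approximate one (e.g. a variational lower bound), which is exactly where the approximation error discussed in the talk enters.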
Bio: Jonathan Wenger is a postdoctoral research scientist at Columbia University's Department of Statistics and Zuckerman Institute, working with Prof. John Cunningham. He earned a PhD in Computer Science from the University of Tübingen under the supervision of Prof. Philipp Hennig. Jonathan's research focuses on resource-efficient methods for large-scale probabilistic machine learning. Much of his work contributes to the field of probabilistic numerics, which views numerical algorithms through the lens of probabilistic inference. This perspective enables the acceleration of learning algorithms via an explicit trade-off between computational efficiency and predictive precision.