To Join Via Zoom: Please request Zoom connection details for this seminar from headsec [at] stat.ubc.ca.
Abstract: Markov chain Monte Carlo (MCMC) methods are a popular tool in computational science used to evaluate expectations with respect to complex probability distributions over general state spaces. They work by averaging over the trajectory of a Markov chain whose stationary distribution is the target distribution. In theory, MCMC estimates converge asymptotically, but, in practice, for challenging problems where the target distribution is high-dimensional with well-separated modes, MCMC algorithms can get trapped exploring local regions of high probability and suffer from poor mixing.
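To make this failure mode concrete, here is a minimal random-walk Metropolis sketch; the bimodal Gaussian-mixture target, step size, and starting point are illustrative assumptions, not details from the talk. Started in one mode of a well-separated mixture, the chain rarely crosses to the other mode, so the trajectory average is badly biased.

```python
# Minimal random-walk Metropolis sketch (illustrative only; the bimodal
# target, step size, and start point are assumptions, not from the talk).
import numpy as np

def log_target(x):
    # Equal mixture of N(-5, 1) and N(+5, 1): two well-separated modes.
    return np.logaddexp(-0.5 * (x - 5.0) ** 2, -0.5 * (x + 5.0) ** 2)

def metropolis(n_steps, step=0.5, x0=5.0, seed=0):
    rng = np.random.default_rng(seed)
    x, trace = x0, np.empty(n_steps)
    for t in range(n_steps):
        prop = x + step * rng.standard_normal()
        # Accept with probability min(1, pi(prop) / pi(x)).
        if np.log(rng.uniform()) < log_target(prop) - log_target(x):
            x = prop
        trace[t] = x
    return trace

trace = metropolis(50_000)
# True E[X] is 0, but the chain almost never crosses the barrier between
# the modes, so the trajectory average typically sits near +5.
print("MCMC estimate of E[X]:", trace.mean())
```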
Physicists and statisticians independently introduced parallel tempering (PT) algorithms to tackle this issue. PT delegates the task of exploration to additional annealed chains that run in parallel and have better mixing properties. These chains communicate with the target chain of interest and help it discover new, unexplored regions of the sample space. Since their introduction in the 1990s, PT algorithms have been used extensively to improve mixing in challenging sampling problems arising in statistics, physics, computational chemistry, phylogenetics, and machine learning.
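A minimal PT sketch on the same illustrative target follows; the inverse-temperature ladder and the randomized adjacent-pair swap move are standard textbook choices, not the talk's methodology. Chain k targets the annealed density pi(x)^beta_k, hotter (small-beta) chains move freely across modes, and accepted swaps pass their discoveries down the ladder to the beta = 1 target chain.

```python
# Minimal parallel tempering sketch (illustrative; the temperature ladder,
# step sizes, and randomized adjacent-pair swaps are standard textbook
# choices, not the talk's methodology). Chain k targets pi(x)**beta_k.
import numpy as np

def log_target(x):
    return np.logaddexp(-0.5 * (x - 5.0) ** 2, -0.5 * (x + 5.0) ** 2)

def parallel_tempering(n_steps, betas=(1.0, 0.4, 0.1), step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas)
    xs = 5.0 * np.ones(len(betas))       # every chain starts in one mode
    trace = np.empty(n_steps)
    for t in range(n_steps):
        # Exploration: one Metropolis step per chain on pi**beta_k,
        # with wider proposals for the hotter (small-beta) chains.
        props = xs + (step / np.sqrt(betas)) * rng.standard_normal(len(betas))
        accept = np.log(rng.uniform(size=len(betas))) < betas * (
            log_target(props) - log_target(xs))
        xs = np.where(accept, props, xs)
        # Communication: propose swapping the states of one adjacent pair;
        # this acceptance ratio keeps the product distribution invariant.
        k = rng.integers(len(betas) - 1)
        log_ratio = (betas[k] - betas[k + 1]) * (
            log_target(xs[k + 1]) - log_target(xs[k]))
        if np.log(rng.uniform()) < log_ratio:
            xs[k], xs[k + 1] = xs[k + 1], xs[k]
        trace[t] = xs[0]                 # record the beta = 1 target chain
    return trace

trace = parallel_tempering(50_000)
# Hot-chain crossings now propagate to the target chain, so the estimate
# should land near the true value 0 instead of sticking at +5.
print("PT estimate of E[X]:", trace.mean())
```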
The classical approach to designing PT algorithms was developed under a reversible paradigm that is difficult to tune and whose performance deteriorates when too many parallel chains are introduced. This talk will introduce a new non-reversible paradigm for PT that dominates its reversible counterpart while avoiding the performance collapse endemic to reversible methods. We will then establish near-optimal tuning guidelines and efficient black-box methodology that scales to GPUs. Our work outperforms state-of-the-art PT methods and has been used at scale by researchers to study the evolutionary structure of cancer and to discover magnetic polarization in the image of the supermassive black hole M87.
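For a flavour of what non-reversible communication can look like, here is a hypothetical sketch of a deterministic even-odd (DEO) swap round, a pattern commonly associated with non-reversible PT; the talk's exact scheme may differ. Because the pairs attempted in a round are disjoint, all swap attempts can run simultaneously, which is one reason this style of communication scales to many chains and to GPUs.

```python
# Sketch of a deterministic even-odd (DEO) communication round, a pattern
# commonly used in non-reversible PT (hypothetical illustration; the talk's
# exact scheme may differ). Even rounds attempt swaps on pairs (0,1), (2,3),
# ...; odd rounds on (1,2), (3,4), .... The deterministic alternation induces
# persistent, non-reversible movement of states along the temperature ladder.
import numpy as np

def deo_swap_round(xs, betas, log_target, round_index, rng):
    """Attempt one even-odd swap round across the adjacent chain pairs."""
    start = round_index % 2              # alternate even/odd interfaces
    for k in range(start, len(betas) - 1, 2):
        log_ratio = (betas[k] - betas[k + 1]) * (
            log_target(xs[k + 1]) - log_target(xs[k]))
        if np.log(rng.uniform()) < log_ratio:
            xs[k], xs[k + 1] = xs[k + 1], xs[k]
    return xs
```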