
Two UBC Statistics MSc student presentations (Johnny Xi & Naitong Chen)

Tuesday, July 12, 2022 - 11:00 to 12:00
Johnny Xi, UBC Statistics MSc student; Naitong Chen, UBC Statistics MSc student
Zoom / ESB 4192

To join via Zoom: Please register here.

Presentation 1

Time: 11:00am – 11:30am

Speaker: Johnny Xi, UBC Statistics MSc student

Title: Indeterminacy in Latent Variable Models: Characterization and Strong Identifiability

Abstract: The history of latent variable models spans nearly 100 years, from factor analysis to modern unsupervised machine learning. An enduring goal is to interpret the latent variables as "true" factors of variation, unique to a sample. Unfortunately, modern non-linear methods are wildly underdetermined, leading to many possible, equally valid solutions, even in the limit of infinite data. I will describe a theoretical framework that rigorously formulates the uniqueness problem as statistical identifiability, unifying existing progress towards this goal. The framework explicitly characterizes the sources of non-identifiability, making it possible to design strongly identifiable latent variable models in a transparent way. Using insights derived from the framework, our work proposes two flexible non-linear models with unique latent variables.
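The classic instance of this indeterminacy appears already in the linear case: in factor analysis, rotating the loading matrix by any orthogonal matrix leaves the observed covariance unchanged, so the latent factors cannot be recovered from the data distribution alone. A minimal numerical illustration (not taken from the talk; the model dimensions and rotation angle are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear factor model: x = L z + noise, with Cov(x) = L L^T + Psi.
d, k = 5, 2
L = rng.normal(size=(d, k))                    # factor loadings
Psi = np.diag(rng.uniform(0.5, 1.0, size=d))   # diagonal noise covariance

# Any orthogonal rotation R of the latent space yields the same observed
# covariance, since (L R)(L R)^T = L R R^T L^T = L L^T.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

cov_original = L @ L.T + Psi
cov_rotated = (L @ R) @ (L @ R).T + Psi

print(np.allclose(cov_original, cov_rotated))  # True: both loadings fit equally well
```

Non-linear models admit a far richer family of such transformations, which is the non-identifiability the framework above characterizes.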

Presentation 2

Time: 11:30am – 12:00pm

Speaker: Naitong Chen, UBC Statistics MSc student

Title: Bayesian Inference via Sparse Hamiltonian Flows

Abstract: A Bayesian coreset is a small, weighted subset of data that replaces the full dataset during Bayesian inference, with the goal of reducing computational cost. Although past work has shown empirically that there often exists a coreset with low inferential error, efficiently constructing such a coreset remains a challenge. Current methods tend to be slow, require a secondary inference step after coreset construction, and do not provide bounds on the data marginal evidence. In this work, we introduce a new method—sparse Hamiltonian flows—that addresses all three of these challenges. The method involves first subsampling the data uniformly, and then optimizing a Hamiltonian flow parametrized by coreset weights and including periodic momentum quasi-refreshment steps. Theoretical results show that the method enables an exponential compression of the dataset in a representative model, and that the quasi-refreshment steps reduce the KL divergence to the target. Real and synthetic experiments demonstrate that sparse Hamiltonian flows provide accurate posterior approximations with significantly reduced runtime compared with competing dynamical-system-based inference methods.
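The coreset idea underlying the method can be sketched in a few lines: uniformly subsample m of n data points and give each the weight n/m, so the weighted sum of log-likelihood terms approximates the full-data sum. This is only the initialization step described in the abstract (the talk's method then optimizes these weights through a Hamiltonian flow); the toy Gaussian model, grid of parameter values, and error threshold below are illustrative choices, not details from the work:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: Gaussian data with unknown mean mu; the log-likelihood is a
# sum over data points, so a weighted subset can stand in for the full sum.
n, m = 10_000, 100
data = rng.normal(loc=2.0, scale=1.0, size=n)

def weighted_log_lik(mu, x, w):
    # Gaussian log-likelihood up to constants: sum_i w_i * -(x_i - mu)^2 / 2
    return np.sum(w * (-0.5 * (x - mu) ** 2))

# Uniform subsample with weights n/m -- the starting point before any
# weight optimization in coreset methods.
idx = rng.choice(n, size=m, replace=False)
coreset, weights = data[idx], np.full(m, n / m)

# Compare full-data and coreset log-likelihoods over a grid of mu values.
mu_grid = np.linspace(0.0, 4.0, 9)
full = np.array([weighted_log_lik(mu, data, np.ones(n)) for mu in mu_grid])
approx = np.array([weighted_log_lik(mu, coreset, weights) for mu in mu_grid])

rel_err = np.max(np.abs(approx - full) / np.abs(full))
print(f"max relative error over grid: {rel_err:.3f}")
```

With only 1% of the data retained, the weighted approximation tracks the full log-likelihood closely; the contribution of the talk is how to go well beyond this naive uniform weighting.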