To join via Zoom: please request connection details from headsec [at] stat.ubc.ca
Title: What can statisticians learn from the analysis of C. elegans data?
Abstract: Modern scientific settings confront us with unprecedented challenges in processing data efficiently and robustly. These challenges often expose the brittleness of our current tools, dictating the need for new methods. In this talk I will describe new statistical and AI methods motivated by a pressing problem in neuroscience: the need to image entire brains at single-neuron resolution.
Specifically, I will present my contribution to NeuroPAL, a breakthrough technology that enables color imaging of every single neuron in the brain of the C. elegans worm. I will describe new methods for two difficult tasks arising in these datasets: neural segmentation and neural identification. Both tasks reduce to an underlying deconvolution model, a mixture of Gaussians, on which classical methods such as the EM algorithm fall short. Behind these new methods lies a key statistical-physics principle, the so-called Schrödinger bridge: a ‘thought experiment’, proposed in 1932, that realizes the solution of an entropy-regularized optimal transport problem, yet has not percolated into the mainstream of statistics.
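For readers unfamiliar with entropy-regularized optimal transport, a minimal sketch of Sinkhorn's algorithm (the standard computational route to the entropy-regularized OT plan) may help; the function name, parameters, and toy cost matrix below are illustrative, not part of the talk.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iters=200):
    """Entropy-regularized OT between histograms a and b with cost matrix C.

    Approximates the coupling P minimizing <P, C> - eps * H(P)
    subject to P @ 1 = a and P.T @ 1 = b, via alternating scalings.
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)         # rescale to match column marginal b
        u = a / (K @ v)           # rescale to match row marginal a
    return u[:, None] * K * v[None, :]

# Toy example: two uniform histograms on 3 points, squared-distance cost
a = np.ones(3) / 3
b = np.ones(3) / 3
C = (np.arange(3)[:, None] - np.arange(3)[None, :]) ** 2.0
P = sinkhorn(a, b, C)
```

The returned coupling `P` has (approximately) the prescribed marginals; the regularization strength `eps` controls how diffuse it is.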
I will first describe fundamental statistical properties of the Schrödinger bridge that I have established. For example, when estimated from samples it enjoys a 1/sqrt(n) convergence rate, avoiding the curse of dimensionality. Second, I will introduce a new loss function based on this principle and show that it is a better optimization objective than the log-likelihood for model-based clustering, reducing pathologies such as bad local optima and inconsistency. As a consequence, a new algorithm derived from this loss, Sinkhorn EM, attains better, more robust neural segmentation performance. I will then describe how these principles can be used to probabilistically identify neurons in C. elegans, yielding meaningful uncertainty quantification in this hard combinatorial setting. Finally, I will comment on how these novel methods have proven useful in other contexts, such as deep learning.
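The flavor of Sinkhorn EM can be conveyed with a small sketch: in the E-step, responsibilities are computed as an entropy-regularized transport plan whose columns are constrained to average to the (assumed known) mixing weights, rather than the unconstrained softmax of standard EM. Everything below is an illustrative assumption (spherical Gaussians, known weights, illustrative names), not the authors' implementation.

```python
import numpy as np

def sinkhorn_e_step(X, mus, pi, sigma=1.0, eps=1.0, n_iters=100):
    """E-step variant: responsibilities as an entropy-regularized transport
    plan. Each row sums to 1 (one unit of mass per data point) and each
    column is constrained to average to the mixing weight pi[k]."""
    n = X.shape[0]
    # Negative log-density (up to constants) as transport cost
    C = ((X[:, None, :] - mus[None, :, :]) ** 2).sum(-1) / (2 * sigma**2)
    K = np.exp(-C / eps)
    u = np.ones(n)
    for _ in range(n_iters):
        v = pi / (K.T @ u / n)     # enforce average responsibility = pi
        u = 1.0 / (K @ v)          # each point's responsibilities sum to 1
    return u[:, None] * K * v[None, :]

def m_step(X, R):
    """Standard M-step for the means: responsibility-weighted averages."""
    return (R.T @ X) / R.sum(0)[:, None]

# Toy run: two well-separated 2-D clusters, known equal weights
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
mus = np.array([[-1.0, 0.0], [1.0, 0.0]])
pi = np.array([0.5, 0.5])
for _ in range(20):
    R = sinkhorn_e_step(X, mus, pi)
    mus = m_step(X, R)
```

The marginal constraint on the responsibilities is what distinguishes this from the classical E-step and is one way the bad local optima of vanilla EM can be avoided.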