**There will be a pre-talk reception in the Woodward IRC lobby at 10:30am.**

**************

**Abstract:** Since this is a student-invited seminar, I'm going to highlight three research projects led by my three senior PhD students. Each project is motivated by a distinct problem in biology.

First, calcium imaging data is transforming the field of neuroscience by making it possible to assay the activity of large numbers of neurons simultaneously. For each neuron, the resulting "fluorescence trace" can be seen as a noisy surrogate of its spikes over time. To deconvolve a fluorescence trace into the underlying spike times, we consider an auto-regressive model for calcium dynamics. This leads naturally to a seemingly intractable $\ell_0$ optimization problem. I will show that this problem can in fact be solved efficiently for its global optimum, leading to substantial improvements over competing approaches. I will also talk about quantifying the uncertainty associated with these spike estimates.

Second, across many areas of biology, it is becoming increasingly common to collect "multi-view data": that is, data in which multiple data types (e.g., gene expression, DNA sequence, clinical measurements) have been measured on a single set of observations (e.g., patients). I will consider the following question: given a set of $n$ observations with measurements on $L$ data types, can a single clustering of the $n$ observations be defined on all $L$ data types, or does each data type have its own clustering of the observations? To answer this question, I will introduce a general framework for modeling multi-view data, as well as hypothesis tests that characterize the extent to which the clusterings on each of the $L$ data types are the same or different.

Finally, I will consider a fundamental question that arises in the analysis of microbial ecology data: how can we determine whether the abundance of a given taxon differs across conditions?

Sean Jewell, Lucy Gao, and Bryan Martin are fifth-year PhD students at the University of Washington who carried out the work described in this talk.

**************

This talk is supported by the van Eeden fund, the Department of Statistics, and PIMS.