Room 4192, Earth Sciences Building (2207 Main Mall)
Tue 20th October 2015
Rosemary McCloskey
Rosemary McCloskey is a second year Master's student in the CIHR Strategic Training Program in Bioinformatics at UBC, working in Dr. Art Poon's lab at the BC Centre for Excellence in HIV/AIDS. She has an undergraduate degree in Mathematics from Simon Fraser University, and is the inaugural recipient of the Statistics Department Award in Data Science.
Phylogenetic clustering with a Markov-modulated Poisson process

Most of the information that public health agencies and researchers have about emerging disease epidemics is obtained by on-the-ground epidemiology, that is, by asking infected people about where they went and who they contacted. Phylodynamics is an emerging area of research which aims to enrich this information using viral genomic data and bioinformatic methods. One application of phylodynamics is the identification of groups of epidemiologically related individuals, termed phylogenetic clustering. Here, we develop a novel clustering method which uses a Markov-modulated Poisson process, applied to a "family tree" of viruses, to identify parts of the population experiencing elevated transmission rates. We applied this method to anonymised viral genomic data sampled from almost 8000 HIV-infected individuals in British Columbia.
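
For readers unfamiliar with the term, a Markov-modulated Poisson process (MMPP) is a Poisson process whose event rate is set by a hidden continuous-time Markov chain. The sketch below simulates a two-state MMPP on a time line, not on a phylogenetic tree, and does no model fitting. It is not the speaker's method, and the function name and all rate values are hypothetical.

```python
import random


def simulate_mmpp(event_rates, switch_rates, t_end, seed=0):
    """Simulate a two-state MMPP on the interval [0, t_end).

    event_rates[s]  : Poisson event rate while the hidden chain is in state s
    switch_rates[s] : rate at which the hidden chain leaves state s
    Returns a list of (event_time, hidden_state) pairs.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    t, state = 0.0, 0
    events = []
    while True:
        # Two competing exponential clocks: "an event occurs" and
        # "the hidden state switches". The first to ring happens.
        total_rate = event_rates[state] + switch_rates[state]
        t += rng.expovariate(total_rate)
        if t >= t_end:
            break
        if rng.random() < event_rates[state] / total_rate:
            events.append((t, state))
        else:
            state = 1 - state
    return events


# State 0 is a "slow" regime, state 1 a "fast" regime (hypothetical rates).
events = simulate_mmpp(event_rates=(0.5, 5.0), switch_rates=(0.2, 0.2), t_end=200.0)
print(len(events), "events simulated")
```

In the clustering setting described in the abstract, the events of interest are on the tree and the elevated-rate state marks candidate clusters; the simulation above only illustrates the underlying process.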

Room 4192, Earth Sciences Building (2207 Main Mall)
Tue 13th October 2015
Mark Schmidt
Assistant Professor, UBC Department of Computer Science
Advances in fitting statistical models to huge datasets
In the first part, I will consider the problem of minimizing a finite sum
of smooth functions. This is a ubiquitous computational problem in
statistics, as it frequently arises in various maximum likelihood and
regularized maximum likelihood frameworks. I will describe the stochastic
average gradient algorithm which, despite over 60 years of work on
stochastic gradient algorithms, is the first method to achieve the low
iteration cost of stochastic gradient methods while achieving a linear
convergence rate as in deterministic gradient methods that process the
entire dataset on every iteration.
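
As an illustration of the idea in this paragraph, here is a minimal sketch of a stochastic average gradient (SAG) iteration on a tiny least-squares problem. Each step touches one data point but moves along the average of the most recently stored gradient of every point. The function name, the data, and the conservative step size 1/(16L) are assumptions made for this sketch, not details taken from the talk.

```python
import random


def sag_least_squares(A, b, n_iters=20000, seed=0):
    """Minimise (1/n) * sum_i 0.5 * (a_i . x - b_i)^2 with SAG."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    # L bounds the Lipschitz constant of each per-example gradient.
    L = max(sum(v * v for v in row) for row in A)
    step = 1.0 / (16.0 * L)  # conservative choice assumed here
    x = [0.0] * d
    memory = [[0.0] * d for _ in range(n)]  # last gradient seen per example
    grad_sum = [0.0] * d                    # running sum of the memory
    for _ in range(n_iters):
        i = rng.randrange(n)
        residual = sum(A[i][j] * x[j] for j in range(d)) - b[i]
        for j in range(d):
            g = residual * A[i][j]
            grad_sum[j] += g - memory[i][j]  # swap old gradient for new
            memory[i][j] = g
        for j in range(d):
            x[j] -= step * grad_sum[j] / n   # step along the averaged gradient
    return x


# Toy data generated from the exact solution x = (2, -1).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
b = [2.0, -1.0, 1.0, 3.0]
solution = sag_least_squares(A, b)
print(solution)  # should be close to [2.0, -1.0]
```

The per-iteration cost matches plain stochastic gradient descent (one example), at the price of storing one gradient per example.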

In the second part, I will consider the even-more-specialized case where we
have a linearly-parameterized model (such as linear least squares or
logistic regression). I will talk about how coordinate descent methods,
though a terrible idea for minimizing general functions, are theoretically
and empirically well-suited to solving such problems. I will also discuss
how we can design clever coordinate selection rules that are much more
efficient than the classic cyclic and randomized choices.
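
To make the linearly-parameterized case concrete, the following is a minimal sketch of coordinate descent for linear least squares. Because the residual A x - b is kept up to date, each coordinate update costs one pass over a single column. The "greedy" option picks the coordinate with the largest gradient magnitude (a Gauss-Southwell style rule); the abstract does not say which selection rules the talk covers, so that rule, the function name, and the data are assumptions for illustration only.

```python
def coordinate_descent(A, b, n_iters=50, rule="cyclic"):
    """Minimise 0.5 * ||A x - b||^2 by exact coordinate-wise updates.

    rule = "cyclic" sweeps coordinates in order; rule = "greedy" picks the
    coordinate with the largest absolute partial derivative.
    Assumes every column of A is non-zero.
    """
    n, d = len(A), len(A[0])
    columns = [[A[i][j] for i in range(n)] for j in range(d)]
    col_sq_norms = [sum(v * v for v in col) for col in columns]
    x = [0.0] * d
    residual = [-bi for bi in b]  # equals A x - b at x = 0
    for k in range(n_iters):
        if rule == "greedy":
            # Naive version: computing every partial derivative costs O(n * d).
            grads = [sum(c * r for c, r in zip(col, residual)) for col in columns]
            j = max(range(d), key=lambda idx: abs(grads[idx]))
        else:
            j = k % d
        g = sum(c * r for c, r in zip(columns[j], residual))
        delta = -g / col_sq_norms[j]  # exact minimiser along coordinate j
        x[j] += delta
        # Cheap residual update: only column j is touched.
        residual = [r + delta * c for r, c in zip(residual, columns[j])]
    return x


# Toy data generated from the exact solution x = (2, -1).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
b = [2.0, -1.0, 1.0, 3.0]
x_cyclic = coordinate_descent(A, b, rule="cyclic")
x_greedy = coordinate_descent(A, b, rule="greedy")
print(x_cyclic, x_greedy)
```

The naive greedy rule shown here recomputes all partial derivatives at every step; making such a rule cheap enough to be worthwhile is a separate question that this sketch does not address.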

Bio: Mark Schmidt has been an assistant professor in the Department of
Computer Science at the University of British Columbia since 2014. His
research focuses on developing faster algorithms for large-scale machine
learning, with an emphasis on methods that have provable convergence rates
and can be applied to structured prediction problems. From 2011 through
2013 he worked at the École normale supérieure in Paris on inexact and
stochastic convex optimization methods. He finished his M.Sc. in 2005 at
the University of Alberta working as part of the Brain Tumor Analysis
Project, and his Ph.D. in 2010 at the University of British Columbia
working with Kevin Murphy on graphical model structure learning with
L1-regularization. He has also worked at Siemens Medical Solutions on heart
motion abnormality detection, with Michael Friedlander in the Scientific
Computing Laboratory at the University of British Columbia on
semi-stochastic optimization methods, and with Anoop Sarkar at Simon Fraser
University on large-scale training of natural language models.

Room 4192, Earth Sciences Building (2207 Main Mall)
Thu 8th October 2015
Communicating with Data Using Simplified Models and Uncertainty Quantification

In the first part of the talk we explore ways to predict mortality in critical care situations, such as ICUs. Current models use a small number of variables, no temporal features, and are regression based with manual variable selection and weighting. We develop a univariate flagging algorithm (UFA) that predicts well, scales to a large number of variables, is robust to missing data, and is easy to interpret and visualize. While Random Forests and similar methods can be competitive with UFA in these situations, they are a black box to the practitioners using them.
In the second part we consider methods to quantify potential uncertainty in plots and images. The basic idea is to find a way to remove structure from the image, bootstrap what is left, and then restore the structure, leading to, say, 1000 images. The Earth Mover’s Distance allows us to compute the distances between these plots, and optimization algorithms allow us to order the plots and then find, say, the lower extreme, middle, and upper extreme to visualize the uncertainty that may be present in the plot or image.
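
The pipeline in the second part can be sketched on a one-dimensional bar plot. This is a rough illustration under several assumptions that are not from the talk: the "structure" is taken to be a moving average, the leftover residuals are resampled with replacement, and the replicates are simply sorted by their Earth Mover's Distance to the original plot, as a stand-in for the unspecified optimization-based ordering. All names and data are hypothetical.

```python
import random


def moving_average(values, half_width=1):
    """Smooth a sequence; used here as the assumed 'structure'."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def emd_1d(p, q):
    """Earth Mover's Distance between two histograms on the same
    unit-spaced bins, after normalising each to total mass 1."""
    total_p, total_q = sum(p), sum(q)
    carried, cost = 0.0, 0.0
    for p_i, q_i in zip(p, q):
        carried += p_i / total_p - q_i / total_q  # mass pushed to the next bin
        cost += abs(carried)
    return cost


def bootstrap_replicates(heights, n_replicates=1000, seed=0):
    """Remove structure, resample the residuals, then restore structure."""
    rng = random.Random(seed)
    structure = moving_average(heights)
    residuals = [h - s for h, s in zip(heights, structure)]
    replicates = []
    for _ in range(n_replicates):
        resampled = [rng.choice(residuals) for _ in heights]
        # Clip at a tiny positive value so each replicate is a valid histogram.
        replicates.append([max(s + r, 1e-12) for s, r in zip(structure, resampled)])
    return replicates


heights = [2.0, 3.0, 5.0, 8.0, 6.0, 4.0, 3.0, 1.0]  # hypothetical bar heights
replicates = bootstrap_replicates(heights)
ranked = sorted(replicates, key=lambda rep: emd_1d(rep, heights))
nearest, middle, farthest = ranked[0], ranked[len(ranked) // 2], ranked[-1]
```

Sorting by distance to the original yields a nearest, a median, and a farthest replicate, which is only an approximation of the lower extreme, middle, and upper extreme described above.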

Department of Statistics, University of British Columbia
3182 Earth Sciences Building
2207 Main Mall
Vancouver, BC, Canada V6T 1Z4
Tel: 604.822.0570
Fax: 604.822.6960
