The Bayesian paradigm for statistical inference uses expert knowledge, formulated in terms of probability distributions of unknown parameters of interest. These distributions, called prior distributions, are combined with data to provide new information about parameters, via new parameter distributions called posterior distributions. One research theme centers on devising new Bayesian methodologies, i.e., new statistical models with which Bayesian inferences can provide particular scientific insight. Quantifying the statistical properties of such methods and contrasting them with non-Bayesian alternatives is an active area of research. Bayesian methods can lead to computational challenges, and another research theme centers on efficient computation of Bayesian solutions. The development of computational techniques for determining posterior distributions, such as Monte Carlo methods, is a rich area of research activity, with particular emphasis on Markov chain Monte Carlo methods and sequential Monte Carlo methods.
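As a minimal illustration of how a prior distribution is combined with data to give a posterior distribution, the sketch below uses a conjugate Beta-Binomial model, where the update has a closed form and no Monte Carlo computation is needed. The function names, the Beta(2, 2) prior and the data counts are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a conjugate Beta-Binomial update.
# All names and numbers here are hypothetical.

def update_beta(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Expert knowledge encoded as a Beta(2, 2) prior on an unknown proportion.
prior = (2.0, 2.0)
# Hypothetical data: 7 successes and 3 failures.
posterior = update_beta(*prior, successes=7, failures=3)

print("prior mean:", beta_mean(*prior))
print("posterior parameters:", posterior)
print("posterior mean:", round(beta_mean(*posterior), 4))
```

For models without a closed-form posterior, the same update is typically approximated numerically, which is where Monte Carlo methods come in.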
Research Areas
Bayesian Statistics
- Ben Bloem-Reddy
- Alexandre Bouchard-Côté
- Creagh Briercliffe
- Trevor Campbell
- Naitong Chen
- Kevin Chern
- Anthony Christidis
- Gian Carlo Di-Luvi
- Fanny Dupont
- Nathaniel Wu Dyrkton
- Paul Gustafson
- Kevin Lam
- Miguel Biron Lattes
- Matteo Lepur
- Xinglong Li
- Tiange (Ivy) Liu
- Yongjin Park
- Hyeongcheol (Tom) Park
- Geoff Pleiss
- Evan Sidrow
- Nikola Surjanovic
- Vinky Wang
- Joe Watson
- Quanhan (Johnny) Xi
- Zuheng (David) Xu
- Yichen Zhang
Bioinformatics/Genomics/Genetics
Recent advances in -omics technologies have stimulated a large body of biomedical studies focused on the discovery and characterization of molecular mechanisms of various diseases. For example, many studies have focused on the identification of genes to diagnose or predict cancer. The rapid expansion of complex and large -omics datasets has nourished the development of tailored statistical methods to address the challenges that have arisen in the field. Some examples are the detection and correction of biases and artifacts in raw high-throughput -omics data, the identification of true signal among a large number of variables measured on a much smaller number of subjects, the modeling of complex covariance structures, and the integration of diverse -omics datasets. Research in this area is characterized by multidisciplinary collaborations among researchers from Statistics, Computer Science, Medical Genetics, Molecular Biology, and other related fields.
Biostatistics
Many faculty members work on applying statistical methods to biomedical problems, ranging from analysing gene expression data to public health issues. Much of this work is done in conjunction with local hospitals (such as St Paul's) and research institutes (such as the BC Cancer Agency and the BC Genome Sciences Center). In the fall of 2009, we introduced the biostatistics option to our MSc program, an option that is joint with the School of Population and Public Health.
- Jonathan O.K. Agyeman
- Alexandre Bouchard-Côté
- Harlan Campbell
- Harper Xiaolan Cheng
- Gabriela V. Cohen Freue
- Sihaoyu Gao
- Lucy Gao
- Paul Gustafson
- Wakeel Adekunle Kasali
- Keegan Korthauer
- Ian Murphy
- Giuliano Netto Flores Cruz
- Jana Osea
- Yongjin Park
- John Petkau
- Marc Wettengel
- Lang Wu
- Xinyuan (Chloe) You
- Eugenia Yu
- Yichen Zhang
Data Science
Data Science information page.
Environmental and Spatial Statistics
The Department has a long history of research and collaborations in Environmental Statistics and in Spatial Statistics, beginning with Jim Zidek's pioneering work with the United States Environmental Protection Agency. Since that time, faculty have been involved in many research projects, such as the development of statistical techniques for the analysis of air pollution data to study concerns such as public health issues and global climate models. Current work modelling air pollution has resulted in an interactive map for the World Health Organization, developed by an international team of researchers. Other recent research activities involve collaboration with marine mammal biologists to study locations and behavior via continuous-time tracking devices.
Forest Products Stochastic Modeling Group
Since 2009, more than 60 researchers have been a part of this group, studying the properties of wood products and working on projects such as the development of engineering standards, monitoring for changes in product properties over time, subset selection methods for species grouping in the marketing of lumber, and the duration of load effect in construction. The group is made up of statisticians from UBC and SFU (faculty, students and staff) and collaborating scientists at FPInnovations Vancouver, and is funded by Collaborative Research and Development Grants awards under NSERC’s Forest Sector R & D Initiative.
Forest products have a complex variability and, as a biomaterial, are inherently stochastic. Therefore, the group has analyzed forest product data using advanced statistical methods in areas such as survey sampling, survival analysis, nonparametric Bayesian analysis and the handling of big data. The group has made novel contributions to statistical science that transfer to other domains and has solved long-standing problems in wood science. Unusually for statisticians, group members have also run their own experiments and data collection.
Read more about the Forest Products Stochastic Modeling Group.
Modern Multivariate and Time Series Analysis
Modern multivariate and time series analyses go beyond the classical normality assumption by modelling data that could combine binary, categorical, extreme and heavy-tailed distributions. Dependence is modeled non-linearly, often in terms of copula functions or stochastic representations. Models for multivariate extremes arise from asymptotic limits. Characterization and modelling of dependence among extremes, as well as estimation of probabilities of rare events, are topics of ongoing research. Advances in high-dimensional multivariate modelling have been achieved by the use of vine pair-copula constructions. Areas of application include biostatistics, psychometrics, genetics, machine learning, econometrics, quantitative risk management in finance and insurance, hydrology and geoscience.
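To make the idea of modelling dependence through a copula concrete, the sketch below draws pairs from a bivariate Gaussian copula and attaches exponential margins, so the dependence structure and the marginal distributions are specified separately. This is only one simple copula family, not the vine pair-copula constructions mentioned above. The function name, the correlation parameter and the exponential rates are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def gaussian_copula_sample(n, rho, rate1, rate2, seed=0):
    """Draw n pairs with a Gaussian copula (correlation rho) and exponential margins.

    Hypothetical helper for illustration; returns the copula-scale uniforms
    and the values on the exponential scale.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, used for its CDF
    uniforms, values = [], []
    for _ in range(n):
        # Step 1: correlated standard normal pair.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Step 2: map each coordinate to (0, 1) with the normal CDF.
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        uniforms.append((u1, u2))
        # Step 3: apply the inverse exponential CDF to obtain the chosen margins.
        values.append((-math.log1p(-u1) / rate1, -math.log1p(-u2) / rate2))
    return uniforms, values

uniforms, values = gaussian_copula_sample(n=2000, rho=0.8, rate1=1.0, rate2=0.5)
print(values[:3])
```

Swapping the inverse CDF in step 3 changes the margins without touching the dependence structure, which is the practical appeal of the copula approach.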
Robust Statistics
Statistical procedures are called robust if they remain informative and efficient in the presence of outliers and other departures from typical model assumptions on the data. Ignoring unusual observations can play havoc with standard statistical methods and can also mean losing the valuable information that unusual data points carry. Robust procedures guard against both problems. They are more important than ever because data are now often collected without following established experimental protocols. As a result, data may not represent a single well-defined population. Analyzing these data by non-robust methods may result in biased conclusions. To perform reliable and informative inference based on such a heterogeneous data set, we need statistical methods that can fit models and identify patterns, focusing on the dominant homogeneous subset of the data without being affected by structurally different small subgroups. Robust Statistics does exactly this. Some examples of applications are finding exceptional athletes (e.g. hockey players), detecting intrusion in computer networks and constructing reliable single nucleotide polymorphism (SNP) genotyping.
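A small sketch of the contrast described above: the sample mean is pulled far from the bulk of the data by a single outlier, while the median stays put, and a rule based on the median absolute deviation (MAD) flags the unusual point instead of discarding it silently. The data values, the function name and the cutoff of 3 are hypothetical choices for illustration. The factor 1.4826 is the usual constant that makes the MAD comparable to a standard deviation under normality.

```python
from statistics import mean, median

def mad_outliers(data, threshold=3.0):
    """Flag points far from the median, measured in scaled-MAD units.

    Hypothetical helper for illustration only.
    """
    center = median(data)
    mad = median([abs(x - center) for x in data])
    scale = 1.4826 * mad  # consistency factor for normally distributed data
    if scale == 0:
        return []  # no spread around the median, so nothing can be flagged
    return [x for x in data if abs(x - center) / scale > threshold]

# Hypothetical measurements with one structurally different observation.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 55.0]
center = median(data)
flagged = mad_outliers(data)

print("mean:", mean(data))      # strongly affected by the outlier
print("median:", center)        # reflects the dominant subset of the data
print("flagged:", flagged)
```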
Statistical Learning
Statistical learning, sometimes called machine learning, is becoming ever more important as a component of data science, and department members have had active research in this area for more than a decade. Statistical learning methods include classification and regression (supervised learning) and clustering (unsupervised learning). Current research topics of faculty members and their graduate students include construction of phylogenetic trees in evolution, ensembles of models and sparse clustering. Applications include the search for novel pharmaceutical drugs and detection of biogenic patterns.
- Tomas Beuzen
- Ben Bloem-Reddy
- Alexandre Bouchard-Côté
- Creagh Briercliffe
- Kevin Chern
- Anthony Christidis
- Gabriela V. Cohen Freue
- Gian Carlo Di-Luvi
- Xin Ding
- Yidie Feng
- Lucy Gao
- Pramoda Sachinthana Jayasinghe
- Kevin Lam
- Jiaping (Olivia) Liu
- Daniel J. McDonald
- Geoff Pleiss
- Saifuddin Syed
- William J. Welch
- Quanhan (Johnny) Xi
- Xinyuan (Chloe) You
- Eugenia Yu
- Ruben H Zamar
- Yichen Zhang