Project guidelines

Alexandre Bouchard-Côté

Overview

The course project involves independent work on a topic of your choice, with the constraint that the bulk of the project should use or extend techniques taught in the course.

Logistics

Teams are encouraged, in which case you should outline in the final report who did what. Expectations will grow linearly with the group size (meaning a group of more than 2 is probably not a great idea in most cases!).
Submit on Canvas a PDF document of about 5 pages, excluding references.
Grading: I will base the grade on the same factors one would usually consider in a paper reviewing process (but taking into account the fact that the time frame is much shorter than the typical time to write a research paper). Is the goal clearly defined? Is it well motivated? Is the approach sound? Creative? Is the paper well-written? Are there interesting connections made to the existing literature?

Project timeline

February 4: prepare an abstract and freeze teams; each team submits a one-pager on Canvas.
~~February 25 at night~~: due date extended to March 18.

Some examples

Careful construction of a model for a real dataset. Defend the prior and likelihood using best practices. Validate implementation and statistical properties. Use the model to address a scientific/engineering problem.
Benchmark different MCMC methods for computing posterior distributions or computing marginal distributions.
Create a twist on an existing MCMC sampling algorithm, or a novel one. Show it is invariant with respect to the distribution of interest. Benchmark the performance of the method against one baseline using best practices.
A careful and scientific comparison of a Bayesian estimator with another one, either Bayesian or non-Bayesian. Review the literature on both sides so as to be fair and critical to both sides of the comparison. State and defend the criteria you use. Consider calibration and M-open setups. Examples:
- Bayesian vs frequentist… regression/classification, feature selection, density estimation, survival analysis, …
- Is there some structure that can be exploited (e.g. informed by the data types for the covariates/features, groups of related features i.e. feature templates, hierarchical approaches, etc), to get better Bayesian methods on these generic classes of inference problems?
A Bayesian inference method over a non-standard data type. Acquire or write an efficient posterior inference method, either using a PPL or from scratch. Develop a novel Bayes estimator and implement it. Benchmark the Bayes estimator on synthetic data, comparing the performance with a naive baseline such as MAP. Examples:
- Types of graphs such as matchings
- Phylogenetic trees or networks
- Multiple sequence alignments
- Clustering or feature matrices
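
For the project type that asks you to show a sampler is invariant with respect to the distribution of interest, a numerical check on a tiny state space can complement (not replace) the mathematical argument. The sketch below is only an illustration under assumed choices: the target `pi`, the cycle-shaped state space, the nearest-neighbour proposal, and the function names `metropolis_kernel` and `invariance_gap` are all hypothetical and not part of the course material. It builds the transition matrix K of a Metropolis chain and measures how far pi K is from pi.

```python
def metropolis_kernel(pi):
    """Transition matrix of a Metropolis chain on {0, ..., n-1}.

    Assumed toy setup: states arranged on a cycle, symmetric proposal that
    picks the left or right neighbour with probability 1/2 each.
    """
    n = len(pi)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in ((i - 1) % n, (i + 1) % n):
            # Propose neighbour j, accept with probability min(1, pi_j / pi_i).
            K[i][j] += 0.5 * min(1.0, pi[j] / pi[i])
        # Remaining mass: rejected proposals leave the chain at state i.
        K[i][i] += 1.0 - sum(K[i])
    return K


def invariance_gap(pi, K):
    """Largest absolute entry of (pi K - pi); near zero if pi is invariant."""
    n = len(pi)
    return max(
        abs(sum(pi[i] * K[i][j] for i in range(n)) - pi[j]) for j in range(n)
    )


# Hypothetical target distribution on four states.
pi = [0.1, 0.2, 0.3, 0.4]
K = metropolis_kernel(pi)
gap = invariance_gap(pi, K)
print(gap)
```

If the kernel is implemented correctly, the printed gap should be at the level of floating-point rounding error. The same check applied to a modified kernel (your "twist") is a cheap way to catch implementation bugs before running larger benchmarks.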