Alexandre Bouchard-Côté

The course project involves independent work on the topic of your choice **but** with the constraint that the bulk of the project should use or extend techniques taught in the course.

- Teams are encouraged, in which case you should outline in the final report who did what. Expectations will grow linearly with group size (meaning a group of more than 2 is probably not a great idea in most cases!).
- Submit by email, with subject line ‘STAT 520 Final Project’, a PDF document of about 5 pages excluding references.
- Grading: I will base the grade on the same factors one would usually consider in a paper reviewing process (but taking into account the fact that the time frame is much shorter than the typical time to write a research paper). Is the goal clearly defined? Is it well motivated? Is the approach sound? Creative? Is the paper well-written? Are there interesting connections made to the existing literature?

- Week of February 1: optional discussion during office hours on the proposed projects
- February 9: during lecture/office hours, be ready to discuss potential ideas
- February 11: prepare an abstract and freeze teams; each team submits a one-pager on Canvas
- April 23 at night: due date.

- Careful construction of a model for a real dataset. Defend the prior and likelihood using best practices. Validate implementation and statistical properties. Use the model to address a scientific/engineering problem.
- Benchmark different MCMC methods for computing posterior distributions or computing marginal distributions.
- Create a twist on an existing MCMC sampling algorithm, or a novel one. Show that it leaves the distribution of interest invariant. Benchmark the performance of the method against one baseline using best practices.
- A careful and scientific comparison of a Bayesian estimator with another one, either Bayesian or non-Bayesian. Review the literature on both sides so as to be fair and critical to each. State and defend the criteria you use. Consider calibration and M-open setups. Examples:
  - Bayesian vs frequentist… regression/classification, feature selection, density estimation, survival analysis, …
  - Is there some structure that can be exploited (e.g. informed by the data types for the covariates/features, groups of related features i.e. feature templates, hierarchical approaches, etc), to get better Bayesian methods on these generic classes of inference problems?
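
For the MCMC project ideas above, a numerical sanity check of invariance on a toy problem can complement (but not replace) a proof. The sketch below is only an illustration under stated assumptions: the target `pi`, the five-state cycle, and the helper names `metropolis_kernel` and `apply_left` are all hypothetical and not part of the course material. It builds the full transition matrix of a Metropolis kernel with a symmetric ±1 random-walk proposal and checks that one step of the kernel maps `pi` to itself.

```python
def metropolis_kernel(pi):
    """Transition matrix of a Metropolis kernel on the cycle {0, ..., n-1}.

    Assumes n >= 3 and a symmetric proposal: move to the left or right
    neighbour with probability 1/2 each.
    """
    n = len(pi)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in ((i - 1) % n, (i + 1) % n):
            # Propose j with probability 1/2, accept with min(1, pi[j] / pi[i]).
            K[i][j] += 0.5 * min(1.0, pi[j] / pi[i])
        # Rejected proposals keep the chain at state i.
        K[i][i] += 1.0 - sum(K[i])
    return K


def apply_left(pi, K):
    """Distribution after one step when the current state is drawn from pi."""
    n = len(pi)
    return [sum(pi[i] * K[i][j] for i in range(n)) for j in range(n)]


# Hypothetical unnormalized target on five states.
weights = [1.0, 4.0, 2.0, 0.5, 2.5]
total = sum(weights)
pi = [w / total for w in weights]

K = metropolis_kernel(pi)
pi_next = apply_left(pi, K)

# Invariance means pi K = pi, so this should be zero up to rounding error.
max_error = max(abs(a - b) for a, b in zip(pi, pi_next))
print("max |pi K - pi| =", max_error)
```

The same pattern (enumerate a small state space, build the kernel explicitly, compare `pi K` with `pi`) can be used to catch implementation bugs in a new sampler before benchmarking it on a realistic problem.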

- A Bayesian inference method over a non-standard data type. Acquire or write an efficient posterior inference method, either using a PPL or from scratch. Develop a novel Bayes estimator and implement it. Benchmark the Bayes estimator on synthetic data, comparing the performance with a naive baseline such as MAP. Examples:
  - Types of graphs such as matchings
  - Phylogenetic trees or networks
  - Multiple sequence alignments
  - Clustering or feature matrices
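
To make the "Bayes estimator versus MAP" comparison concrete, here is a minimal sketch under simplifying assumptions. The combinatorial object is encoded as a binary vector (for example, edge or feature indicators), the loss is the Hamming distance, and the posterior is represented by a small, made-up list of samples; the function names are hypothetical. Under Hamming loss, the Bayes estimator thresholds each posterior marginal at 1/2, whereas the MAP estimate is the single most frequent configuration.

```python
from collections import Counter


def hamming(a, b):
    """Number of coordinates where two binary vectors disagree."""
    return sum(x != y for x, y in zip(a, b))


def bayes_estimate_hamming(samples):
    """Monte Carlo Bayes estimator under Hamming loss: coordinate-wise majority."""
    n = len(samples)
    d = len(samples[0])
    return tuple(int(2 * sum(s[k] for s in samples) > n) for k in range(d))


def map_estimate(samples):
    """Monte Carlo MAP estimate: the most frequently sampled configuration."""
    return Counter(samples).most_common(1)[0][0]


def posterior_expected_loss(estimate, samples):
    """Average Hamming loss of an estimate over the posterior samples."""
    return sum(hamming(estimate, s) for s in samples) / len(samples)


# Made-up posterior samples over binary vectors of length 3 (illustration only).
samples = (
    [(0, 0, 0)] * 4
    + [(1, 1, 0)] * 3
    + [(1, 0, 1)] * 3
    + [(1, 1, 1)] * 3
)

bayes = bayes_estimate_hamming(samples)
mode = map_estimate(samples)
loss_bayes = posterior_expected_loss(bayes, samples)
loss_map = posterior_expected_loss(mode, samples)

print("Bayes estimate:", bayes, "expected loss:", loss_bayes)
print("MAP estimate:  ", mode, "expected loss:", loss_map)
```

In this toy posterior the two estimates differ, and the Bayes estimate is not even one of the sampled configurations. For structured objects such as matchings or trees, the coordinate-wise majority may violate the object's constraints, so a real project would need a loss and an optimization step adapted to the data type.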