To join this seminar virtually: Please request Zoom connection details from headsec [at] stat.ubc.ca.
Presentation 1
Time: 11:00am – 11:30am
Speaker: Giuseppe Tomio, UBC Statistics MSc student
Title: A new data-driven framework for simulating Mendelian randomization data
Abstract: Mendelian randomization (MR) is a causal inference method that allows biostatisticians to leverage DNA measurements to study causal effects using only observational data. Recent advancements like two-sample summary-level MR (TS SL MR) and projects such as the IEU GWAS database have lowered the barrier to conducting MR studies and opened the opportunity to mine causal effects in large-scale data sources. In the first part of my presentation, I show that there is a mismatch between the characteristics of modern TS SL MR data and the simulations conducted in articles that propose popular TS SL MR models. Next, I propose my solution: a data-driven simulation framework for MR data that aims to be realistic, interpretable, and easy to use thanks to a complementary R package implementation. As for the results, I show that models perform far better in literature-based simulations than in more realistic simulations based on my proposed framework. Lastly, I warn that the mismatch between simulated and real data, together with these results, may lead researchers to have overoptimistic expectations about model performance in real applications.
Presentation 2
Time: 11:30am – 12:00pm
Speaker: Jana Osea, UBC Statistics MSc student
Title: Enhancing the Robustness of Instrumental Variable Estimation with Potentially Invalid Instruments and its Application to Mendelian Randomization
Abstract: Causal relationships between exposures and outcomes are vital in fields like epidemiology and medicine, as they provide valuable insight into disease mechanisms and inform effective interventions. However, a common problem when attempting to extract the causal relationship between an exposure and an outcome in observational studies is the presence of unmeasured confounding. This leads to the exposure being correlated with the error term, a problem known as endogeneity. To obtain consistent estimates of the causal effect in the presence of endogeneity, we may use instrumental variables (IVs), which are correlated with the outcome only through their effect on the exposure. This captures the relationship between the exposure and outcome that is unaffected by the endogeneity. The most common use of IVs in the fields of epidemiology and medicine is Mendelian Randomization (MR), which uses genetic variants as IVs. However, the validity of the IVs is often questionable due to the presence of pleiotropy and linkage disequilibrium, where the genetic variants affect the outcome through pathways other than the exposure. Furthermore, exposure and outcome data are often contaminated by outliers. In this thesis, we propose a novel algorithm, the robustified some valid some invalid instrumental variable estimator (rsisVIVE), that obtains estimates of the causal effect of an exposure on an outcome in the presence of invalid IVs and high levels of endogeneity while tolerating large proportions of contamination. The algorithm is based on the sisVIVE algorithm of Kang et al. (2016), but we propose using robust estimators in both stages of sisVIVE. Simulation results show that rsisVIVE estimates the causal parameter more accurately than sisVIVE when IVs are weak and outperforms competitor IV estimators in all cases when there is contamination.