To join this seminar virtually: Please request Zoom connection details from ea [at] stat.ubc.ca.
Presentation 1
Time: 3:00pm – 3:30pm
Speaker: Hannah Bobst, UBC Statistics MSc student
Title: Analysis of Density Ratio Model Performance for Quantile Estimation
Abstract: Quantiles are important descriptive statistics in many applications. In practice, quantiles must be estimated using a representative sample from the population of interest. While parametric estimators are preferable in general, they become inaccurate in the case of even minor model misspecification. Nonparametric estimators, which require no specific model, are a common alternative to parametric estimators. However, empirical-based estimators often require large sample sizes to attain satisfactory precision. The density ratio model (DRM) is a semi-parametric model which balances the trade-off between model misspecification risks and statistical efficiency in the presence of multiple related populations. This model allows samples from related populations to be used alongside the sample from the population of interest, which is particularly beneficial when the sample of interest is not large enough. This paper examines the perceived benefit of the DRM quantile estimator. We compare the DRM estimator with the parametric and empirical-based estimators on their bias and variance via simulation. We created five scenarios in terms of sample sizes. We generated data from the normal and gamma distributions, but the distribution information is only used for parametric estimation. We also generated data from the same distributions with some added noise, so that the parametric model is mildly misspecified. We confirm that the parametric estimator is preferred under the true model, while the DRM estimator is not far behind and improves substantially over the nonparametric estimator. When the parametric model is wrong, the DRM estimator is the overall winner. We also repeated the simulation using data on daily price changes of some technology stocks and found that the DRM estimator has the best overall performance.
Presentation 2
Time: 3:30pm – 4:00pm
Speaker: Nathaniel Dyrkton, UBC Statistics MSc student
Title: Integrating representative and non-representative survey data for efficient inference
Abstract: Non-representative surveys are commonly used and widely available but suffer from selection bias that generally cannot be entirely eliminated using weighting techniques. Instead, we propose a Bayesian method to synthesize longitudinal representative (unbiased) surveys with non-representative (biased) surveys by estimating the degree of selection bias over time. We show using a simulation study that synthesizing biased and unbiased surveys outperforms using the unbiased surveys alone, even if the selection bias evolves in a complex manner over time. Using COVID-19 vaccination data, we synthesize two large-sample biased surveys with an unbiased survey to reduce uncertainty in nowcasting and inference estimates while retaining the empirical credible interval coverage. Conceptually, we obtain the properties of a large-sample unbiased survey, provided that the survey assumed to be unbiased, which anchors the estimates, is in fact unbiased at all time points.