When additional variables are measured on a subsample from an existing cohort, we can fit models using survey-sampling approaches. In some settings we can also estimate by semiparametric maximum likelihood, profile likelihood, or similar techniques, leading to a semiparametric-efficient estimator. For example, under case-control sampling we can use either weighted or unweighted logistic regression. The common wisdom is that the weighted estimators are inefficient because of the variation in the weights. I will argue that this is neither true nor helpful. By considering contiguous model misspecification, I will show that the efficient estimator gains its extra precision from relying more heavily on the model, and that this is true in a quantitative sense, not merely as a heuristic.
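As a rough illustration of the case-control example in the abstract, the sketch below simulates a cohort, keeps all cases plus an equal number of randomly chosen controls, and fits the same logistic model with and without inverse-probability weights. Everything in it is an assumption made for illustration and is not taken from the talk: the simulated data, the true values `alpha = -4` and `beta = 1`, the sampling design, and the helper names `expit` and `fit_logistic`. It shows only that both fits target the same slope when the model is correct; it does not reproduce the efficiency comparison or the misspecification argument.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, ws=None, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of logit P(y=1|x) = a + b*x, with optional weights."""
    if ws is None:
        ws = [1.0] * len(ys)
    a = b = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, w in zip(xs, ys, ws):
            p = expit(a + b * x)
            r = w * (y - p)          # weighted score contribution
            v = w * p * (1.0 - p)    # weighted information contribution
            g0 += r
            g1 += r * x
            h00 += v
            h01 += v * x
            h11 += v * x * x
        # Solve the 2x2 Newton system H d = g explicitly.
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if max(abs(da), abs(db)) < tol:
            break
    return a, b


# Hypothetical cohort: a rare outcome generated from a correctly specified model.
random.seed(0)
alpha, beta = -4.0, 1.0
cases, controls_all = [], []
for _ in range(100_000):
    x = random.gauss(0.0, 1.0)
    if random.random() < expit(alpha + beta * x):
        cases.append(x)
    else:
        controls_all.append(x)

# Case-control subsample: every case, and as many controls chosen at random.
controls = random.sample(controls_all, len(cases))
control_weight = len(controls_all) / len(controls)  # inverse sampling probability

xs = cases + controls
ys = [1] * len(cases) + [0] * len(controls)
ws = [1.0] * len(cases) + [control_weight] * len(controls)

a_unw, b_unw = fit_logistic(xs, ys)      # unweighted (model-based) fit
a_wtd, b_wtd = fit_logistic(xs, ys, ws)  # weighted (design-based) fit

# The unweighted intercept is shifted by the log ratio of sampling probabilities.
a_unw_corrected = a_unw - math.log(control_weight)

print("unweighted:", a_unw_corrected, b_unw)
print("weighted:  ", a_wtd, b_wtd)
```

Repeating the simulation over many seeds, or generating the outcome from a model slightly different from the fitted one, would be one way to explore how the two estimators compare in variance and in bias.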
Estimation under nearly-correct models
Thursday, November 27, 2014 - 16:00
Dr. Thomas Lumley, Professor of Biostatistics, University of Auckland, New Zealand
Room 4192, Earth Sciences Building (2207 Main Mall)