Abstract: In many current applications, scientists can easily measure a very large number of variables (for example, several thousand gene expression levels), some of which are expected to be useful to explain or predict a specific response variable of interest. These potential explanatory variables are likely to contain redundant or irrelevant information, and in many cases their quality and reliability may be suspect. We developed a penalized robust regression estimator that identifies a useful subset of explanatory variables for predicting the response, while protecting the resulting estimator against possible aberrant observations in the data set. Thanks to an Elastic Net penalty, the proposed estimator can select variables even when there are more variables than observations or when many of the candidate explanatory variables are correlated. In this talk, I will present the new estimator and an algorithm to compute it, and I will illustrate its performance in a simulation study. This is joint work with Professor Matias Salibian-Barrera and our student David Kepplinger.
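To give a flavor of the setting described above (many correlated or irrelevant candidate variables, a few outliers, p > n), here is a minimal sketch combining a robust loss with an Elastic Net penalty. This is not the authors' estimator; it uses scikit-learn's `SGDRegressor` with a Huber loss and `penalty="elasticnet"` as a stand-in, and all data and tuning values (`n`, `p`, `alpha`, `l1_ratio`) are illustrative assumptions.

```python
# Sketch only (NOT the estimator from the talk): robust loss + Elastic Net
# penalty via scikit-learn's SGDRegressor on synthetic p > n data with outliers.
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
n, p = 50, 200                          # more variables than observations
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 5.0                          # only 5 truly relevant variables
y = X @ beta + 0.5 * rng.standard_normal(n)
y[:3] += 30.0                           # a few aberrant observations

model = SGDRegressor(loss="huber",      # robust to the outliers above
                     penalty="elasticnet",
                     alpha=0.1, l1_ratio=0.5,
                     max_iter=5000, tol=1e-6, random_state=0)
model.fit(X, y)

# Variables with non-negligible estimated coefficients (a crude "selection")
selected = np.flatnonzero(np.abs(model.coef_) > 0.5)
print(selected)
```

In this toy run, the variables with the largest estimated coefficients should mostly be among the five truly relevant ones; a plain least-squares fit would be more strongly distorted by the three shifted observations.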