Diffuse Large B-Cell Lymphoma (DLBCL) is an aggressive cancer of the white blood cells, and its causes are not well understood. Pathologists hope to discover a molecular signature that is predictive of patient survival after adjusting for other features that are known to be important. A classical Cox Proportional Hazards model is inappropriate because the DLBCL data are high-dimensional and many genomic features suffer from multicollinearity. Thus, we used a "Cox-LASSO" method to select a relevant subset of features correlated with survival. Instead of using all the LASSO-selected features in the regression, we predict using the first principal component (PC). The first PC is constructed by adapting Bair & Tibshirani's supervised PC regression method. This approach reduces the dimensionality of the covariate space and addresses the collinearity typically observed in the data. The prediction performance of the resulting model is evaluated by cross-validation. This talk describes analyses performed in collaboration with the Centre for Lymphoid Cancer at BC Cancer Agency.
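As a rough illustration of the dimension-reduction step described in the abstract, the sketch below computes first-PC scores from a subset of features. It is not the speaker's implementation: the function name `first_pc_scores`, the `selected` index list, and the toy data are all hypothetical, and the selection step (the Cox-LASSO fit) is assumed to have been done already.

```python
import math

def first_pc_scores(X, selected, n_iter=200):
    """Return (loadings, scores) of the first principal component.

    X        : list of samples, each a list of feature values.
    selected : column indices assumed to come from an earlier selection
               step (hypothetically, the nonzero Cox-LASSO coefficients).
    """
    n, p = len(X), len(selected)
    # Keep only the selected columns and center each one.
    cols = [[row[j] for row in X] for j in selected]
    means = [sum(c) / n for c in cols]
    Z = [[cols[k][i] - means[k] for k in range(p)] for i in range(n)]
    # Sample covariance matrix of the selected features.
    C = [[sum(Z[i][a] * Z[i][b] for i in range(n)) / (n - 1)
          for b in range(p)] for a in range(p)]
    # Power iteration for the leading eigenvector. The start vector is
    # deliberately non-uniform; a start exactly orthogonal to the leading
    # eigenvector would still fail, which this sketch does not guard against.
    start_norm = math.sqrt(sum((k + 1) ** 2 for k in range(p)))
    v = [(k + 1) / start_norm for k in range(p)]
    for _ in range(n_iter):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Project each centered sample onto the leading direction.
    scores = [sum(Z[i][k] * v[k] for k in range(p)) for i in range(n)]
    return v, scores

# Toy data (made up): columns 0 and 2 are perfectly collinear.
X = [[1.0, 5.0, 2.0],
     [2.0, 1.0, 4.0],
     [3.0, 7.0, 6.0],
     [4.0, 2.0, 8.0]]
loadings, scores = first_pc_scores(X, selected=[0, 2])
```

The resulting `scores` would then enter a Cox model as a single covariate in place of the collinear selected features; fitting that model needs a survival-analysis library and is not shown here.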
Supervised principal components regression using a Cox-LASSO model
Tuesday, November 3, 2015 - 11:00
Derek Chiu, Statistics Master's Student (Co-op) - UBC
Room 4192, Earth Science Building, 2207 Main Mall