Paired and Unpaired Comparison and Clustering with Gene Expression Data | UBC Department of Statistics

Title	Paired and Unpaired Comparison and Clustering with Gene Expression Data
Publication Type	Journal Article
Year of Publication	2002
Authors	Bryan, J, Pollard, KS, van der Laan, MJ
Journal	Statistica Sinica
Volume	12
Pagination	87 - 110
Abstract	We have previously described a statistical framework for using gene expression data from cDNA microarrays to select meaningful subsets of genes and to place genes into clusters (van der Laan and Bryan (2001)). In this paper we extend the methodology to the setting in which expression data is collected on a common set of $p$ genes from either two observations within a subject (paired), or on subjects from two subpopulations (unpaired). We present simulation results that illustrate important issues encountered with cluster analysis in gene expression data. In particular, we see that sampling variability of the covariance structure and the presence of unrelated genes can have a strong impact on clustering algorithms and measures of cluster strength. We discuss ways to address this issue, including the application of a hybrid clustering method which incorporates both partitioning and collapsing steps. The hybrid methodology is illustrated on a cancer cell line data set with two types of cancer. We also present a method for selecting significantly differentially expressed genes using a null distribution. Finally, we present theoretical results relating to sample size and consistency in this setting.
URL	http://www3.stat.sinica.edu.tw/statistica/j12n1/j12n15/j12n15.htm
