FCCC Faculty Publications
Devarajan K, Zhou Y, Chachra N, Ebrahimi N
A supervised approach for predicting patient survival with gene expression data
Proc IEEE Int Symp Bioinformatics Bioeng. 2010;2010(5521718):26-31
PMID: 20865131    PMCID: PMC2941901   
Rapid development in genomics in recent years has allowed the simultaneous measurement of the expression levels of thousands of genes using DNA microarrays. This has offered tremendous potential for growth in our understanding of the pathophysiology of many diseases. When microarray studies also contain information about an outcome variable such as time to an event or death, one of the goals of an investigator is to understand how the expression levels of genes (covariates) relate to the time-to-event (referred to as survival time) in the course of a disease.

In this article, we consider the case where the number of covariates, p, exceeds the number of observations, N, a setting typical of microarray gene expression data. For a given vector of responses representing survival times of N subjects and the corresponding p × N gene expression matrix, we examine the problem of predicting the survival probability when N << p. This is an ill-conditioned problem further compounded by the presence of possibly censored survival times. We propose a model that combines the partial least squares approach for dimensionality reduction with the accelerated failure time model, a widely used log-linear model for linking censored survival time to covariates. We develop parametric methods to account for censoring as well as for predicting patient survival probabilities. We illustrate the applicability of our methods using cancer microarray data and explore the biological relevance of our results using pathway analysis. Finally, we evaluate the performance of our methods using extensive simulation studies.
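The general idea of combining partial least squares (PLS) with an accelerated failure time (AFT) model can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a log-normal AFT model, uses a textbook PLS1 (NIPALS) routine, and, as a loud simplification, ignores censoring entirely by treating every survival time as observed. The function names (`pls1_scores`, `fit_pls_aft`, `predict_survival`) and the synthetic data are hypothetical, and the expression matrix is stored as N × p (subjects in rows), the transpose of the p × N layout in the abstract.

```python
import math
import numpy as np

def pls1_scores(Xc, yc, n_components):
    """Textbook PLS1 (NIPALS) on centered data. Xc: (N, p), yc: (N,)."""
    X, y = Xc.copy(), yc.copy()
    N, p = X.shape
    W = np.zeros((p, n_components))   # weight vectors
    P = np.zeros((p, n_components))   # X loadings
    T = np.zeros((N, n_components))   # latent scores
    for k in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)
        t = X @ w
        tt = t @ t
        p_k = X.T @ t / tt
        q = (y @ t) / tt
        X -= np.outer(t, p_k)         # deflate X
        y -= q * t                    # deflate y
        W[:, k], P[:, k], T[:, k] = w, p_k, t
    return W, P, T

def fit_pls_aft(X, log_time, n_components=2):
    """Reduce p genes to a few PLS scores, then fit log T = mu + gamma't + sigma*eps.

    Assumption: eps is standard normal and no observation is censored.
    """
    x_mean, mu = X.mean(axis=0), log_time.mean()
    Xc, yc = X - x_mean, log_time - mu
    W, P, T = pls1_scores(Xc, yc, n_components)
    R = W @ np.linalg.inv(P.T @ W)    # maps centered X directly to scores
    gamma, *_ = np.linalg.lstsq(T, yc, rcond=None)
    resid = yc - T @ gamma
    sigma = math.sqrt(resid @ resid / (len(yc) - n_components - 1))
    return {"x_mean": x_mean, "mu": mu, "R": R, "gamma": gamma, "sigma": sigma}

def predict_survival(model, x_new, t):
    """Estimated P(T > t | x_new) under the log-normal AFT assumption."""
    score = (x_new - model["x_mean"]) @ model["R"]
    z = (math.log(t) - model["mu"] - score @ model["gamma"]) / model["sigma"]
    return 0.5 * math.erfc(z / math.sqrt(2.0))   # equals 1 - Phi(z)

# Synthetic illustration with N << p (values are made up, not from the paper).
rng = np.random.default_rng(0)
N, p = 40, 200
X = rng.normal(size=(N, p))
beta = np.zeros(p)
beta[:5] = 0.5
log_time = 1.0 + X @ beta + 0.3 * rng.normal(size=N)

model = fit_pls_aft(X, log_time, n_components=2)
s_early = predict_survival(model, X[0], t=1.0)
s_late = predict_survival(model, X[0], t=5.0)
```

Handling censored observations, which the article treats as a central difficulty, would require replacing the least-squares step with a censoring-aware estimation procedure; the sketch only shows how a low-dimensional PLS representation feeds a log-linear survival model.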
Grant support: P30 CA006927-47/NCI NIH HHS/United States
Source: Proceedings / Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE)