Stat Dept Seminar, Mar 19, 2008, 03:00PM – 04:00PM, Speakman 318
Statistical Methods for Inferring Gene Regulatory Modules and Networks
Dr. Jun Xie, Department of Statistics, Purdue University
This talk covers probabilistic and statistical methods for the analysis of genomic data. We focus on the specific problem of inferring gene regulatory modules, where a module is defined as a set of coexpressed genes that are regulated by a common set of transcription factors (proteins).
We propose a series of statistical methods that combine information from multiple types of genomic data, including DNA sequences, genome-wide location analysis (ChIP-chip experiments), and gene expression microarrays. We start with a hidden Markov model for identifying protein binding sites in DNA sequences. The predictions are then refined by regression analysis on gene expression microarray data and/or ChIP-chip binding data. In the regression analysis, we formulate a variable selection problem and show that all available methods, including standard stepwise selection and LASSO/LARS, may fail to select the right set of covariates because of the complicated interdependence among genes. This biological application poses a challenge for probability and statistics, and new methodologies will be of great interest.