Document Type

Article

Publication Date

12-29-2017

Department

Mathematics, Statistics, and Computer Science

Keywords

gene regulation, transcriptomics, Bayesian inference

Abstract

An important task in many biological applications is to estimate or classify gene activity states (active or inactive) from genome-wide transcriptomics data. Recently, we proposed a Bayesian method, MultiMM, which outperformed existing methods on both simulated and real gene expression data, confirming well-known biological results and yielding better agreement with fluxomics data. Despite these promising results, MultiMM has several limitations. First, MultiMM leverages co-regulatory models to improve activity state estimates, but it incorporates co-regulation information in a manner that assumes the networks are known with certainty. Second, MultiMM assumes that genes that change state within the dataset can be distinguished with certainty from those that remain in one state. Third, the model can be sensitive to extreme values (outliers) of gene expression. In this manuscript, we propose a modified Bayesian approach that addresses these three limitations by improving outlier handling and by explicitly modeling network and other uncertainty, yielding improved gene activity state estimates compared to MultiMM.

Comments

The copyright holder for this preprint is the author/funder. All rights reserved. No reuse allowed without permission.

Source Publication Title

bioRxiv

Publisher

Cold Spring Harbor Laboratory

DOI

https://doi.org/10.1101/241000
