Document Type


Publication Date



Mathematics, Statistics, and Computer Science


gene regulation, transcriptomics, Bayesian inference


An important question in many biological applications, is to estimate or classify gene activity states (active or inactive) based on genome-wide transcriptomics data. Recently, we proposed a Bayesian method, titled MultiMM, which showed superior results compared to existing methods. In short, MultiMM performed better than existing methods on both simulated and real gene expression data, confirming well-known biological results and yielding better agreement with fluxomics data. Despite these promising results, MultiMM has numerous limitations. First, MultiMM leverages co-regulatory models to improve activity state estimates, but information about co-regulation is incorporated in a manner that assumes that networks are known with certainty. Second, MultiMM assumes that genes that change states in the dataset can be distinguished with certainty from those that remain in one state. Third, the model can be sensitive to extreme measures (outliers) of gene expression. In this manuscript, we propose a modified Bayesian approach, which addresses these three limitations by improving outlier handling and by explicitly modeling network and other uncertainty yielding improved gene activity state estimates when compared to MultiMM.


The copyright holder for this preprint is the author/funder. All rights reserved. No reuse allowed without permission.

Source Publication Title



Cold Spring Harbor Laboratory