Mathematics, Statistics, and Computer Science
gene regulation, transcriptomics, Bayesian inference
An important question in many biological applications is how to estimate or classify gene activity states (active or inactive) from genome-wide transcriptomics data. Recently, we proposed a Bayesian method, MultiMM, which outperformed existing methods on both simulated and real gene expression data, confirming well-known biological results and yielding better agreement with fluxomics data. Despite these promising results, MultiMM has several limitations. First, MultiMM leverages co-regulatory models to improve activity state estimates, but it incorporates co-regulation information in a manner that assumes the networks are known with certainty. Second, MultiMM assumes that genes that change states in the dataset can be distinguished with certainty from those that remain in one state. Third, the model can be sensitive to extreme gene expression measurements (outliers). In this manuscript, we propose a modified Bayesian approach that addresses these three limitations by improving outlier handling and by explicitly modeling network and other uncertainty, yielding improved gene activity state estimates compared to MultiMM.
Source Publication Title
Cold Spring Harbor Laboratory
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Disselkoen, C., Hekman, N., Gilbert, B., Benson, S., Anderson, M., DeJongh, M., Best, A., & Tintle, N. L. (2017). Improvements to Bayesian Gene Activity State Estimation from Genome-Wide Transcriptomics Data. bioRxiv. https://doi.org/10.1101/241000