Document Type

Conference Proceeding

Publication Date

1-2017

Department

Mathematics, Statistics, and Computer Science

Keywords

data sets, gene expression, transcriptomics

Abstract

Gene set analysis methods continue to be a popular and powerful way of evaluating genome-wide transcriptomics data. These approaches require a priori grouping of genes into biologically meaningful sets and then conduct downstream analyses at the set (instead of gene) level. Gene set analysis methods have been shown to yield more powerful statistical conclusions than single-gene analyses, owing both to reduced multiple testing penalties and to potentially larger observed effects from the aggregation of effects across multiple genes in the set. Traditionally, gene set analysis methods have been applied directly to normalized, log-transformed transcriptomics data. Recently, efforts have been made to transform transcriptomics data to scales yielding more biologically interpretable results. For example, recently proposed models transform log-transformed transcriptomics data to a confidence metric (ranging between 0 and 100%) that a gene is active (roughly speaking, that the gene product is part of an active cellular mechanism). In this manuscript, we demonstrate, on both real and simulated transcriptomics data, that tests for differential expression between sets of genes are typically more powerful when using gene activity state estimates as opposed to log-transformed gene expression data. Our analysis suggests further exploration of techniques to transform transcriptomics data to meaningful quantities for improved downstream inference.
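
The abstract's point about reduced multiple testing penalties can be illustrated with a minimal sketch. It is not the authors' method: the mean-based aggregation, the Bonferroni correction, the toy values, and the names `set_level_scores` and `bonferroni_threshold` are all assumptions chosen for illustration. The input values could be either log-transformed expression or gene activity state estimates.

```python
from statistics import mean


def set_level_scores(expression, gene_sets):
    """Average per-gene values within each gene set, sample by sample.

    expression: dict mapping gene name -> list of per-sample values
    gene_sets:  dict mapping set name -> list of gene names
    (Mean aggregation is an illustrative assumption, not the paper's statistic.)
    """
    scores = {}
    for set_name, genes in gene_sets.items():
        # Keep only genes that actually have measurements.
        present = [g for g in genes if g in expression]
        if not present:
            continue
        n_samples = len(expression[present[0]])
        scores[set_name] = [
            mean(expression[g][i] for g in present) for i in range(n_samples)
        ]
    return scores


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Toy data (hypothetical): four genes measured in three samples.
expression = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [3.0, 4.0, 5.0],
    "geneC": [0.0, 0.0, 6.0],
    "geneD": [2.0, 2.0, 2.0],
}
gene_sets = {"set1": ["geneA", "geneB"], "set2": ["geneC", "geneD"]}

scores = set_level_scores(expression, gene_sets)

# Testing 2 sets instead of 4 genes gives a less stringent per-test threshold.
gene_threshold = bonferroni_threshold(0.05, len(expression))
set_threshold = bonferroni_threshold(0.05, len(scores))
```

With four genes grouped into two sets, the per-test threshold loosens from 0.05/4 to 0.05/2. In genome-wide data, where thousands of genes collapse into far fewer sets, the difference is correspondingly larger.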

Comments

Presented at the 2017 Pacific Symposium on Biocomputing, held on the Big Island of Hawaii, January 3-7, 2017.

Source Publication Title

Proceedings of the Pacific Symposium on Biocomputing

Publisher

World Scientific Publishing Company

Volume

22

First Page

449

DOI

10.1142/9789813207813_0042
