Document Type

Conference Proceeding

Publication Date



Mathematics, Statistics, and Computer Science


bacteria, genomes, metabolism, amino acid sequence, cluster analysis


With continued rapid growth in the number and quality of fully sequenced and accurately annotated bacterial genomes, we have unprecedented opportunities to understand metabolic diversity. We selected 101 diverse and representative completely sequenced bacteria and implemented a manual curation effort to identify 846 unique metabolic variants present in these bacteria. The presence or absence of these variants acts as a metabolic signature for each bacterium, which can then be used to understand similarities and differences between and across bacterial groups. We propose a novel and robust method of summarizing metabolic diversity using metabolic signatures and use this method to generate a metabolic tree that clusters metabolically similar organisms. Analysis of the metabolic tree confirms strong associations with well-established biological results and gives direct insight into the particular metabolic variants that are most predictive of metabolic diversity. The positive results of this manual curation effort and novel method development suggest that future work is needed to expand the set of bacteria to which this approach is applied and to use the resulting tree to test broad questions about metabolic diversity and complexity across the bacterial tree of life.
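
As a rough illustration of the idea of a metabolic signature, the sketch below encodes each organism as a presence/absence vector over metabolic variants and groups organisms into a small tree. The abstract does not specify the distance measure or clustering procedure used, so the Jaccard distance and average-linkage merging here are assumptions chosen for illustration, and the organism names and values are invented toy data rather than results from the study.

```python
from itertools import combinations

# Hypothetical toy signatures: 1 = variant present, 0 = variant absent.
# Organism names and values are invented for illustration only.
signatures = {
    "org_A": [1, 1, 0, 0, 1],
    "org_B": [1, 1, 0, 1, 1],
    "org_C": [0, 0, 1, 1, 0],
    "org_D": [0, 1, 1, 1, 0],
}


def jaccard_distance(a, b):
    """Return 1 - |shared variants| / |variants present in either organism|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def average_linkage_tree(sigs):
    """Greedy agglomerative clustering (assumed method); returns nested tuples."""
    # Each cluster is a pair: (subtree, list of member organism names).
    clusters = [(name, [name]) for name in sorted(sigs)]

    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        pairs = [(m, n) for m in c1[1] for n in c2[1]]
        return sum(jaccard_distance(sigs[m], sigs[n]) for m, n in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find and merge the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


tree = average_linkage_tree(signatures)
print(tree)
```

In this toy data, org_A and org_B share most of their variants, as do org_C and org_D, so the two pairs are merged first and joined only at the root. With the 846 variants and 101 organisms described above, the same structure would apply to much longer vectors, although the choice of distance and linkage would need to follow the published method.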


Presented at the 2017 Pacific Symposium on Biocomputing held on the Big Island of Hawaii, January 3-8, 2017.

Source Publication Title

Proceedings of the Pacific Symposium on Biocomputing


World Scientific Publishing Company



First Page




Included in

Microbiology Commons