Faculty Work Comprehensive List

Machine Learning and Data Mining in Complex Genomic Data - A Review on the Lessons Learned in Genetic Analysis Workshop 19

Inke R. Konig, University of Lubeck
Jonathan Auerbach, Columbia University
Damian Gola, University of Lubeck
Elizabeth Held, Iowa State University
Emily R. Holzinger, National Institutes of Health
Marc-Andre Legault, University of Montreal
Rui Sun, The Chinese University of Hong Kong
Nathan L. Tintle, Dordt College
Hsin-Chou Yang, Academia Sinica

Document Type

Article

Publication Date

2-3-2016

Department

Mathematics, Statistics, and Computer Science

Keywords

genomes, machine learning, data mining, phenotypes, computational complexity

Abstract

In the analysis of current genomic data, the application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, motivated mainly by two starting points. First, if the genomic data have an underlying structure, data mining might identify it and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to deal efficiently with the current wealth of data.

In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets.

Source Publication Title

BMC Genetics

Publisher

BioMed Central

Volume

17 (Supp 2)

DOI

10.1186/s12863-015-0315-8

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Recommended Citation

Konig, I. R., Auerbach, J., Gola, D., Held, E., Holzinger, E. R., Legault, M., Sun, R., Tintle, N. L., & Yang, H. (2016). Machine Learning and Data Mining in Complex Genomic Data - A Review on the Lessons Learned in Genetic Analysis Workshop 19. BMC Genetics, 17 (Supp 2) (1). https://doi.org/10.1186/s12863-015-0315-8
