Faculty Work Comprehensive List

Leveraging Summary Statistics to Make Inferences about Complex Phenotypes in Large Biobanks

Angela Gasdaska
Derek Friend
Rachel Chen
Jason Westra, Dordt College
Matthew Zawistowski
William Lindsey, Dordt College
Nathan L. Tintle, Dordt College

Document Type

Article

Publication Date

2019

Department

Mathematics, Statistics, and Computer Science

Keywords

privacy, biobank, genetics, genome-wide association study, single nucleotide variant, computational challenges, data security, phenotypes

Abstract

As genetic sequencing becomes less expensive and data sets linking genetic data and medical records (e.g., Biobanks) become larger and more common, issues of data privacy and computational challenges become more necessary to address in order to realize the benefits of these datasets. One possibility for alleviating these issues is through the use of already-computed summary statistics (e.g., slopes and standard errors from a regression model of a phenotype on a genotype). If groups share summary statistics from their analyses of biobanks, many of the privacy issues and computational challenges concerning the access of these data could be bypassed. In this paper we explore the possibility of using summary statistics from simple linear models of phenotype on genotype in order to make inferences about more complex phenotypes (those that are derived from two or more simple phenotypes). We provide exact formulas for the slope, intercept, and standard error of the slope for linear regressions when combining phenotypes. Derived equations are validated via simulation and tested on a real data set exploring the genetics of fatty acids.
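
The idea of working from summary statistics can be illustrated with a minimal sketch. It is not taken from the paper and does not reproduce the authors' formulas. It assumes the complex phenotype is a fixed linear combination of two simple phenotypes, all regressed on the same genotype in the same individuals, and relies only on the linearity of ordinary least squares. The helper `simple_ols`, the weights `w1` and `w2`, and the simulated data are hypothetical. The standard error of the slope is left out because it also depends on how the two phenotypes covary.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Simulated (hypothetical) data: genotype coded as a minor-allele count.
rng = random.Random(0)
n = 500
genotype = [rng.choice([0, 1, 2]) for _ in range(n)]
pheno1 = [1.0 + 0.3 * g + rng.gauss(0, 1) for g in genotype]
pheno2 = [2.0 - 0.1 * g + rng.gauss(0, 1) for g in genotype]

# Summary statistics a group might share for each simple phenotype.
a1, b1 = simple_ols(genotype, pheno1)
a2, b2 = simple_ols(genotype, pheno2)

# Assumed complex phenotype: w1 * pheno1 + w2 * pheno2.
w1, w2 = 1.0, -2.0

# Intercept and slope recovered from the summary statistics alone.
a_summary = w1 * a1 + w2 * a2
b_summary = w1 * b1 + w2 * b2

# Direct fit on individual-level data, for comparison only.
combined = [w1 * p + w2 * q for p, q in zip(pheno1, pheno2)]
a_direct, b_direct = simple_ols(genotype, combined)

print(abs(a_direct - a_summary), abs(b_direct - b_summary))
```

Under these assumptions the two routes agree up to floating-point error, so the individual-level data are not needed to obtain the slope and intercept of the combined phenotype.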

Comments

Copyright © The Authors
Open Access

Source Publication Title

Pacific Symposium on Biocomputing

Publisher

World Scientific Publishing Company

Volume

24

First Page

391

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

Gasdaska, A., Friend, D., Chen, R., Westra, J., Zawistowski, M., Lindsey, W., & Tintle, N. L. (2019). Leveraging Summary Statistics to Make Inferences about Complex Phenotypes in Large Biobanks. Pacific Symposium on Biocomputing, 24, 391. Retrieved from https://digitalcollections.dordt.edu/faculty_work/1258

Included in

Genetics and Genomics Commons
