Abstract Text |
Large-scale consortia amass numerous phenotypes, many of which are correlated. Compared with testing each phenotype independently, simultaneously testing correlated phenotypes for association with genetic variants yields higher power and makes it possible to identify pleiotropy. Additionally, genetic studies with many participants exhibit population structure and relatedness. Thus, we need efficient models that test multiple correlated phenotypes while accurately modeling sample correlation. Methods have been developed for genetic testing with multiple phenotypes, but they have limitations, including allowing only one variance component in the model. Along with relatedness, it may be appropriate to include additional variance components to model, for example, a shared environment or household, or a specific relatedness matrix estimated from X chromosome genetic markers. We will extend existing linear mixed model algorithms for multiple phenotype association testing to allow for more than one variance component. Further, implementing a multi-phenotype test often requires a different software package than the one used for a single-phenotype association test. The multivariate mixed model will be implemented in the GENESIS software package. GENESIS is an established suite of genetic analysis functions, and its association testing functions allow for heterogeneous variance and the use of sparse matrices for large sample sizes. We will also create a workflow, available to all users, for the BioData Catalyst platform powered by Seven Bridges, enabling a seamless interface from single variant association testing to multiple phenotype testing. In the BioData Catalyst ecosystem, a user can execute existing workflows within a high-performance cloud computing environment, removing the need for data transfer between collaborators or access to an on-site computing cluster.
The utility of the multiple phenotype association method will be demonstrated in whole-genome sequence data from the NHLBI TOPMed Program, a large, multi-ethnic sample set. Multiple trait association testing will be performed in participants from 13 studies with measurements of seven red blood cell traits: hemoglobin, hematocrit, mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, mean corpuscular volume, red blood cell count, and red cell distribution width. The multiple phenotype association results will be compared with the results from testing each phenotype individually to identify pleiotropy and any novel associations.
|