Authors |
Pradeep Natarajan, James Perry, Akhil Pampana, Jai Broome, Jeff O’Connell, Fei Fei Wang, Alyna Khan, May Montasser, Lawrence Bielak, Daniel Weeks, Lisa Yanek, Juan Peralta, Stella Aslibekyan, Nicholette D. Allred, Brian E. Cade, Paul de Vries, Joshua Bis, Charles Kooperberg, James Wilson, Adolfo Correa, Debbie Nickerson, Gail Jarvik, L. Adrienne Cupples, Donna Arnett, Braxton Mitchell, Cathy Laurie, Stephen S. Rich, Jerome I. Rotter, Sekar Kathiresan, Cristen Willer, Gina M. Peloso; on behalf of the NHLBI TOPMed Lipids Working Group
|
Abstract Text |
Introduction
Genetic analyses of plasma lipids (total cholesterol, LDL-C, HDL-C, and triglycerides) have yielded fundamental biological, clinical, and therapeutic insights for coronary heart disease (CHD). Whole genome sequencing now permits the most comprehensive genetic analysis of plasma lipids across large sample sizes.
Methods
Deep-coverage (>30X) whole genome sequences were generated as part of the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program, and plasma lipids were obtained for the sequenced individuals. Each variant with minor allele count (MAC) > 20 was tested individually for association with plasma lipids, adjusting for age, age^2, sex, cohort, genotypically derived kinship, and prevalent Amish founder mutations in the dataset. Analyses were performed using MMAP and OASIS. Based on prior simulation analyses, we assigned statistical significance at P < 1x10^-8.
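As an illustration of the two thresholds stated above (MAC > 20 and P < 1x10^-8), a minimal Python sketch follows. It is not the MMAP or OASIS implementation. The function names, the 0/1/2 genotype encoding, and the handling of missing calls are assumptions made for this example.

```python
# Illustrative sketch only: genotype encoding and helper names are assumptions,
# not part of the MMAP/OASIS pipelines named in the abstract.

MAC_THRESHOLD = 20   # variants are kept when MAC > 20
P_THRESHOLD = 1e-8   # significance is assigned when P < 1x10^-8


def minor_allele_count(genotypes):
    """Return the minor allele count for one variant.

    `genotypes` holds one entry per individual: the number of alternate
    alleles carried (0, 1, or 2), or None for a missing call (assumed encoding).
    """
    called = [g for g in genotypes if g is not None]
    alt_count = sum(called)
    total_alleles = 2 * len(called)
    # The minor allele is whichever allele is rarer in the sample.
    return min(alt_count, total_alleles - alt_count)


def passes_mac_filter(genotypes):
    """True when the variant has MAC strictly greater than the threshold."""
    return minor_allele_count(genotypes) > MAC_THRESHOLD


def is_significant(p_value):
    """True when the association P value is strictly below the threshold."""
    return p_value < P_THRESHOLD
```

Taking the minimum of the two allele counts keeps the filter symmetric, so a variant whose alternate allele is the common one is judged by the rarer reference allele.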
Results
Whole genome sequences and plasma lipids were obtained for 28,541 ethnically diverse individuals across 14 cohorts (FHS, JHS, Amish, MESA, GENOA, GeneStar, SAS, SAFS, GOLDN, DHS, CFS, WHI, ARIC, CHS) and combined into a single dataset. A total of 35.9M high-quality genomic variants with MAC > 20 were included. Among the observed associations, four lead variants were at putatively novel sites, defined as lying outside a +/- 500 kb window around each of 250 previously reported significant variants. Although just outside the defined window, the 15q15.3 lead variant, associated with triglycerides, was in strong linkage disequilibrium with a known associated variant. The 9q31.1 lead variant, associated with LDL-C, is a low-frequency (MAF 0.4%) synonymous SNP in RNF20. The 11q14.1 (MAF 0.4%) and 11q23.3 (MAF 0.2%) lead variants were in non-coding sequences and associated with triglycerides. The 11q14.1 variant is near PRCP, whose product is a regulator of energy expenditure and fat mass. The 11q23.3 variant is ~511 kb away from the APOC3-A4-A1 cluster but is not in linkage disequilibrium with previously associated variants.
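The distance-based novelty criterion used above can be sketched as follows. This is a hedged illustration, not the authors' code: the function name, the (chromosome, position) tuple representation, and the choice to treat a distance of exactly 500 kb as inside the window are assumptions. The coordinates in the usage example are invented and do not correspond to the reported variants.

```python
# Illustrative sketch only: names, data layout, and boundary handling are assumptions.

WINDOW_BP = 500_000  # +/- 500 kb around each previously reported variant


def is_novel(chrom, pos, known_variants, window_bp=WINDOW_BP):
    """Return True if (chrom, pos) lies outside the window of every known variant.

    `known_variants` is an iterable of (chromosome, base-pair position) tuples
    for previously reported significant variants.
    """
    for known_chrom, known_pos in known_variants:
        # Only variants on the same chromosome can fall within the window.
        if known_chrom == chrom and abs(pos - known_pos) <= window_bp:
            return False
    return True


# Usage with made-up coordinates: a variant 511 kb from a known one counts as
# novel by distance, while one 400 kb away does not.
known = [("11", 1_000_000)]
print(is_novel("11", 1_511_000, known))
print(is_novel("11", 1_400_000, known))
```

As the 15q15.3 result shows, passing this distance check does not establish independence from a known signal, so linkage disequilibrium with previously associated variants still needs to be examined separately.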
Conclusions
Deep-coverage whole genome sequence association with plasma lipids in 28,541 ethnically diverse individuals yields putatively novel associations even at sample sizes much smaller than those of large array-based genome-wide association analyses.
|