Population Genetics

Genetic and phenotypic association analyses of cardiometabolic traits in diverse African samples with whole-genome sequencing data

Authors
Daniel Hui*, Matt Hansen*, Daniel Harris, Michael McQuillan, Dan Ju, Alexander Platt, William Beggs, Sunungouko Wata Mpoloka, Gaonyadiwe George Mokone, Gurja Belay, Thomas Nyambo, Stephen Chanock, Meredith Yeager, TOPMed Consortium, Giorgio Sirugo, Marylyn D. Ritchie, Scott Williams, Sarah A. Tishkoff
Name and Date of Professional Meeting
American Society of Human Genetics, November 2023
Associated paper proposal(s)
Working Group(s)
Abstract Text
African populations demonstrate exceptional genetic and phenotypic diversity, due in part to their varied environments, lifestyles, and demographic history. We conducted genetic and phenotypic association analyses in 6,965 geographically and ethnically diverse Sub-Saharan African individuals (6,280 with whole-genome sequences from the NIH TOPMed consortium and 685 with genotypes from Illumina arrays), using 15 cardiometabolic phenotypes (range 686-6,854 individuals/trait). Each phenotype had at least one ethnicity with significantly differing mean values compared to the remaining cohort, such as short stature in the Baka rainforest hunter-gatherers of Cameroon, and high adiposity in the Herero pastoralists of Botswana. An analysis of ethnicity-sex interactions revealed several ethnic groups with significant sexual dimorphism for at least one cardiometabolic phenotype, such as Herero women having markedly higher body mass index than men. Comparison between the African cohort and African-ancestry UK Biobank (UKBB) individuals showed that the latter have higher mean values than any of the 53 African ethnic groups for multiple cardiometabolic measurements, including low-density lipoprotein cholesterol (LDL), body fat percentage (BFP), and systolic blood pressure. We also found that phenotype-phenotype correlations differ between the UKBB and African cohort, as well as between African ethnicities. For example, BFP and LDL had low correlation in the UKBB (R = 0.04) but showed a range of correlation among African groups, from R = 0.00 in the Maasai pastoralists of eastern Africa to R = 0.43 in the Agaw agriculturalists of Ethiopia. Genome-wide association analyses identified 76 significantly associated loci (p < 5.0 × 10⁻⁸), with 14 passing a more stringent empirical threshold (p < 3.0 × 10⁻⁹), including APOE and APOC1 loci for various blood lipids, PCSK9 for LDL, and CETP for high-density lipoprotein cholesterol (HDL), as well as novel loci.
Set-based rare variant analyses for loss-of-function variants found 12 gene-phenotype associations, replicating known associations of PCSK9 and APOE with LDL and total cholesterol and uncovering several novel gene-trait associations for adiposity traits and HDL. Ongoing analyses include phenotype associations with subsistence and genetically inferred ancestry, replication of genetic associations, and gene-set enrichment. In total, these results offer insights into the genetic and phenotypic landscape of cardiometabolic traits in African populations. This work was supported by ADA grant 1-19-VSN-02 and NIH grants 1R35GM134957, R01DK104339, R01AR076241, and 1X01HL139409-01.
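
The per-group phenotype-phenotype correlations reported in this abstract (for example, BFP versus LDL within each ethnic group) can be illustrated with a minimal sketch. The abstract does not say which correlation estimator or covariate adjustments were used, so the plain Pearson correlation, the function names, and the record layout below are all assumptions made for illustration only.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation; returns NaN if either variable is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def correlation_by_group(records):
    """records: iterable of (group_label, trait_a, trait_b) tuples (hypothetical layout)."""
    grouped = defaultdict(lambda: ([], []))
    for group, a, b in records:
        grouped[group][0].append(a)
        grouped[group][1].append(b)
    return {group: pearson_r(a_vals, b_vals) for group, (a_vals, b_vals) in grouped.items()}


# Toy values only, not study data.
toy_records = [
    ("group_A", 1.0, 2.0), ("group_A", 2.0, 4.0), ("group_A", 3.0, 6.0),
    ("group_B", 1.0, 1.0), ("group_B", 2.0, -1.0),
    ("group_B", 3.0, -1.0), ("group_B", 4.0, 1.0),
]
toy_result = correlation_by_group(toy_records)
```

Computing the statistic separately within each group, rather than once on the pooled cohort, is what allows between-group differences in correlation to show up at all.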

The extent to which augmenting extant reference panels with population-specific sequences improves imputation quality

Authors
Jenna C. Carlson
Mohanraj Krishnan
Shuwei Liu
Kevin Anderson
Jerry Z. Zhang
Hong Cheng
Take Naseri
Muagututi‘a Sefuiva Reupena
Satupa‘itea Viali
Ranjan Deka
Nicola L. Hawley
Stephen T. McGarvey
Daniel E. Weeks
Ryan L. Minster
Name and Date of Professional Meeting
American Society of Human Genetics Annual Meeting (November 1-5, 2023)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Genotype imputation is fundamental to association studies, and even gold-standard panels like TOPMed are limited in the populations and variants for which they yield good imputation.

To quantify the impact that varying the number of population-specific sequences in the reference panel has on imputation quality, we constructed six in-house reference panels from 2,504 1000G samples plus varying numbers of whole-genome-sequenced Samoan samples (4, 24, 48, 96, 384, and 1,285). Each reference panel was used to impute genotype data for 1,897 Samoan participants who were not part of any reference panel. To assess performance, we examined average imputation quality (r²) and the number of well-imputed variants (r² ≥ 0.8) on chromosomes 5 and 21 and compared the in-house panels to two gold-standard reference panels: TOPMed and 1000G Phase III. To further characterize variants that might gain the most in imputation accuracy, we also calculated LD scores and split them into low and high strata at the median value within MAF bins.

The 1000G + 1,285 Samoan panel yielded more than 200,000 additional high-quality variants on chromosome 5 compared with the TOPMed panel, with 48,374 of these having a MAF ≥ 0.01. The largest gains were seen for lower-frequency variants, with up to a 125% increase in well-imputed variants with MAF < 0.01 compared to the TOPMed imputation. Imputation quality increased as the number of Samoans represented in the panel increased. Panels that included 48 or more Samoans outperformed the TOPMed panel for all variants with MAF ≥ 0.001. The gains in imputation quality for the 1000G + 1,285 Samoan reference panel compared to the TOPMed panel were greatest for low LD score variants.

For rs200884524, a variant on chromosome 5 associated with dyslipidemia and enriched in Polynesians, the imputation quality was highest (r² = 0.89-0.95) for the reference panels that included Samoan haplotypes. Additionally, the imputed MAF from the reference panels with Samoans (0.207-0.222) was much closer to what is expected via targeted genotyping (0.202-0.233).

While not necessarily prescriptive for future studies, in this study we showed that as few as 48 population-specific participants added to 1000G yielded superior imputation quality to TOPMed. Our findings also demonstrated that panels containing Samoan-specific haplotypes most improve the imputation of population-specific variants located in small LD blocks. These findings provide a framework to help future studies construct reference panels of their own to obtain high-quality imputation for genetic association studies.
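
The summary metrics described in this abstract (average r², the count of variants with r² ≥ 0.8, and an LD-score split at the median within MAF bins) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the MAF bin edges, the dictionary keys, and the function names are hypothetical, and a real analysis would read these values from the imputation software's per-variant output.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical MAF bin edges; the abstract does not list the bins actually used.
MAF_BIN_EDGES = [0.0, 0.001, 0.01, 0.05, 0.5]


def maf_bin(maf, edges=MAF_BIN_EDGES):
    """Return the (lower, upper) bin containing maf, with lower <= maf < upper."""
    for lower, upper in zip(edges, edges[1:]):
        if lower <= maf < upper:
            return (lower, upper)
    return (edges[-2], edges[-1])  # a MAF of exactly 0.5 goes in the top bin


def summarize_panel(variants, r2_threshold=0.8):
    """variants: list of dicts with keys 'maf', 'r2', 'ld_score' (assumed layout)."""
    by_bin = defaultdict(list)
    for variant in variants:
        by_bin[maf_bin(variant["maf"])].append(variant)

    summary = {}
    for bin_key, members in by_bin.items():
        # Split at the median LD score computed within this MAF bin only.
        ld_median = median([v["ld_score"] for v in members])
        strata = {"low": [], "high": []}
        for v in members:
            stratum = "low" if v["ld_score"] <= ld_median else "high"
            strata[stratum].append(v["r2"])
        summary[bin_key] = {
            "mean_r2": mean([v["r2"] for v in members]),
            "n_well_imputed": sum(v["r2"] >= r2_threshold for v in members),
            "mean_r2_by_ld": {
                name: (mean(vals) if vals else None) for name, vals in strata.items()
            },
        }
    return summary


# Toy values only, not study data.
toy_variants = [
    {"maf": 0.005, "r2": 0.90, "ld_score": 1.0},
    {"maf": 0.005, "r2": 0.50, "ld_score": 3.0},
    {"maf": 0.200, "r2": 0.95, "ld_score": 10.0},
    {"maf": 0.200, "r2": 0.85, "ld_score": 20.0},
]
toy_summary = summarize_panel(toy_variants)
```

Running the same summary once per reference panel and comparing the per-bin results is one way to reproduce the kind of panel-versus-panel comparison the abstract reports.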

Whole Genome Sequence Analysis of the Plasma Proteome in Black Adults

Authors
Daniel H. Katz*1, Usman A. Tahir*1, Alexander G. Bick*2, Akhil Pampana2, Debby Ngo1, Mark D. Benson1, Zhi Yu2, Jeremy M. Robbins1, Zsu-Zsu Chen1, Daniel E. Cruz1, Shuliang Deng1, Laurie Farrell1, Sumita Sinha1, Dongxiao Shen1, Yan Gao3, Michael E. Hall4, Adolfo Correa4, Russell P. Tracy5, Peter Durda5, Kent D. Taylor6, Yongmei Liu7, W. Craig Johnson8, Xiuqing Guo6, Jie Yao6, Yii-Der Ida Chen6, Ani W. Manichaikul9, 10, Deepti Jain11, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Claude Bouchard12, Mark A. Sarzynski13, Stephen S. Rich9, Jerome I. Rotter6, Thomas J. Wang14, James G. Wilson1, Pradeep Natarajan2, 15, 16, and Robert E. Gerszten†1, 2
Name and Date of Professional Meeting
AHA Scientific Sessions (Nov 13-15, 2021)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Introduction: Plasma proteins are critical mediators of cardiovascular processes and are the targets of many drugs. Previous efforts to characterize the genetic architecture of the plasma proteome have been limited by a focus on individuals of European descent and by reliance on genotyping arrays and imputation.
Hypothesis: Whole genome sequence analysis of the plasma proteome in individuals with greater African ancestry will increase power to identify novel genetic determinants.
Methods: Proteomic profiling of 1,301 proteins was performed in 1,852 Black adults from the Jackson Heart Study using aptamer-based proteomics (SomaScan®). Whole genome sequencing association analysis was performed for all variants with minor allele count ≥ 5. Results were validated using an alternative, antibody-based proteomic platform (Olink®) and replicated in the Multi-Ethnic Study of Atherosclerosis and the HERITAGE Family Study.
Results: We identify 569 genetic associations between 479 proteins and 438 unique genetic regions at a Bonferroni-adjusted significance level of 3.8 × 10⁻¹¹. These associations include 134 novel locus-protein relationships and an additional 205 novel sentinel variant-protein relationships. Novel cardiovascular findings include new protein associations at the APOE gene locus, including ZAP70 (sentinel single nucleotide polymorphism [SNP] rs7412-T, β = 0.61±0.05, p = 3.27 × 10⁻³⁰) and MMP-3 (β = -0.60±0.05, p = 1.67 × 10⁻³²), as well as a completely novel pleiotropic locus at the HPX gene, associated with nine proteins. Further, the associations suggest new mechanisms of genetically mediated cardiovascular disease linked to African ancestry; we identify a novel association between variants linked to APOL1-associated chronic kidney and heart disease and the protein CKAP2 (rs73885319-G, β = 0.34±0.04, p = 1.34 × 10⁻¹⁷), as well as an association between ATTR amyloidosis and RBP4 levels in community-dwelling individuals without heart failure.
Discussion: Taken together, these results provide evidence for the functional importance of variants in non-European populations, and suggest new biological mechanisms for ancestry-specific determinants of lipids, coagulation and myocardial function.
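
The abstract does not state how its Bonferroni-adjusted threshold was derived. One reading that is numerically consistent with the figures given is the conventional genome-wide threshold of 5 × 10⁻⁸ divided by the 1,301 proteins profiled; the short sketch below checks that arithmetic. The base threshold of 5 × 10⁻⁸ and the function name are assumptions, not details taken from the abstract.

```python
def bonferroni_threshold(base_alpha, n_tests):
    """Divide a base significance level by the number of tests."""
    return base_alpha / n_tests


GENOME_WIDE_ALPHA = 5e-8  # assumption: conventional genome-wide threshold
N_PROTEINS = 1301         # number of proteins profiled, from the abstract

threshold = bonferroni_threshold(GENOME_WIDE_ALPHA, N_PROTEINS)
print(f"{threshold:.1e}")  # prints 3.8e-11
```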