
Analysis

Causal variant effect sizes of complex traits differ between populations

Authors
S. Musharoff1,2, R. Patel1, J. Spence1, H. Pimentel3, C. Tcheandjieu4,2, H. Mostafavi1, N. Sinnott-Armstrong1,2, S. L. Clarke4,2, C. Smith5, VA Million Veteran Program, P. P. Durda6, K. D. Taylor7, R. Tracy8, Y. Liu9, C. Johnson10, S. S. Rich11, J. I. Rotter12, F. Aguet13, K. G. Ardlie13, S. Gabriel13, D. Nickerson10, J. D. Smith14, P. Tsao15, M. Przeworski16, S. B. Montgomery1, T. Assimes4,2, J. Pritchard1

1 Stanford Univ, Stanford, CA, USA, 2 VAPAHCS, Palo Alto, CA, USA, 3 University of California, Los Angeles, Los Angeles, CA, USA, 4 Stanford Univ Sch Medicine, Stanford, CA, USA, 5 Genetics, Stanford University, Stanford, CA, USA, 6 Univ Vermont, Colchester, VT, USA, 7 TGPS, Lundquist Institute, Harbor-UCLA Med Ctr, Torrance, CA, USA, 8 University of Vermont, Burlington, VT, USA, 9 Duke Molecular Physiology Institute, Durham, NC, USA, 10 University of Washington, Seattle, WA, USA, 11 Center for Public Health Genomics, Univ Virginia, Charlottesville, VA, USA, 12 Inst Translation Genomics & Population Sci, Lundquist Institute, Harbor-UCLA Med Ctr, Torrance, CA, USA, 13 Broad Institute of MIT and Harvard, Cambridge, MA, USA, 14 Genome Sciences, Univ of Washington, Seattle, WA, USA, 15 Research Administration, VAPAHCS, Palo Alto, CA, USA, 16 Biological Sciences, Columbia Univ, New York, NY, USA.
Name and Date of Professional Meeting
American Society of Human Genetics Annual Meeting (October 19, 2021)
Abstract Text
Despite the growing number of genome-wide association studies for complex traits, it remains unclear whether effect sizes of causal genetic variants differ between populations. Effect sizes of causal variants can differ between populations due to gene-by-gene or gene-by-environment interactions, which have important implications for the study of complex traits and for the use of polygenic scores. However, comparing causal variant effect sizes is challenging: causal variants are hard to identify, and comparisons of their tag SNPs’ effect sizes are confounded by differences in linkage disequilibrium (LD) patterns and allele frequencies between ancestries. Here, we develop an approach to assess causal variant effect size differences without needing to identify the causal variants themselves. Specifically, we leverage the fact that segments of European ancestry shared between European-American and admixed African-American individuals have the same LD patterns and allele frequencies, such that comparisons of effect sizes in these regions are not confounded by these factors. We apply this approach to gene expression in the Multi-Ethnic Study of Atherosclerosis (MESA) and blood lipid levels in the Million Veteran Program (MVP). First, we find that global ancestry, local ancestry, and genotype-by-local-ancestry interactions all significantly contribute to phenotypic variance, demonstrating the complex association between ancestry and trait architecture. We next find that causal variant effect sizes for gene expression and blood lipid levels differ between European-Americans and African-Americans, even when accounting for differences in LD patterns and allele frequencies. These cross-population differences in causal variant effect sizes are likely due to gene-by-gene or gene-by-environment interactions, highlighting the role of genetic interactions in trait architecture. 
Furthermore, these differences may contribute to the poor portability of polygenic scores across populations, reinforcing the importance of conducting GWAS on individuals of diverse ancestries and environments.
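A generic way to ask whether an effect size differs between two groups is a z-test on the difference between two independent estimates. The sketch below is illustrative only: it is not the local-ancestry approach described in this abstract, and the function name and input numbers are hypothetical.

```python
import math

def effect_difference_test(beta_a, se_a, beta_b, se_b):
    """Two-sided z-test for a difference between two independent effect estimates."""
    # The variance of a difference of independent estimates is the sum of their variances.
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value under a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical per-group estimates for one variant (not taken from MESA or MVP).
z, p = effect_difference_test(beta_a=0.30, se_a=0.05, beta_b=0.10, se_b=0.05)
print(f"z = {z:.3f}, p = {p:.4g}")
```

Such a test assumes the two estimates are independent and says nothing about why they differ; differences in LD and allele frequency, which the abstract's design controls for, would still confound a naive comparison of tag SNPs.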

Mitochondrial and sex chromosome genetically regulated gene expression implicates new genes in complex traits across multiple human populations.

Authors
D. Araújo, S. S. Rich, J. I. Rotter, H. Im, A. W. Manichaikul, H. E. Wheeler, NHLBI TOPMed Consortium
Name and Date of Professional Meeting
American Society of Human Genetics, October 18-21, 2021
Abstract Text
The majority of GWAS are conducted in European ancestry populations and are limited to autosomal chromosomes, ignoring the genetic content of the mitochondria and sex chromosomes. Alongside GWAS, transcriptome-wide association studies (TWAS) can provide useful information about the direction of gene regulation underlying complex traits. Given the genetic diversity among individuals, we sought to build mitochondrial and sex chromosome transcriptome prediction models for use in TWAS in diverse populations, including those underrepresented in GWAS and TWAS.

We used transcriptome data from the Multi-Ethnic Study of Atherosclerosis (MESA), comprising up to 1004 individuals of African, Chinese, European and Hispanic/Latino ancestries. For each of 3 blood cell types, peripheral blood mononuclear cells (PBMC), CD16+ monocytes, and CD4+ T cells, we built models in each population and also a model including all individuals. We used cross-validated elastic net to estimate gene expression from local SNPs within 1Mb of each gene through an additive linear model. Depending on population, our modeling resulted in 24-57 genes with Spearman correlation ρ>0.1. Smaller sample sizes were available for monocytes and T cells, resulting in 3-16 and 4-13 genes with ρ>0.1, respectively. Most predicted genes were on the X chromosome, while few Y chromosome and mitochondrial genes had ρ>0.1. With these prediction models, we applied S-PrediXcan to X chromosome GWAS summary statistics from two different multi-ancestry studies, the Population Architecture using Genomics and Epidemiology (PAGE) study (n=49,839) and Pan UK Biobank (PanUKB, n=488,377). We identified 5 gene-trait pairs that were significant in both studies (P<0.05 after Bonferroni correction for the number of genes in each model): GRIPAP1 associated with diastolic blood pressure, STARD8 with platelet count, PLXNA3 with triglyceride levels, and both TSC22D3 and SPIN2B with height.
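The model-building step above (an elastic net on local SNPs, with models kept when Spearman ρ>0.1) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the penalty parameterization, the function names, and the toy data are all hypothetical, no cross-validation is performed, and ρ is computed in-sample for brevity.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator that arises from the L1 part of the penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*||b||^2).

    X is a list of rows (individuals) of centered SNP dosages; y is centered expression.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of SNP j with the residual that excludes SNP j's own contribution.
            rho = 0.0
            for i in range(n):
                partial_fit = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial_fit)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
    return b

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(a, b):
    """Spearman rho: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

# Toy data: 4 individuals, 2 centered SNPs; only the first SNP affects expression.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2.0, -2.0, 2.0, -2.0]
weights = elastic_net(X, y, lam=0.5, alpha=0.5)
predicted = [sum(w * x for w, x in zip(weights, row)) for row in X]
rho = spearman(predicted, y)
keep_model = rho > 0.1  # in-sample here; a real analysis would use held-out folds
print(weights, rho, keep_model)
```

In this toy case the penalty shrinks the first weight below its true value of 2 and sets the second to exactly zero, which is the sparsity-plus-shrinkage behavior that motivates elastic net for SNP-based prediction.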

Of these 5 gene-trait pairs, only the STARD8-platelet count association had been reported previously; thus, the remaining 4 correlations may be novel. For the mitochondrial genes, GWAS summary statistics were only available from the UK Biobank. We identified statistically significant correlations between MT-ND3 and mean corpuscular hemoglobin, mean corpuscular volume, mean platelet volume, plateletcrit, red blood cell count and red blood cell distribution width (P<3.3×10⁻⁷). We expect that conducting more integrative omics studies that include mitochondria and sex chromosomes in multi-ethnic cohorts will identify new gene-trait associations and promote diversity in biomedical research.
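The significance criterion used for the gene-trait pairs (P<0.05 after Bonferroni correction for the number of genes in each model) amounts to dividing the nominal level by the number of tests. A minimal sketch follows; the gene names and p-values are hypothetical, and the number of tests behind the thresholds reported in the abstract is not stated, so it is not reproduced here.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that controls the family-wise error rate at alpha."""
    return alpha / n_tests

def significant(p_values, alpha=0.05):
    """Return the names of tests whose p-value passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]

# Hypothetical gene-level p-values from one prediction model (illustrative only).
p_values = {"GENE_A": 1e-6, "GENE_B": 0.02, "GENE_C": 0.2, "GENE_D": 0.009}
hits = significant(p_values)
print(hits)
```

With four tests the per-test threshold is 0.0125, so a gene with P=0.02 would pass a nominal 0.05 cutoff but not the corrected one.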

Protein prediction for trait mapping in diverse populations

Authors
R. Schubert, I. Gregga, E. Geoffroy, A. Mulford, A. W. Manichaikul, H. Im, H. E. Wheeler, NHLBI TOPMed Consortium
Name and Date of Professional Meeting
American Society of Human Genetics, October 18-21, 2021
Abstract Text
Genetically regulated gene expression has helped elucidate the biological mechanisms underlying complex traits and similar interrogation of the proteome is now possible. Here, we used the Trans-omics for Precision Medicine (TOPMed) Multi-omics pilot study, which comprises data from participants in the Multi-Ethnic Study of Atherosclerosis (MESA) cohort, to optimize genetic predictors of the plasma proteome for genetically regulated proteome association studies.

For 1305 proteins measured by a SOMAscan assay, we compared predictive models built via baseline elastic net regression to models integrating posterior inclusion probabilities estimated by fine-mapping SNPs prior to elastic net. In order to investigate the transferability of predictive models across ancestries, we built protein prediction models in five race/ethnic groups from MESA: African American (AFA, n = 183), Chinese (CHN, n = 71), European (EUR, n = 416), Hispanic/Latino (HIS, n = 301), and all populations combined (ALL, n = 971).
We successfully built predictive models for 1187 unique proteins (R>0.1). As expected, fine-mapping produced more protein prediction models. Despite differences in sample size, EUR, HIS, and AFA training populations produced comparable numbers of predictive models. We used INTERVAL (n=3,301), a European ancestry study, for out-of-sample estimation of model performance. For the proteins predicted by both the ALL and EUR training populations in INTERVAL, the ALL population predicted better than EUR with both the baseline (p=0.0012) and fine-mapped (p=0.0064) model building strategies. At current training population sample sizes, performance between baseline and fine-mapped protein prediction models was similar.

Using GWAS summary statistics from the Population Architecture using Genomics and Epidemiology (PAGE) study, which comprises ∼50,000 Hispanic/Latinos, African Americans, Asians, Native Hawaiians, and Native Americans, we applied S-PrediXcan to perform proteome association studies for 28 complex traits. The most protein-trait associations were discovered, colocalized, and replicated using proteome model training populations with similar ancestries to PAGE (i.e. predominantly African American and Hispanic). These 21 distinct associations provide more evidence that the SNPs at the locus are acting through protein abundance regulation to affect the associated phenotype and include: HP and ApoE associated with cholesterol traits; and CRP, IL-1Ra, IL-6 sRa, and Apo E associated with C-reactive protein. More omics data in diverse populations are needed to better understand the mechanisms underlying complex traits in all populations.
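For readers unfamiliar with summary-statistic association methods such as S-PrediXcan, the gene-level (here, protein-level) statistic is commonly written as Z_g ≈ Σ_l w_l (σ_l/σ_g) z_l, where w_l are prediction-model weights, z_l are GWAS z-scores, σ_l is the standard deviation of SNP l, and σ_g² = wᵀΓw with Γ the SNP covariance from a reference panel. The sketch below implements that formula as stated here; it is an assumption-laden illustration with hypothetical inputs and a hypothetical function name, not the S-PrediXcan software or its API.

```python
import math

def summary_stat_gene_z(weights, snp_cov, gwas_z):
    """Approximate gene- or protein-level z-score from GWAS summary statistics.

    weights: prediction-model weight per SNP
    snp_cov: SNP-by-SNP covariance matrix from a reference panel (list of lists)
    gwas_z:  GWAS z-score (beta / se) per SNP
    """
    m = len(weights)
    # Variance of the predicted trait: w' * Cov * w.
    var_pred = sum(weights[i] * snp_cov[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    sigma_g = math.sqrt(var_pred)
    # Weighted sum of SNP z-scores, each scaled by that SNP's standard deviation.
    numerator = sum(weights[l] * math.sqrt(snp_cov[l][l]) * gwas_z[l]
                    for l in range(m))
    return numerator / sigma_g

# Hypothetical two-SNP model with uncorrelated, unit-variance SNPs.
z_gene = summary_stat_gene_z(
    weights=[0.5, -0.2],
    snp_cov=[[1.0, 0.0], [0.0, 1.0]],
    gwas_z=[3.0, 1.0],
)
print(round(z_gene, 3))
```

A quick sanity check on the formula: with a single SNP and a positive weight, the gene-level z-score reduces to that SNP's GWAS z-score.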