
Metabolomics and Proteomics

Catalog and Genetic Architecture of Circulating Metabolites from Trans-Omics for Precision Medicine (TOPMed) Program

Authors
Nannan Wang, Taryn Alkis, Tom Blackwell, Russell P. Bowler, Clary B. Clish, Anne M. Evans, Robert E. Gerszten, Megan L. Grove, Scott R. Hutton, Rachel S. Kelly, Charles Kooperberg, Martin G. Larson, Deborah A. Meyers, Laura M. Raffield, Vasan S. Ramachandran, Alexander P. Reiner, Stephen S. Rich, Jerome I. Rotter, Edwin K. Silverman, Albert V. Smith, Jessica Lasky-Su, Kari E. Wong, NHLBI Trans-Omics for Precision Medicine (TOPMed) Metabolomics Working Group, Han Chen and Bing Yu
Name and Date of Professional Meeting
ASHG Nov 2023
Associated paper proposal(s)
Working Group(s)
Abstract Text
Circulating metabolite levels, which reflect the state of human health and disease, can be affected by genetic variation. The NHLBI Trans-Omics for Precision Medicine (TOPMed) Program has sponsored metabolomic measures in ~100,000 samples across multiple studies to promote discovery of causal molecular pathways and therapeutic targets. We initiated a standard operating procedure (SOP) to harmonize metabolite data across TOPMed studies. In Phase 1, we catalogued 1,730 circulating metabolites from two metabolomics cores (25,058 samples; 53% females) and made them accessible through the TOPMed portal.
Metabolite levels are heritable. However, their genetic architecture is not fully understood, including the generalizability of findings from European-ancestry-dominant studies and the identification of sex-specific metabolic signatures. Whole genome sequencing (WGS) data were available for 16,359 samples (54% females) with metabolite data from eight studies, including African, Asian, European, and American ancestries. We performed single-variant analyses (minor allele frequency > 0.5%) on 1,135 circulating metabolites (missing rate < 50%), using sex-pooled and sex-stratified approaches with the GMMAT pipeline on BioData Catalyst (BDC).
We discovered 147,160 variant-metabolite associations (1,429 independent loci across 667 metabolites with P < 4.4 × 10⁻¹¹). Among the associations mapped to well-known genes, four significant loci (CPS1, ALDH1L1, PSPH, GCSH) play critical roles in glycine metabolism. We also identified potential novel loci that require further investigation, e.g., SLC22A24 was associated with 11-beta-hydroxyetiocholanolone glucuronide levels. We observed sex-specific genetic associations in ~10% of metabolites. Sex-stratified analysis identified 2,414 variant-metabolite pairs involving 194 independent loci and 74 metabolites (at a Bonferroni-corrected P = 2.5 × 10⁻⁴). We confirmed that CPS1 has a stronger effect on glycine levels in females than in males. We also identified potential novel sex-specific loci with genetic effects in only one sex group, e.g., GSPT1 for N-acetylglycine in females and ABCC1 for glutarylcarnitine (C5-DC) in males. We also detected a potential novel association on chromosome X, i.e., ARSD for ascorbic acid 3-sulfate in females. The analytical pipeline is accessible through BDC, while sex-pooled and sex-stratified summary statistics are accessible through the dbGaP Exchange Area.
In summary, we created a catalog for TOPMed metabolomics data and identified potential novel sex-pooled and sex-specific genetic associations contributing to our understanding of human circulating metabolites.
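The sex-pooled significance threshold reported above (4.4 × 10⁻¹¹) is numerically what one gets by dividing a genome-wide significance level by the number of metabolites tested. The abstract does not state how the threshold was derived, so the sketch below is only an illustration under that assumption: the value 5 × 10⁻⁸ and the function name `bonferroni_threshold` are not from the source.

```python
# Illustrative only: assumes the threshold is a genome-wide significance
# level divided by the number of metabolites analyzed. The abstract does
# not confirm this derivation.

GENOME_WIDE_ALPHA = 5e-8   # conventional genome-wide level (assumption)
N_METABOLITES = 1135       # metabolites analyzed, as stated in the abstract


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance level after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


threshold = bonferroni_threshold(GENOME_WIDE_ALPHA, N_METABOLITES)
print(f"{threshold:.1e}")  # prints 4.4e-11
```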

Practical recommendations for TOPMed metabolomics data

Authors
Franklin Ockerman, Laura Zhou, Emily Drzymalla, Taryn Alkis, Megan Grove, Bing Yu, Laura Raffield
Name and Date of Professional Meeting
American Society of Human Genetics Annual Meeting (November 1-5, 2023)
Associated paper proposal(s)
Working Group(s)
Abstract Text
The Trans-Omics for Precision Medicine (TOPMed) program expects to soon release over 90,000 samples with broad-spectrum metabolomic data, representing over a dozen studies. However, investigators using this resource face potential challenges in pre-processing and integrating data across studies. Differing metabolomic platforms and analysis centers may cause technical variation. Likewise, missing metabolite values may vary in their distribution and source between studies. Consistent protocols for pre-processing and integration are thus necessary to unlock the potential of this rich resource. We compare several strategies and offer recommendations for the TOPMed community, with the goal of guiding and facilitating future genetic and phenotype-specific analyses.
As a pilot phase, we are currently analyzing data from 25,058 participants from diverse case-control and population-based cohort studies, including 15,633 participants from 3 cohort studies on the Metabolon platform and 9,425 participants from 5 cohort studies on the Broad/BIDMC platform. This dataset includes 1,730 named metabolites, of which 364 were measured in at least some cohorts on both platforms. Using within-study rank-based inverse normal transformation, we demonstrate that estimates of age-metabolite associations are highly concordant between pooled and inverse-variance meta-analyzed data (r > 0.999) and generally consistent with the existing literature, although 36 metabolites are significant only in the meta-analysis. Most named metabolites had very low missingness in our dataset, and we found that metabolite associations with age and sex were highly consistent across all missing-value imputation strategies (zero, min, half-min, k-nearest neighbors, random forest, quantile regression imputation of left-censored data). We recommend replacing missing values with zero in metabolites characterized as xenobiotics. For other metabolites, we will compare imputation strategies with an analysis of metabolite quantitative trait loci (mQTLs).
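As a rough illustration of the two steps compared in this paragraph, the following sketch applies a rank-based inverse normal transformation to one metabolite within one study and then combines per-study effect estimates with a fixed-effect inverse-variance meta-analysis. It is not the authors' script: the function names, the rank offset `c = 0.5`, the use of `None` for missing values, and the fixed-effect model are all assumptions made for the example.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=0.5):
    """Rank-based inverse normal transformation for one study.

    `values` may contain None for missing measurements; these stay None.
    Tied values share their average rank. The offset `c` is an assumed
    default (c = 3/8 gives the Blom variant).
    """
    observed = sorted((v, i) for i, v in enumerate(values) if v is not None)
    n = len(observed)
    out = [None] * len(values)
    normal = NormalDist()
    j = 0
    while j < n:
        k = j
        while k + 1 < n and observed[k + 1][0] == observed[j][0]:
            k += 1
        avg_rank = (j + k) / 2 + 1              # 1-based average rank of the tie group
        z = normal.inv_cdf((avg_rank - c) / (n - 2 * c + 1))
        for _, idx in observed[j:k + 1]:
            out[idx] = z
        j = k + 1
    return out


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance weighted estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, (1.0 / total) ** 0.5


# Hypothetical values, for illustration only.
transformed = rank_inverse_normal([3.2, 1.1, None, 5.0, 2.2, 4.1])
pooled_beta, pooled_se = inverse_variance_meta([0.10, 0.20], [0.05, 0.05])
```

Transforming within each study before combining puts every metabolite on a common standard-normal scale, which is what allows either pooling or meta-analysis across platforms in the first place.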
In summary, we find largely consistent results in pooled and inverse-variance meta-analysis. We recommend inverse normal transformation to enable integration between studies. We recommend left-censored imputation for xenobiotics and will soon release recommendations for imputation in other metabolites. To aid investigators, we will release scripts for implementing these recommendations. Such pre-processing steps are necessary to optimize power in cross-cohort metabolomic analysis, including planned QTL studies.
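The simplest of the imputation strategies compared above (zero, min, half-min) replace every missing value of a metabolite with a single constant. A minimal sketch follows, assuming missing values are encoded as `None` and that abundances are on their raw, non-negative scale; the function name `impute_missing` is hypothetical and this is not the released TOPMed script. The k-nearest neighbors, random forest, and quantile regression approaches are not covered here.

```python
def impute_missing(values, strategy="half-min"):
    """Fill missing (None) values of one metabolite with a constant.

    strategy: "zero", "min", or "half-min", where the minimum is taken
    over the observed values of this metabolite.
    """
    observed = [v for v in values if v is not None]
    if strategy == "zero":
        fill = 0.0
    elif strategy in ("min", "half-min"):
        if not observed:
            raise ValueError("no observed values to take a minimum from")
        fill = min(observed) if strategy == "min" else min(observed) / 2
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical abundances, for illustration only.
raw = [4.0, None, 2.0]
print(impute_missing(raw, "zero"))      # [4.0, 0.0, 2.0]
print(impute_missing(raw, "half-min"))  # [4.0, 1.0, 2.0]
```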

Catalog and Genetic Architecture of Circulating Metabolites from Trans-Omics for Precision Medicine (TOPMed) Program

Authors
Nannan Wang, Taryn Alkis, Tom Blackwell, Russell P. Bowler, Clary B. Clish, Anne M. Evans, Robert E. Gerszten, Megan L. Grove, Scott R. Hutton, Rachel S. Kelly, Charles Kooperberg, Martin G. Larson, Deborah A. Meyers, Laura M. Raffield, Vasan S. Ramachandran, Alexander P. Reiner, Stephen S. Rich, Jerome I. Rotter, Edwin K. Silverman, Albert V. Smith, Jessica Lasky-Su, Kari E. Wong, NHLBI Trans-Omics for Precision Medicine (TOPMed) Metabolomics Working Group, Han Chen, Bing Yu
Name and Date of Professional Meeting
ASHG Meeting (November 1-5, 2023)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Circulating metabolite levels, which reflect the state of human health and disease, can be affected by genetic variation. The NHLBI Trans-Omics for Precision Medicine (TOPMed) Program has sponsored metabolomic measures in ~100,000 samples across multiple studies to promote discovery of causal molecular pathways and therapeutic targets. We initiated a standard operating procedure (SOP) to harmonize metabolite data across TOPMed studies. In Phase 1, we catalogued 1,730 circulating metabolites from two metabolomics cores (25,058 samples; 53% females) and made them accessible through the TOPMed portal.
Metabolite levels are heritable. However, their genetic architecture is not fully understood, including the generalizability of findings from European-ancestry-dominant studies and the identification of sex-specific metabolic signatures. Whole genome sequencing (WGS) data were available for 16,359 samples (54% females) with metabolite data from eight studies, including African, Asian, European, and American ancestries. We performed single-variant analyses (minor allele frequency ≥ 0.5%) on 1,135 circulating metabolites (missing rate < 50%), using sex-pooled and sex-stratified approaches with the GMMAT pipeline on BioData Catalyst (BDC).
We discovered 147,160 variant-metabolite associations (1,429 independent loci across 667 metabolites with P < 4.4 × 10⁻¹¹). Among the associations mapped to well-known genes, four significant loci (CPS1, ALDH1L1, PSPH, GCSH) play critical roles in glycine metabolism. We also identified potential novel loci that require further investigation, e.g., SLC22A24 was associated with 11-beta-hydroxyetiocholanolone glucuronide levels. We observed sex-specific genetic associations in ~10% of metabolites. Sex-stratified analysis identified 2,414 variant-metabolite pairs involving 194 independent loci and 74 metabolites (at a Bonferroni-corrected P = 2.5 × 10⁻⁴). We confirmed that CPS1 has a stronger effect on glycine levels in females than in males. We also identified potential novel sex-specific loci with genetic effects in only one sex group, e.g., GSPT1 for N-acetylglycine in females and ABCC1 for glutarylcarnitine (C5-DC) in males. We also detected a potential novel association on chromosome X, i.e., ARSD for ascorbic acid 3-sulfate in females. The analytical pipeline is accessible through BDC, while sex-pooled and sex-stratified summary statistics are accessible through the dbGaP Exchange Area.
In summary, we created a catalog for TOPMed metabolomics data and identified potential novel sex-pooled and sex-specific genetic associations contributing to our understanding of human circulating metabolites.
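A claim such as "a stronger effect in females than in males" is often checked by comparing the two sex-stratified effect estimates directly. The abstract does not say which test was used, so the sketch below shows one common choice, a two-sample z-test on the difference in effects, as an assumption. The function name and all numeric inputs are hypothetical, and the test treats the female and male samples as independent.

```python
from math import sqrt
from statistics import NormalDist


def sex_difference_test(beta_f, se_f, beta_m, se_m):
    """Two-sided z-test for a difference between sex-stratified effects.

    Assumes the female and male estimates come from non-overlapping,
    independent samples, so the variances of the two estimates add.
    """
    z = (beta_f - beta_m) / sqrt(se_f ** 2 + se_m ** 2)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical effect sizes and standard errors, for illustration only.
z, p = sex_difference_test(beta_f=0.5, se_f=0.03, beta_m=0.3, se_m=0.04)
print(f"z = {z:.2f}, p = {p:.1e}")
```

With these made-up inputs the difference is 0.2 with a standard error of 0.05, giving z of about 4; real summary statistics would be read from the sex-stratified results instead.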