Multi-study pQTL analysis of SomaScan proteomics in multi-ancestry TOPMed Cohorts

Authors
Catherine L. Debban, Usman Tahir, Katherine Pratte, Jennifer A. Brody, Mikyeong Lee, Claire Guo, Andrew Hill, Jayna Nicholas, Daniel H Katz, Bing Yu, James G. Wilson, Honghuang Lin, Katerina Kechris, Sina A. Gharib, Stephen S. Rich, Kent Taylor, Michael H. Cho, Jerome I Rotter, Bruce Psaty, Stephanie J London, Robert Gerszten, Laura Raffield, Russell P. Bowler, Ani Manichaikul
Name and Date of Professional Meeting
ASHG Meeting (November 1-5 2023)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Integration of genome-wide association study (GWAS) results with gene expression quantitative trait loci (eQTL) has proven a valuable first step toward identifying the molecular mechanisms underlying GWAS signals. However, many GWAS loci do not show evidence of colocalization with eQTLs. Motivated by the hypothesis that high-throughput proteomics can complement eQTLs for enhanced interpretation of GWAS signals, we assembled a pQTL resource by combining SomaScan proteomics versions with 1.3k, 5k and 7k aptamers measured in four community-based cohorts (Cardiovascular Health Study [CHS], Framingham Heart Study [FHS], Jackson Heart Study [JHS], Multi-Ethnic Study of Atherosclerosis [MESA]; total n=8,200), one smoking-enriched cohort (COPDGene; n=5,000), and one asthma-enriched cohort (the Agricultural Lung Health Study [ALHS]; n=1,830). The combined set of proteomics measures reflects multi-ancestry individuals representing European Americans (EUR; n=7,470) and African Americans (AFA; n=7,200), with 1,300-7,000 protein aptamer measures per sample (depending on the SomaScan version). We leveraged whole genome sequence data for the TOPMed cohorts (CHS, FHS, JHS, MESA and COPDGene) and genome-wide imputation from TOPMed for ALHS to perform pQTL mapping. We found that accounting for unknown sources of variance by including PEER factors or PCs of hidden variance as covariates improves detection of pQTLs, with the PCs achieving similar results at far lower computational burden. Thus far, preliminary analysis of selected proteins with data from all studies, adjusting for age, sex, PCs of ancestry, and PCs of hidden variance, recapitulates known variant-expression associations such as the SERPINA1 S and Z alleles for alpha-1 antitrypsin (AAT) levels. In a subset of 25 proteins on chromosome 21, we detected cis-pQTLs for 52% of proteins and trans-pQTLs for 44% of proteins.
In analyses stratified by race/ancestry, we observed a greater number of protein-associated signals in AFA compared to EUR, likely reflecting differences in patterns of linkage disequilibrium and greater genetic variation in African ancestry populations. We are currently expanding our analysis genome-wide. Our pQTL mapping effort leveraging high-throughput proteomics demonstrates the value of integrating multi-ancestry samples to expand the set of protein-associated variants and identify putative molecular mechanisms underlying GWAS signals.
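
To illustrate the idea of adjusting for PCs of hidden variance, the following is a minimal sketch on simulated data, not the pipeline used in this work. The abstract does not specify how the PCs were computed, so the approach here (principal components of the protein matrix after regressing out known covariates) is an assumption, as are all function names, the sample size, and the simulated effect sizes.

```python
import numpy as np

def residualize(Y, X):
    """Residuals of each column of Y after OLS regression on X (intercept added)."""
    X1 = np.column_stack([np.ones(X.shape[0]), X])
    beta, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return Y - X1 @ beta

def hidden_variance_pcs(protein_matrix, known_covariates, n_pcs):
    """Top PCs of the protein matrix after removing known covariates (assumed approach)."""
    resid = residualize(protein_matrix, known_covariates)
    resid = (resid - resid.mean(axis=0)) / resid.std(axis=0)
    U, S, _ = np.linalg.svd(resid, full_matrices=False)
    return U[:, :n_pcs] * S[:n_pcs]

def pqtl_effect(protein, genotype, covariates):
    """OLS effect size and t-statistic of genotype dosage on one protein."""
    X = np.column_stack([np.ones(len(genotype)), genotype, covariates])
    beta, *_ = np.linalg.lstsq(X, protein, rcond=None)
    resid = protein - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], beta[1] / np.sqrt(cov[1, 1])

# Simulated data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 500, 50
age = rng.normal(60, 10, n)
sex = rng.integers(0, 2, n).astype(float)
known = np.column_stack([age, sex])
hidden = rng.normal(size=n)                  # unmeasured factor, e.g. a batch effect
loadings = rng.uniform(1.0, 2.0, size=p)     # every protein is affected by it
proteins = np.outer(hidden, loadings) + rng.normal(size=(n, p))
genotype = rng.binomial(2, 0.3, n).astype(float)
proteins[:, 0] += 0.5 * genotype             # true pQTL effect on protein 0

pcs = hidden_variance_pcs(proteins, known, n_pcs=3)
beta_without, t_without = pqtl_effect(proteins[:, 0], genotype, known)
beta_with, t_with = pqtl_effect(proteins[:, 0], genotype, np.column_stack([known, pcs]))
print(f"without hidden PCs: beta={beta_without:.3f}, t={t_without:.2f}")
print(f"with hidden PCs:    beta={beta_with:.3f}, t={t_with:.2f}")
```

In this toy setting the hidden factor inflates the residual variance of the protein, so including the PCs as covariates should sharpen the genotype test statistic. Computing PCs requires only a single SVD, which is consistent with the lower computational burden noted above relative to iterative factor methods such as PEER.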