Diabetes

Leveraging T2D specific omics data in rare variant association analysis in TOPMed

Authors
Timothy Majarian, Paul S. de Vries, Deepti Jain, Brian Cade, and Alisa K. Manning on behalf of the TOPMed diabetes working group
Name and Date of Professional Meeting
American Society of Human Genetics
Associated paper proposal(s)
Working Group(s)
Abstract Text
In whole genome sequencing (WGS) studies of rare variants, multiple variants must be aggregated into units to have sufficient statistical power for association tests with type 2 diabetes (T2D). We incorporated T2D-relevant functional annotations in generating three separate gene-centric aggregation strategies as an alternative to agnostic, sliding window-based approaches. In TOPMed WGS data (44,732 individuals in 23 studies), T2D was defined by fasting glucose ≥ 7 mmol/L, HbA1c ≥ 6.5%, 2-hr OGTT ≥ 11.1 mmol/L, non-fasting glucose ≥ 11.1 mmol/L, treatment, physician diagnosis, or self-report, resulting in 9,651 T2D cases and 35,081 controls from 4 ancestries: African American, Asian, European, and Hispanic. We tested association of each aggregation unit using sequence kernel association tests (SKAT) as implemented in the GENESIS package. P-value based meta-analyses of summary statistics were also performed. We used Bonferroni correction to determine statistical significance within each of the 3 gene-centric grouping strategies. We defined aggregation units with islet-specific expressed genes from previously published RNA-seq data of 89 individuals with and without T2D (aggregation 1). Each unit included regulatory or high-impact, protein-truncating variants. Promoter and enhancer regions were derived from chromatin state predictions, filtered by predicted transcription factor binding sites, and linked to genes by distance and publicly available databases. Two gene-centric coding variant aggregation strategies were also used: high-impact, protein-truncating variants (aggregation 2) and high- to moderate-impact variants, i.e., protein-truncating and missense variants (aggregation 3). Aggregation 1 included 8,922 islet-expressed genes with cumulative minor allele count greater than 10, comprising 1 million variants. The TMEM35B aggregation unit was associated with T2D in the Asian ancestry analysis (P = 1.3E-6; aggregation 1). High-impact variant aggregation (aggregation 2) yielded 18,420 tests. DNASE1 and EXOC6B units were associated with T2D in the Asian ancestry analysis (P = 1.4E-6; aggregation 2). The ZNF454 aggregation unit was significant in meta-analysis (P = 1.6E-7; aggregation 2). High- and moderate-impact aggregation (aggregation 3) yielded 19,048 units. MC4R was associated with T2D in the Asian ancestry analysis (P = 2.6E-6; aggregation 3). In conclusion, with islet-specific rare variant aggregation, we identified sets of variants that were likely to be both functional and act on common gene targets in the setting of T2D.
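
The T2D case definition in this abstract is a disjunction of laboratory thresholds and clinical indicators, which can be sketched as a small classifier. This is only an illustration: the record type, its field names, and the treatment of missing values are assumptions, not the working group's harmonization code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlycemicRecord:
    """Hypothetical per-participant record; field names are illustrative."""
    fasting_glucose_mmol_l: Optional[float] = None
    hba1c_percent: Optional[float] = None
    ogtt_2hr_mmol_l: Optional[float] = None
    nonfasting_glucose_mmol_l: Optional[float] = None
    on_treatment: bool = False
    physician_diagnosis: bool = False
    self_report: bool = False


def _at_least(value: Optional[float], threshold: float) -> bool:
    # A missing measurement never satisfies a threshold (an assumption).
    return value is not None and value >= threshold


def is_t2d_case(rec: GlycemicRecord) -> bool:
    """Return True if any criterion listed in the abstract is met."""
    return (
        _at_least(rec.fasting_glucose_mmol_l, 7.0)
        or _at_least(rec.hba1c_percent, 6.5)
        or _at_least(rec.ogtt_2hr_mmol_l, 11.1)
        or _at_least(rec.nonfasting_glucose_mmol_l, 11.1)
        or rec.on_treatment
        or rec.physician_diagnosis
        or rec.self_report
    )
```

In this sketch a record with no measurements and no clinical indicator is simply not a case; a real pipeline may instead exclude such participants rather than treat them as controls.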

Chromosome X association analysis of Hemoglobin A1c (HbA1c) in African Americans using TOPMed whole genome sequence (WGS) data

Authors
Chloé Sarnowski, Aaron Leong, Daniel DiCorpo, Laura Raffield, Xiuqing Guo, Paul S. de Vries for the TOPMed Diabetes Working Group
Name and Date of Professional Meeting
CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium) meeting, Baltimore, 11-12 Oct 2018
Associated paper proposal(s)
Working Group(s)
Abstract Text
Background: Using WGS association analysis of HbA1c (a test used to diagnose type 2 diabetes (T2D) and estimate glycemia) in 3,224 African-Americans from the Trans-Omics for Precision Medicine (TOPMed) program, we identified a chromosome X association in the G6PD locus (most associated SNP, rs1050828 (p.Val98Met), minor allele frequency (MAF)=0.12, β=-0.41, P=4.4×10^-183). We sought to identify additional distinct or sex-specific associations in this region through conditional and sex-stratified analyses.

Methods: Restricting the analyses to the ±500kb window flanking G6PD, we performed conditional analysis on rs1050828 using linear mixed-effect models adjusted for age at HbA1c measurement, study, and sex, with an empirical kinship matrix to account for relatedness. Associations with P<1.7×10^-5 (0.05/2,886 variants with a minor allele count > 20) were considered distinct from rs1050828. We then performed analyses in males (N=1,312) separately from females (N=1,912) and meta-analyzed the sex-stratified results. Analyses were conducted on the Analysis Commons.

Results: In addition to rs1050828, we identified a distinct (r^2=0.0006, D’=1) rare signal (rs76723693 (p.Leu353Pro), MAF=0.005, β=-0.44, P=5.4×10^-9) which was more frequent in females (MAF_Females=0.006 vs. MAF_Males=0.003) and had a larger effect in males (β_Males=-0.49, P=2.4×10^-5, β_Females=-0.38, P=1.2×10^-4). Both rs1050828 and rs76723693 are missense and putatively pathogenic (ClinVar). Heterogeneity between sex-stratified results was detected (P_het≤0.10) for 230 variants among 322 associated with HbA1c at P≤5×10^-8 (rs1050828, β_Males=-0.43, P=2.6×10^-148, β_Females=-0.36, P=1.0×10^-69, P_het=2.9×10^-18).

Conclusion: Two previously reported G6PD coding variants (rs1050828 and rs76723693) are independently associated with lower HbA1c values and their associations differ by sex. More people than just rs1050828 carriers may be underdiagnosed for T2D.
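
The cutoff for a distinct signal quoted in this abstract (0.05/2,886 variants) is a Bonferroni correction. A minimal sketch of that arithmetic follows; the function names are illustrative and not taken from any analysis pipeline.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Values taken from the abstract: 2,886 tested variants, family-wise alpha 0.05.
threshold = bonferroni_threshold(0.05, 2886)
print(f"{threshold:.2e}")  # prints 1.73e-05
```

With these inputs the conditional P-value reported for rs76723693 (5.4×10^-9) falls well below the threshold.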

Fine-mapping of type 2 diabetes and glycemic traits with whole genome sequence data using 49,022 individuals from the NHLBI’s TOPMed WGS Program

Authors
A.K. Manning, D. DiCorpo, J. Wessel, TOPMed Diabetes Working Group
Name and Date of Professional Meeting
ASHG Conference (October 2018)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Whole genome sequence (WGS) association studies afford the opportunity to perform trans-ancestry fine-mapping without depending on imputation. We have leveraged large, phenotype-rich and ancestry-diverse cohorts from NHLBI’s Trans-Omics for Precision Medicine (TOPMed) WGS Program to refine credible sets and discover novel distinct associations with type 2 diabetes (T2D), and fasting glucose (FG) and fasting insulin (FI) levels. We initially focused on loci with known associations with T2D and glycemic traits or genes involved in monogenic diabetes and insulin resistance syndromes, and then expanded to describe novel associations for which we are seeking additional support. We performed ancestry-specific genetic association analysis using GENESIS mixed models with common (minor allele frequency [MAF]>1%), low-frequency (0.01%<MAF<1%) and rare (MAF<0.01%) variants, correcting for relatedness and population structure with a genetic relationship matrix derived from pruned common variants. We used PAINTOR for fine-mapping, and further refined our credible sets by leveraging ancestry-specific linkage disequilibrium (LD) and regulatory and chromatin accessibility annotations from tissues shown to be enriched in common variant association signals: pancreatic islets for T2D; liver, adipose and muscle for FG/FI. Our analysis included data from 16 TOPMed projects and 5 ancestries. For the T2D analysis (N with T2D/without T2D): European, 4,781/21,365; African-American, 3,783/9,470; Hispanic, 612/1,628; Asian, 427/1,973; Samoan, 185/922. For FG/FI (N individuals without T2D): European, 13,749; African-American, 7,256; Hispanic, 2,005; Asian, 2,235; Samoan, 922. For T2D, in ancestry-combined meta-analyses, 90 variants met genome-wide significance (P<5e-8), all common: 4 variants in SLC30A8, 76 variants at TCF7L2 of which 8 are multi-allelic variants, 2 variants at KCNQ1 and 8 variants at FTO of which 1 is a short insertion/deletion. A potentially novel association at MYO1F shows a rare variant signal specific to African-ancestry individuals (MAF=0.002; P=2e-10). For FG, significant associations were seen at 6 loci with previously reported signals: GCKR, G6PC2, GCK, SLC30A8, MTNR1B, and FOXA2, all common variants. Nominally significant (P<5e-6) low-frequency variant associations were seen at 6 loci: VPS13C, PRDM16, SLC2A1-AS1, INS, ACSL1, and CDKAL1. We will soon extend our analysis to the next release of WGS data from TOPMed (N~100,000 individuals).
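
The three frequency classes named in this abstract (common, low-frequency, rare) can be expressed as a simple binning function. This is a sketch only: the abstract does not say which class a variant at exactly 1% or exactly 0.01% belongs to, so the boundary handling below is an assumption, as is the function name.

```python
def frequency_class(maf: float) -> str:
    """Bin a minor allele frequency (as a proportion, 0 to 0.5).

    Cut points follow the abstract: common > 1%, low-frequency between
    0.01% and 1%, rare < 0.01%. Assigning the exact boundary values to
    the lower class is an assumption made for this sketch.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must be between 0 and 0.5")
    if maf > 0.01:
        return "common"
    if maf > 0.0001:
        return "low-frequency"
    return "rare"


# Illustrative values only, not variants from the analysis.
for example_maf in (0.05, 0.005, 0.00005):
    print(example_maf, frequency_class(example_maf))
```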

An Omics Analysis, Search and Information System (OASIS) for Enabling Discovery in the TOPMed Diabetes Working Group

Authors
J.A. Perry, J.R. O’Connell on behalf of the TOPMed Diabetes Working Group
Name and Date of Professional Meeting
ASHG Conference (October 2018)
Associated paper proposal(s)
Working Group(s)
Abstract Text
Introduction: Members of the TOPMed Diabetes Working Group have computed over 200 million single-variant associations between TOPMed whole genomes and harmonized phenotypes for T2D, Fasting Glucose, Fasting Insulin and HbA1c. This massive quantity of results is typical of TOPMed research and transforming this “raw information” into “biological discovery” can be challenging. Scientist-friendly tools are needed to provide an integrated approach that includes data visualization, broad annotation, and fine-mapping techniques.

Methods: An Omics Analysis, Search and Information System (OASIS) was constructed for the TOPMed Diabetes Working Group by making major enhancements to an existing web-based application developed at the University of Maryland. The enhancements, which support the large TOPMed datasets, required redesign of the underlying database architecture and use of highly efficient data compression and data handling made available with the MMAP software (https://mmap.github.io/). The OASIS webserver is an approved “TOPMed Cloud Computing Platform” and thus allows Working Group analysts to directly upload association results for sharing and comparing alternative analyses. Multiple datasets can be created and access control features allow datasets to be selectively shared with other registered TOPMed OASIS users.

Results: OASIS datasets of up to 70 million associations from TOPMed freeze4 (219 million variants) and freeze5b (470 million) have been created for diabetes phenotypes. Boxplots split by genotype, study and ancestry are used to track populations for rare variants. Significant signals driven by a single study cohort have been identified. Variants are annotated by OASIS as they are reported during a user-initiated query and those lying in regulatory regions are easily spotted. Known-loci lists can be applied to query reports to identify variants in windows around known loci. Conditional and multi-covariate analysis is available for user-selected variants and integrated LocusZoom and Haploview plots provide linkage visualizations. Queries can be filtered by p-value, effect size, gene, variant type and/or function, rsid lists, genomic positions, and allele frequency.
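
The idea of applying a known-loci list to a query report, combined with a p-value filter, can be sketched generically as follows. This is not OASIS code or its API: the data structure, function names, default window size, and coordinates are all hypothetical and serve only to illustrate the kind of filtering described.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical association result row."""
    chrom: str
    pos: int
    p_value: float


def near_known_locus(variant: Variant,
                     known_loci: List[Tuple[str, int]],
                     window_bp: int) -> bool:
    """True if the variant lies within window_bp of any known locus."""
    return any(
        chrom == variant.chrom and abs(pos - variant.pos) <= window_bp
        for chrom, pos in known_loci
    )


def filter_results(variants: List[Variant],
                   known_loci: List[Tuple[str, int]],
                   window_bp: int = 500_000,
                   max_p: float = 5e-8) -> List[Variant]:
    """Keep variants that pass the p-value cutoff and sit near a known locus."""
    return [
        v for v in variants
        if v.p_value <= max_p and near_known_locus(v, known_loci, window_bp)
    ]


# Made-up coordinates for illustration.
results = [
    Variant("chr1", 1_200_000, 1e-9),   # near the locus, significant
    Variant("chr1", 9_000_000, 1e-9),   # significant but far away
    Variant("chr1", 1_100_000, 0.2),    # near the locus, not significant
]
hits = filter_results(results, known_loci=[("chr1", 1_000_000)])
```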

Conclusion: Transforming massive volumes of TOPMed association results into “biological discovery” has been made dramatically easier. As a web-based tool, OASIS allows both analyst and non-analyst easy access to a wealth of annotation, visualization and fine-mapping techniques.

Chromosome X association analysis of Hemoglobin A1c (HbA1c) in African Americans using TOPMed whole genome sequence (WGS) data

Authors
Chloé Sarnowski, Aaron Leong, Daniel DiCorpo, Laura Raffield, Xiuqing Guo, Paul S. de Vries for the TOPMed Diabetes Working Group
Name and Date of Professional Meeting
IGES meeting, October 14-16, 2018, San Diego
Associated paper proposal(s)
Working Group(s)
Abstract Text
Background: Using WGS association analysis of HbA1c (a test used to diagnose type 2 diabetes (T2D) and estimate glycemia) in 3,224 African-Americans from the Trans-Omics for Precision Medicine (TOPMed) program, we identified a chromosome X association in the G6PD locus (most associated SNP, rs1050828 (p.Val98Met), minor allele frequency (MAF)=0.12, β=-0.41, P=4.4×10^-183). We sought to identify additional distinct or sex-specific associations in this region through conditional and sex-stratified analyses.

Methods: Restricting the analyses to the ±500kb window flanking G6PD, we performed conditional analysis on rs1050828 using linear mixed-effect models adjusted for age at HbA1c measurement, study, and sex, with an empirical kinship matrix to account for relatedness. Associations with P<1.7×10^-5 (0.05/2,886 variants with a minor allele count > 20) were considered distinct from rs1050828. We then performed analyses in males (N=1,312) separately from females (N=1,912) and meta-analyzed the sex-stratified results. Analyses were conducted on the Analysis Commons.
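
The abstract does not state how the sex-stratified results were combined or how heterogeneity was tested. One common approach, assumed here purely for illustration, is inverse-variance fixed-effect meta-analysis with a two-group z-test on the difference in effect sizes (equivalent to Cochran's Q with one degree of freedom). The function names are illustrative, and the standard errors in the usage lines are hypothetical because the abstract reports none.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


def two_group_heterogeneity_p(beta_a: float, se_a: float,
                              beta_b: float, se_b: float) -> float:
    """Two-sided P-value for a difference between two independent effects."""
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Effect sizes for rs1050828 are from the abstract; the standard errors
# (0.02 and 0.02) are hypothetical placeholders.
pooled_beta, pooled_se = fixed_effect_meta([-0.43, -0.36], [0.02, 0.02])
p_het = two_group_heterogeneity_p(-0.43, 0.02, -0.36, 0.02)
```

Because the standard errors are placeholders, the resulting values are not expected to match the heterogeneity P-values reported in the abstract.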

Results: In addition to rs1050828, we identified a distinct (r^2=0.0006, D’=1) rare signal (rs76723693 (p.Leu353Pro), MAF=0.005, β=-0.44, P=5.4×10^-9) which was more frequent in females (MAF_Females=0.006 vs. MAF_Males=0.003) and had a larger effect in males (β_Males=-0.49, P=2.4×10^-5, β_Females=-0.38, P=1.2×10^-4). Both rs1050828 and rs76723693 are missense and putatively pathogenic (ClinVar). Heterogeneity between sex-stratified results was detected (P_het≤0.10) for 230 variants among 322 associated with HbA1c at P≤5×10^-8 (rs1050828, β_Males=-0.43, P=2.6×10^-148, β_Females=-0.36, P=1.0×10^-69, P_het=2.9×10^-18).

Conclusion: Two previously reported G6PD coding variants (rs1050828 and rs76723693) are independently associated with lower HbA1c values and their associations differ by sex. More people than just rs1050828 carriers may be underdiagnosed for T2D.