Authors |
Y. Luo, M. Kanai, M. Gutierrez-Arcelus, J.G. Wilson, S. Kathiresan, J.I. Rotter, S.S. Rich, M.H. Cho, W.S. Choi, B. Han, Y. Okada, A. Metspalu, T. Esko, P.J. McLaren, S. Raychaudhuri, NHLBI TOPMed Consortium
|
Abstract Text |
The human leukocyte antigen (HLA) region harbors genes that are crucial to many human diseases. However, pinpointing the causal variants underlying HLA-disease associations remains a challenge because of the extreme complexity of the region. We constructed the largest multi-ethnic HLA haplotype panel to date to better understand immune-related adaptive evolution and to facilitate fine-mapping of signals from genome-wide association studies (GWAS).
First, we inferred HLA types at G-group resolution from whole-genome sequences (WGS) of 20,209 individuals of diverse ancestries (10,699 Europeans, 7,644 African Americans (AA), 1,016 Hispanics and 850 East Asians) using population reference graphs (Dilthey et al. 2016). We evaluated the inferred HLA types against sequencing-based typing (SBT) in 295 Japanese samples sequenced at 15x coverage. The accuracies for HLA-A, -B, -C, -DQA1, -DQB1 and -DRB1 were 95.4%, 97.9%, 98.5%, 99.3%, 98.2% and 97.2%, respectively. We observed high levels of differentiation in the allele frequencies of inferred HLA types across populations (P = 8.7e-267). In many cases these differences are likely related to adaptive selection; for example, the B*53:01:01G and C*04:01:01G alleles, which have previously been associated with malaria protection, are enriched in AA samples.
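The comparison of inferred HLA types against SBT can be illustrated with a minimal sketch. The abstract does not state how accuracy was defined, so this example assumes allele-level concordance: the fraction of SBT alleles recovered per gene, ignoring the order of the two alleles within an individual. The function name `allele_concordance`, the sample IDs and the genotypes are hypothetical and made up for illustration; they are not data from the study.

```python
from collections import Counter


def allele_concordance(inferred, truth):
    """Fraction of reference (SBT) alleles matched by the inferred calls.

    Both arguments map a sample ID to a pair of allele names for one gene.
    Allele order within a pair is ignored (assumed metric, not the
    authors' stated definition).
    """
    matched = 0
    total = 0
    for sample, true_pair in truth.items():
        if sample not in inferred:
            continue  # skip samples without an inferred call
        # Multiset intersection handles homozygotes and unordered pairs.
        overlap = Counter(inferred[sample]) & Counter(true_pair)
        matched += sum(overlap.values())
        total += len(true_pair)
    return matched / total if total else float("nan")


# Toy HLA-A genotypes at G-group resolution (invented for illustration).
truth = {
    "s1": ("A*02:01:01G", "A*24:02:01G"),
    "s2": ("A*11:01:01G", "A*11:01:01G"),
}
inferred = {
    "s1": ("A*24:02:01G", "A*02:01:01G"),  # same alleles, different order
    "s2": ("A*11:01:01G", "A*33:03:01G"),  # one of two alleles correct
}

accuracy = allele_concordance(inferred, truth)
print(f"HLA-A concordance: {accuracy:.2%}")
```

In this toy case three of four alleles match. Running the same function once per gene would give per-gene figures of the kind reported above, under the assumed definition.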
We next built a multi-ethnic HLA reference panel based on inferred HLA types and genetic variation in 5,376 multi-ethnic samples. To evaluate the imputation accuracy of this panel, we compared imputed HLA haplotypes against SBT in 1,067 AA subjects. The average accuracy across the six classical HLA genes was 96.3%, compared with 77.4% when using a European panel of 5,225 samples alone. To illustrate the fine-mapping advantages of increased ancestral diversity in the reference panel, we meta-analyzed published HIV-1 viral load GWAS in a total of 6,315 European and 2,924 AA subjects. The most significant association in both populations was with the amino acid at HLA-B position 97. However, conditional analysis identified different secondary associations: the amino acid at HLA-B position 67 in Europeans and B*08:01:01G in AA subjects.
These results highlight the benefits of a multi-ethnic reference panel for the discovery and characterization of HLA-disease associations. In the next phase, we will build an HLA panel from >20,000 WGS samples. This resource will open up exciting opportunities to understand immune-related genetic architecture across populations of diverse ancestries.
|