TMF: Improve polygenic risk scores from sequencing data and diverse populations
Principal Investigator: Zhe Wang, PhD
Project Title: Improve polygenic risk scores from sequencing data and diverse populations
Abstract: The key goal of this project is to enhance polygenic risk scores (PRSs) by utilizing rare variants, sequencing data, and diverse populations. It will provide a user-friendly workflow for building more comprehensive PRSs. Height and body mass index (BMI) are model complex traits with relatively high heritability and polygenicity. In the GWAS era, height and BMI have been broadly studied to address questions of heritability, genetic architecture, and polygenic prediction [1-4]. Because they are simple to measure, they are also the traits with the largest sample sizes available. Therefore, height and BMI are good templates for the study of other polygenic traits and diseases.
PRSs, which are calculated as the sum of an individual's risk alleles weighted by GWAS-estimated effect sizes, have become promising tools to predict complex traits and diseases [5]. However, standard PRSs (1) use only common variants (minor allele frequency, MAF ≥ 1%), (2) do not consider rare pathogenic variants, or consider only one or a few monogenic genes, and (3) are derived mostly from individuals of European ancestry. These factors result in a substantial loss of information about an individual's genetic profile, especially for non-European individuals. To address these gaps in PRS calculation, I propose to leverage the sequencing data from the TOPMed program, the BioMe Biobank, and the UK Biobank to compute and evaluate PRSs that (1) aggregate both common and rare (MAF < 1%) variants, (2) incorporate pathogenic variants likely to affect BMI/height, and (3) are computed from diverse, global samples to boost their accuracy in non-Europeans.
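The weighted sum that defines a standard PRS can be illustrated with a minimal Python sketch. It is not the project's workflow: the function name `polygenic_risk_score`, the sample labels, the allele counts, and the effect sizes are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def polygenic_risk_score(
    allele_counts: Sequence[float], effect_sizes: Sequence[float]
) -> float:
    """Sum of risk-allele counts, each weighted by its GWAS effect size.

    Hypothetical helper for illustration; allele_counts holds 0, 1, or 2
    copies of the risk allele per variant, in the same order as effect_sizes.
    """
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size per variant is required")
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))


# Toy inputs (made up): three variants and two individuals.
effect_sizes = [0.5, -0.25, 1.0]
cohort = {
    "sample_a": [0, 1, 2],
    "sample_b": [2, 0, 1],
}

# One score per individual.
scores = {
    sample: polygenic_risk_score(counts, effect_sizes)
    for sample, counts in cohort.items()
}
print(scores)
```

In this form, restricting a score to common variants amounts to leaving rare variants out of the two lists, so their contribution to the sum is lost.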