Research Article | HUMAN GENETICS

Inference and analysis of population-specific fine-scale recombination maps across 26 diverse human populations

Science Advances  23 Oct 2019:
Vol. 5, no. 10, eaaw9206
DOI: 10.1126/sciadv.aaw9206
  • Fig. 1 Accuracy of inference on simulated and real data.

    (A) Spearman correlation between inferred and true maps for 100 simulations, each 1 Mb long, for both pyrho and LDhat, with our method showing improved performance especially at finer scales. (B) Our inferred recombination maps provide a better fit to observed patterns of linkage disequilibrium as measured by r². For a pair of SNPs, r² is a random quantity whose distribution depends on the rate of recombination between the SNPs. Solid lines show theoretical deciles of this distribution for pairs of sites separated by different recombination distances with MAF > 0.1 at both sites, as calculated under the population size for YRI in Fig. 2A. Shaded points are the deciles of the empirical distribution obtained by considering pairs of sites with MAF > 0.1, binned by the recombination rate separating them according to the different recombination maps. 1KG YRI is the population-specific recombination map for YRI in (15); DECODE is the sex-averaged recombination map in (28); and HapMap is the recombination map in (34).
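
The comparison in panel (A) can be illustrated with a minimal sketch: average two per-base rate maps in non-overlapping windows, then compute the Spearman correlation at each window size. This is not the pyrho or LDhat pipeline. The function names (`ranks`, `pearson`, `window_means`, `spearman_at_scale`), the window sizes, and the toy lognormal maps are all assumptions made for illustration, and the toy data have no hotspot structure.

```python
import random

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = shared
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def window_means(rates, window):
    """Mean rate in each non-overlapping window; a ragged tail is dropped."""
    n = (len(rates) // window) * window
    return [sum(rates[s:s + window]) / window for s in range(0, n, window)]

def spearman_at_scale(true_map, inferred_map, window):
    """Spearman correlation = Pearson correlation of the ranks."""
    t = window_means(true_map, window)
    i = window_means(inferred_map, window)
    return pearson(ranks(t), ranks(i))

# Toy data (assumption): a "true" map and a noisy copy standing in for an inferred map.
rng = random.Random(0)
true_map = [rng.lognormvariate(-18.0, 1.0) for _ in range(20_000)]
inferred_map = [t * rng.lognormvariate(0.0, 1.0) for t in true_map]
for window in (1, 100, 1_000):
    print(window, spearman_at_scale(true_map, inferred_map, window))
```

Using ranks rather than raw rates makes the comparison insensitive to any monotone rescaling of a map, which is one reason a rank correlation is a natural choice when maps differ in overall scale.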

  • Fig. 2 Interplay of demographic history and fine-scale recombination rates.

    (A) Population sizes as inferred by smc++. All non-African populations show an out-of-Africa bottleneck, which is deepest in East Asian populations. (B) Heatmap of the Spearman correlation between the inferred recombination maps at single–base pair resolution. All maps show a high degree of correlation, yet the relative correlations agree with continental levels of population differentiation. (C) Recombination rates at different PRDM9 binding motifs in each population, normalized by the log average recombination rate in a shuffled version of that motif. PRDM9-A binding motifs show consistent recombination rates across all populations, while PRDM9-C binding motifs show particularly elevated rates in African populations. Three-letter population codes are defined in Table 1.

  • Fig. 3 Gene conversion acts like weak selection to remove PRDM9 binding sites.

    Strength of selection acting against PRDM9-A binding alleles for different populations in different bins of recombination rate. Bins are defined on the per-generation recombination rate r: bin 1 is r ∈ [0, 1.45 × 10⁻⁹); bin 2 is r ∈ [1.45 × 10⁻⁹, 2.78 × 10⁻⁹); bin 3 is r ∈ [2.78 × 10⁻⁹, 5.25 × 10⁻⁹); bin 4 is r ∈ [5.25 × 10⁻⁹, 1.19 × 10⁻⁸); and bin 5 is r ∈ [1.19 × 10⁻⁸, ∞). Bins were chosen such that approximately the same number of polymorphic PRDM9 binding sites falls within each bin. Selection is stronger in bins with higher recombination rates.
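
As a small illustration of the binning in this caption, the sketch below assigns a per-generation rate to one of the five half-open bins. The bin edges are taken from the caption; the names `BIN_EDGES` and `rate_bin` are hypothetical, and this is not the authors' code.

```python
from bisect import bisect_right

# Upper edges of bins 1 to 4 from the Fig. 3 caption (per-generation rates).
# Bin 5 is open-ended, so it has no upper edge.
BIN_EDGES = [1.45e-9, 2.78e-9, 5.25e-9, 1.19e-8]

def rate_bin(r):
    """Return the 1-based bin index for rate r, using half-open [low, high) bins."""
    if r < 0:
        raise ValueError("recombination rate must be non-negative")
    # bisect_right counts the edges <= r, so a rate equal to an edge
    # falls into the bin that starts at that edge.
    return bisect_right(BIN_EDGES, r) + 1

for r in (0.0, 1.45e-9, 3.0e-9, 1.0e-8, 5.0e-8):
    print(r, rate_bin(r))
```

`bisect_right` is used rather than `bisect_left` so that each interval is closed on the left and open on the right, matching the caption.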

  • Fig. 4 PRDM9 and chromatin structure shape fine-scale recombination rates.

    Different chromatin states have substantially different average recombination rates as determined by fitting a model using only chromatin state (Chromatin only), a model with independent chromatin state and PRDM9 binding effects (PRDM9 + Chromatin: Ind. Effects), and a model where PRDM9 binding may have a different effect in different chromatin states (PRDM9 + Chromatin: Dep. Effects). Sites characterized by H3K27me3 marks (bivalent states and regions repressed by Polycomb) have the highest recombination rates, while repetitive regions, transcribed regions, and heterochromatic or quiescent regions all have depressed recombination rates. ZNF, zinc finger genes; TSS, transcription start site.

  • Table 1 Populations in the 1KG dataset (15).

    The super-populations are African (AFR), admixed American (AMR), East Asian (EAS), European (EUR), and South Asian (SAS).

    Code  Population                                                         Super-population code
    ACB   African Caribbeans in Barbados                                     AFR
    ASW   Americans of African Ancestry in SW USA                            AFR
    BEB   Bengali from Bangladesh                                            SAS
    CDX   Chinese Dai in Xishuangbanna, China                                EAS
    CEU   Utah residents (CEPH) with Northern and Western European ancestry  EUR
    CHB   Han Chinese in Beijing, China                                      EAS
    CHS   Southern Han Chinese                                               EAS
    CLM   Colombians from Medellin, Colombia                                 AMR
    ESN   Esan in Nigeria                                                    AFR
    FIN   Finnish in Finland                                                 EUR
    GBR   British in England and Scotland                                    EUR
    GIH   Gujarati Indian from Houston, Texas                                SAS
    GWD   Gambian in Western Divisions in the Gambia                         AFR
    IBS   Iberian population in Spain                                        EUR
    ITU   Indian Telugu from the United Kingdom                              SAS
    JPT   Japanese in Tokyo, Japan                                           EAS
    KHV   Kinh in Ho Chi Minh City, Vietnam                                  EAS
    LWK   Luhya in Webuye, Kenya                                             AFR
    MSL   Mende in Sierra Leone                                              AFR
    MXL   Mexican ancestry from Los Angeles, USA                             AMR
    PEL   Peruvians from Lima, Peru                                          AMR
    PJL   Punjabi from Lahore, Pakistan                                      SAS
    PUR   Puerto Ricans from Puerto Rico                                     AMR
    STU   Sri Lankan Tamil from the United Kingdom                           SAS
    TSI   Toscani in Italia                                                  EUR
    YRI   Yoruba in Ibadan, Nigeria                                          AFR

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/10/eaaw9206/DC1

    Fig. S1. Additional measures of accuracy on simulated data.

    Fig. S2. Goodness-of-fit of inferred recombination maps.

    Fig. S3. Recombination rates are more similar across populations than a measure of diversity.

    Fig. S4. Modulators of fine-scale recombination rates.

    Fig. S5. Interplay of background selection and inferred recombination rates.

    Fig. S6. Comparison of runtime for pyrho and LDhat.

    Fig. S7. Correlation between maps inferred by pyrho and maps inferred by previous methods.

    Fig. S8. Accuracy on simulations of pyrho using phased and unphased data.

    Table S1. Correlation on simulated data at different spatial resolutions.

    Table S2. Effect of genome build.
