Research Article | GENETICS

DeepH&M: Estimating single-CpG hydroxymethylation and methylation levels from enrichment and restriction enzyme sequencing methods

Science Advances  01 Jul 2020:
Vol. 6, no. 27, eaba0521
DOI: 10.1126/sciadv.aba0521
  • Fig. 1 DeepH&M model.

    (A) Schematics of the three main assays used for the DeepH&M model. (B) Structure of the DeepH&M model. DeepH&M is composed of three modules. The CpG module takes genomic features and methylation features as inputs. The DNA module processes raw DNA sequence data using a convolutional neural network. The joint module combines the outputs of the CpG module and the DNA module to predict 5hmC and 5mC simultaneously. Examples are given to show how 5hmC and 5mC are predicted from the three main assays. Conv, convolutional layer; Pool, pooling layer; Full con, fully connected layer.

  • Fig. 2 Performance of the DeepH&M model in 7-week-old mouse cerebellum.

    (A) Density plots of predictions and gold standard data for 5hmC, 5mC, and total methylation. The Pearson correlation coefficient is used as the correlation metric. (B) Global distribution comparison of predictions and gold standard data for 5hmC, 5mC, and total methylation. (C) Concordance between predictions and gold standard data for 5hmC, 5mC, and total methylation at CpGs with differing 5hmC/5mC/total methylation levels. A difference of 0.1 is used to calculate concordance for 5hmC, and a difference of 0.25 is used for 5mC and total methylation. Concordance is calculated for five ascending 5hmC windows and five ascending 5mC/total methylation windows to show how it varies across 5hmC/5mC/total methylation levels. (D) Genome browser view of predictions and gold standard data for 7-week-old cerebellum at a representative locus.

  • Fig. 3 Factors affecting concordance between gold standard data and predictions.

    (A) Concordance for 5hmC/5mC/total methylation at different genomic features. (B) Comparison of gold standard 5hmC/5mC and predicted 5hmC/5mC at lowly methylated CGIs and highly methylated CGIs. CGIs are divided into lowly methylated CGIs (<0.2) and highly methylated CGIs (>0.7) based on their average total methylation levels. (C) Concordance for 5hmC/5mC/total methylation as a function of CpG coverage. For 5hmC concordance, CpG coverage is from TAB-seq data. For 5mC/total methylation concordance, CpG coverage is from WGBS data. (D) Concordance for 5hmC/5mC/total methylation as a function of CpG density.

  • Fig. 4 Performance of the DeepH&M model in 79-week-old mouse cerebellum.

    (A) Density plots of predictions and gold standard data for 5hmC, 5mC, and total methylation. (B) Global distribution comparison of predictions and gold standard data for 5hmC, 5mC, and total methylation. (C) Concordance between predictions and gold standard data for 5hmC, 5mC, and total methylation at CpGs with differing 5hmC/5mC/total methylation levels. (D) Genome browser view of a DHMR between 7- and 79-week-old cerebella. The selected box is the DHMR. The 5hmC changes at this region are supported by changes of gold standard 5hmC, predicted 5hmC, and also hmC-Seal signal between the two ages.

  • Fig. 5 DeepH&M can predict DHMRs and DMRs between 7- and 79-week-old mouse cerebella.

    (A) Distribution of mean 5hmC for gold standard data and predictions at hyperDHMRs and hypoDHMRs defined by hmC-Seal data between 7- and 79-week-old cerebella. Gold is for gold standard data. Pred is for prediction. N is the number. 7w, 7 weeks; 79w, 79 weeks. (B) Distribution of mean 5hmC + 5mC for gold standard data and predictions at hyperDMRs and hypoDMRs defined by WGBS data between 7- and 79-week-old cerebella. (C) Distribution of mean 5hmC for gold standard data and predictions at hyperDMRs and hypoDMRs defined by TAB-seq data between 7- and 79-week-old cerebella.

  • Fig. 6 Performance of the DeepH&M model in 7-week-old mouse cortex.

    (A) Density plots of predictions and gold standard data for 5hmC, 5mC, and total methylation. (B) Global distribution comparison of predictions and gold standard data for 5hmC, 5mC, and total methylation. (C) Concordance between predictions and gold standard data for 5hmC, 5mC, and total methylation at CpGs with differing 5hmC/5mC/total methylation levels.
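
The captions for Figs. 2, 4, and 6 rely on two summary metrics: the Pearson correlation coefficient (panel A) and a difference-based concordance (panel C). The following is a minimal Python sketch of how such metrics could be computed. It is not the authors' code. It assumes that concordance means the fraction of CpGs whose predicted level lies within the stated difference (0.1 for 5hmC, 0.25 for 5mC and total methylation) of the gold standard level, and that the five ascending windows are equal-width bins of the gold standard level. The function names, the window edges, and the toy values are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def concordance(pred, gold, max_diff):
    """Fraction of CpGs with |prediction - gold standard| <= max_diff.

    Assumed definition; the captions only state the difference used.
    """
    hits = sum(1 for p, g in zip(pred, gold) if abs(p - g) <= max_diff)
    return hits / len(gold)


def concordance_by_window(pred, gold, max_diff, edges):
    """Concordance within ascending windows of the gold standard level.

    `edges` lists the window boundaries (hypothetical choice). Each window
    is [lo, hi), except the last, which also includes its upper edge.
    Windows that contain no CpGs yield None.
    """
    results = []
    last = len(edges) - 2
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        pairs = [
            (p, g)
            for p, g in zip(pred, gold)
            if lo <= g < hi or (i == last and g == hi)
        ]
        if not pairs:
            results.append(None)
            continue
        ps, gs = zip(*pairs)
        results.append(concordance(ps, gs, max_diff))
    return results


# Toy methylation levels in [0, 1]; illustrative values, not data from the paper.
gold = [0.0, 0.05, 0.2, 0.4, 0.9]
pred = [0.02, 0.35, 0.25, 0.38, 0.7]
five_windows = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

hmc_concordance = concordance(pred, gold, 0.1)    # 5hmC-style difference
mc_concordance = concordance(pred, gold, 0.25)    # 5mC/total-style difference
windowed = concordance_by_window(pred, gold, 0.1, five_windows)
```

Returning None for empty windows keeps "no CpGs in this range" distinct from a concordance of zero, which matters when the gold standard distribution is skewed toward low levels.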

Supplementary Materials

    DeepH&M: Estimating single-CpG hydroxymethylation and methylation levels from enrichment and restriction enzyme sequencing methods

    Yu He, Hyo Sik Jang, Xiaoyun Xing, Daofeng Li, Michael J. Vasek, Joseph D. Dougherty, Ting Wang

    This PDF file includes:

    • Figs. S1 to S5
    • Tables S1 and S2
