Research Article | MOLECULAR BIOLOGY

Genomic profiling of native R loops with a DNA-RNA hybrid recognition sensor

Science Advances  17 Feb 2021:
Vol. 7, no. 8, eabe3516
DOI: 10.1126/sciadv.abe3516
  • Fig. 1 Generation of sensors for specific recognition of DNA-RNA hybrids.

    (A) Schematic depiction of the domain structure of RNase H1 protein. The HBD of RNase H1 is responsible for the specific recognition of DNA-RNA hybrids (22). GST-His6-HBD and GST-His6-2×HBD expression constructs are shown below. (B) Analysis of the purified GST-His6-HBD and GST-His6-2×HBD proteins by SDS–polyacrylamide gel electrophoresis (PAGE) and Coomassie blue staining. (C to G) EMSAs showing that GST-His6-2×HBD prefers the DNA-RNA hybrid (C) over ssDNA (D), dsDNA (E), ssRNA (F), and dsRNA (G). Fluorescent probes (30 nM) were incubated with increasing concentrations of GST-His6-2×HBD (2×HBD) as the indicator for binding. The complexes were resolved on a 6% native polyacrylamide gel and imaged with a Typhoon FLA-9500. GST-His6-2×HBD:DNA-RNA hybrid complexes are indicated by a bracket. (H) Biolayer interferometry assay of DNA-RNA hybrid and GST-His6-2×HBD. Biotinylated DNA-RNA hybrid was immobilized on streptavidin biosensors and incubated with a range of GST-His6-2×HBD concentrations (from 6.25 to 200 nM) to measure the response in an Octet Red96 instrument. (I) EMSA analysis of GST-His6-2×HBD with probes of different GC contents.

  • Fig. 2 Well-correlated profiles of GST-His6-2×HBD and S9.6 in DRIPc-seq–based R loop analysis.

    (A) Schematic presentation of DRIPc-seq with GST-His6-2×HBD protein. (B and C) UCSC genome browser tracks of 2×HBD-DRIPc-seq, DRIPc-seq (GSE102474) (34), PRO-seq, and TT-seq read density at the NCK2, UXS1 (B), MRPS9, and POU3F3 (C) loci. Read density was normalized by reads per million (r.p.m.). (D) Heatmap and metagene plots of 2×HBD-DRIPc-seq, the published DRIPc-seq, PRO-seq, and TT-seq signals in the plus and minus strands. (E) Scatter plot of the 2×HBD-DRIPc-seq counts and S9.6-DRIPc-seq counts across all protein-coding genes. The Pearson correlation coefficient is shown. (F) The genome-wide Pearson correlation heatmap of 2×HBD-DRIPc-seq, S9.6-DRIPc-seq, and TT-seq read densities within all protein-coding genes.
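The reads-per-million normalization and Pearson correlation mentioned in this caption can be illustrated with a minimal sketch. The function names and the toy counts are hypothetical and are not taken from the article's pipeline; the sketch only shows the arithmetic, assuming per-gene read counts are already available.

```python
import math


def reads_per_million(counts, total_mapped_reads):
    """Scale raw read counts to reads per million mapped reads (r.p.m.)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return [c * scale for c in counts]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy per-gene counts for two hypothetical libraries (illustrative only).
hbd_counts = [120, 40, 300, 15]
s96_counts = [100, 55, 280, 10]
hbd_rpm = reads_per_million(hbd_counts, total_mapped_reads=8_000_000)
s96_rpm = reads_per_million(s96_counts, total_mapped_reads=6_500_000)
r = pearson_r(hbd_rpm, s96_rpm)
```

Because r.p.m. scaling multiplies each library by a constant, it changes the track heights but not the Pearson coefficient between two libraries.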

  • Fig. 3 Establishment of the R loop CUT&Tag for native R loop mapping.

    (A) Overview of the R loop CUT&Tag workflow. Cells were immobilized on concanavalin A (ConA)–coated magnetic beads, followed by cell permeabilization. GST-His6-2×HBD or S9.6 is used to recognize the R loops in the presence or absence of RNase A. Anti-GST, anti-HisTag, or secondary antibodies were applied to enhance the tethering of pA-Tn5 transposome at the GST-His6-2×HBD or S9.6-bound sites. After extensive washing, the pA-Tn5 transposome is activated to integrate the adapters on the chromatin. (B) CUT&Tag library preparation with Bst 2.0 WarmStart and Q5 polymerase. Strand displacement was performed with Bst 2.0, followed by library amplification with Q5 DNA polymerase. (C) Three different approaches for R loop CUT&Tag analysis. (D) LabChip analysis of R loop CUT&Tag library demonstrating that the library size ranges from 220 to 700 bp with an average size of 405 bp. UM, upper marker; LM, lower marker. (E) Alignment rates of R loop CUT&Tag reads to the human hg38 and E. coli spiked-in genomes. RNase A treatment markedly decreases the alignment rates of CUT&Tag reads to the human genome, suggesting the specificity of GST-His6-2×HBD and S9.6 for R loop recognition. (F) UCSC genome browser tracks of CUT&Tag signals at the NPM1 and YY1AP1 loci. The tracks were normalized by reads per million, and the RNase A–treated groups were further normalized with the E. coli spike-in control.
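The caption does not give the formula used for the E. coli spike-in normalization. As a rough sketch, one common convention divides an arbitrary constant by the number of reads aligned to the spike-in genome and multiplies the coverage by that factor. The constant, the function names, and the numbers below are all assumptions made for illustration.

```python
def spike_in_scale_factor(spike_in_reads, constant=10_000):
    """Scale factor inversely proportional to the spike-in read count.

    `constant` is an arbitrary value (an assumption here); it only needs
    to be the same for every sample being compared.
    """
    if spike_in_reads <= 0:
        raise ValueError("spike_in_reads must be positive")
    return constant / spike_in_reads


def normalize_track(coverage, spike_in_reads, constant=10_000):
    """Multiply every coverage value by the sample's spike-in scale factor."""
    factor = spike_in_scale_factor(spike_in_reads, constant)
    return [value * factor for value in coverage]


# Hypothetical samples: the treated library has more spike-in reads,
# so its coverage is scaled down relative to the untreated library.
untreated = normalize_track([4.0, 8.0, 2.0], spike_in_reads=5_000)
treated = normalize_track([4.0, 8.0, 2.0], spike_in_reads=20_000)
```

Under this convention, a sample in which a larger share of reads comes from the spike-in genome receives a smaller factor, which is the direction expected when nuclease treatment removes the target signal.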

  • Fig. 4 Characterization of native R loops by CUT&Tag with three different approaches.

    (A) Analysis of R loop CUT&Tag signals at all of the peaks from GST-His6-2×HBD and S9.6 CUT&Tag. RNase A digestion markedly reduced the CUT&Tag signals at those peaks, suggesting great specificity of GST-His6-2×HBD and S9.6 for R loops. (B) Heatmap profiles of CUT&Tag signals with or without RNase A treatment. (C) Annotation of CUT&Tag peaks showing the localization of the majority of R loops at the promoter regions. The genomic features are shown on the right. UTR, untranslated region. (D) Violin plot of CUT&Tag peak width with three different approaches. The Wilcoxon test was used to test the statistical differences. CUT&Tag analysis with anti-HisTag antibody and GST-His6-2×HBD provides a superior resolution of R loop mapping. (E and F) Scatter plots of the log2 fold changes of R loop signals detected by anti-HisTag with RNase A (+/−) versus the log2 fold changes of anti-GST and RNase A (+/−) (E) or log2 fold changes of S9.6 and RNase A (+/−) (F). CUT&Tag analysis with anti-HisTag antibody and GST-His6-2×HBD is the most specific approach for R loop mapping. (G to I) Scatter plots of CUT&Tag signals from three different approaches. Pearson correlation was performed, and the r values are shown.
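The log2 fold changes plotted in panels (E) and (F) compare the signal at each peak with and without RNase A. A minimal sketch of that calculation follows; the pseudocount is an assumption added here to avoid division by zero and is not stated in the caption, and the function name and values are hypothetical.

```python
import math


def log2_fold_change(treated, untreated, pseudocount=1.0):
    """log2 ratio of treated to untreated signal at one peak.

    A pseudocount (assumed, not from the article) is added to both values
    so that peaks with zero signal do not produce an undefined ratio.
    """
    if treated < 0 or untreated < 0 or pseudocount <= 0:
        raise ValueError("signals must be non-negative and pseudocount positive")
    return math.log2((treated + pseudocount) / (untreated + pseudocount))


# Hypothetical per-peak signals: (with RNase A, without RNase A).
peaks = [(0.0, 3.0), (7.0, 7.0), (1.0, 15.0)]
changes = [log2_fold_change(plus, minus) for plus, minus in peaks]
```

A more negative value means a larger loss of signal after RNase A, so an approach whose peaks cluster at strongly negative values is the more RNase-sensitive one.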

  • Fig. 5 R loop CUT&Tag signals are sensitive to RNase H digestion.

    (A to D) UCSC genome browser tracks of CUT&Tag signals at the NPM1, RPL13A, YY1AP1, and FUS loci. The tracks were normalized by reads per million and the RNase H–treated groups were further normalized with the E. coli spike-in control. (E) Alignment rates of R loop CUT&Tag reads to the human hg38 and E. coli spiked-in genomes. Four-hour RNase H treatment markedly reduces the alignment rates of CUT&Tag reads to the human genome and increases the alignment rates of reads to E. coli spiked-in genomes. (F and G) Heatmap and metaplot analysis of R loop CUT&Tag signals at all of the peaks from GST-His6-2×HBD and S9.6 CUT&Tag. RNase H digestion markedly decreases the CUT&Tag signals at those peaks, demonstrating great specificity of GST-His6-2×HBD and S9.6 on R loop recognition. (H to J) Reproducibility of R loop CUT&Tag methods. Biological replicates were performed, and the Pearson correlation was calculated with the reads per million at R loop peaks.

  • Fig. 6 A systematic comparison of R loop CUT&Tag versus other conventional R loop mapping methods.

    (A and B) Track examples of HEK293T PRO-seq, GST-His6-2×HBD CUT&Tag, S9.6 CUT&Tag, MapR (26), R-ChIP (24), DRIPc-seq (34), and TT-seq signals at the HSPD1 (A) and GRK6 (B) loci. The reads were aligned to the human hg38 genome, and the signals were normalized by reads per million. (C) PCA plot showing R loop CUT&Tag, MapR, and R-ChIP clustered together. (D) Fingerprint plots of R loop CUT&Tag, MapR, and R-ChIP. w.r.t., with respect to. (E and F) Metaplots of signals detected by different R loop mapping methods, PRO-seq, and TT-seq around the 2-kb window of the TSSs and TESs. Strand-specific signals from PRO-seq, TT-seq, DRIPc-seq, and R-ChIP were used for plotting. (G) Heatmap analysis of PRO-seq, TT-seq, and R loop mapping methods at the TSS of transcriptionally active genes (the reads per million of PRO-seq signals at TSS, >1; n = 13,220). The heatmaps are sorted by the GST-His6-2×HBD CUT&Tag signals. R loop CUT&Tag assays, MapR, and R-ChIP have enrichment at the TSS, while DRIPc-seq does not show this trend. (H and I) Scatter plots of R loop CUT&Tag and MapR reads per kilobase, per million mapped reads (RPKM) values (H) or R-ChIP RPKM values (I) at TSS. The r values were calculated by Pearson correlation.

  • Fig. 7 R loop CUT&Tag sensitively detects signals at gene body and intergenic regions.

    (A) Scheme of calculation of signals at TSS and gene body. The signals were normalized by RPKM. (B) Box plots of RPKM values at the TSS and gene body from 13,181 transcriptionally active genes. (C) Scatter plot of R loop CUT&Tag signals at the TSS and gene body showing that R loop CUT&Tag is capable of genome-wide detection of R loops in the gene body. (D and E) Scatter plots of MapR (D) and R-ChIP (E) RPKM signals at the TSS and gene body. (F) Heatmap plots of the 3769 genes with R loop signals at the gene body. (G and H) The R loop signals at the gene body were negatively correlated with gene lengths. (I and J) The gene body R loop signals positively correlate with PRO-seq (I) and H3K36me3 (J) signals at the gene body. (K) Gene ontology (GO) analysis of the 3769 genes indicates that R loops may be involved in the regulation of various key biological processes. (L) Track examples of HEK293T PRO-seq, GST-His6-2×HBD CUT&Tag, and S9.6 CUT&Tag signals at the YY1 and ZNF557 genomic loci. The reads were normalized by reads per million, and the enhancers are indicated. (M) Heatmap analysis of R loop CUT&Tag signals at 3830 intergenic regions. The heatmaps were sorted by the GST-His6-2×HBD CUT&Tag signals, and the H3K27ac signals in HEK293T are shown. H3K27ac, histone 3 lysine 27 acetylation.
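RPKM, the normalization named in panel (A), divides a region's read count by both the region length in kilobases and the library size in millions. The sketch below applies the standard RPKM formula to a TSS window and a gene body; the window sizes, counts, and function name are hypothetical, since the caption does not specify how the regions were defined.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and library size must be positive")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


# Hypothetical gene in a library of 5 million mapped reads.
library_size = 5_000_000
tss_signal = rpkm(read_count=50, region_length_bp=1_000,
                  total_mapped_reads=library_size)
gene_body_signal = rpkm(read_count=100, region_length_bp=20_000,
                        total_mapped_reads=library_size)
```

Dividing by region length is what makes a narrow TSS window and a long gene body comparable: here the gene body holds twice as many reads as the TSS window but has a tenfold lower RPKM.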

  • Fig. 8 R loop signals are affected by ex vivo and in situ detecting strategies.

    (A) Workflow of DRIPc-seq with GST-His6-2×HBD or S9.6 combined with random fragmentation of genomic DNA by NEB dsDNA fragmentase. IP, immunoprecipitation. (B) Genome browser tracks of DRIPc-seq and R loop CUT&Tag coverage at the FUS and RPL13A loci detected by GST-His6-2×HBD and S9.6. Signals were normalized by reads per million. (C and D) Heatmap analysis of DRIPc-seq (ex vivo) and R loop CUT&Tag (native, in situ) at all the protein-coding genes by GST-His6-2×HBD (C) or S9.6 (D). (E and F) Metagene plots of DRIPc-seq and R loop CUT&Tag by GST-His6-2×HBD (E) or S9.6 (F).

Supplementary Materials

    Genomic profiling of native R loops with a DNA-RNA hybrid recognition sensor

    Kang Wang, Honghong Wang, Conghui Li, Zhinang Yin, Ruijing Xiao, Qiuzi Li, Ying Xiang, Wen Wang, Jian Huang, Liang Chen, Pingping Fang, Kaiwei Liang

    The PDF file includes:

    • Figs. S1 to S4

    Other Supplementary Material for this manuscript includes the following:
