Research Article | BIOCHEMISTRY

Measuring quantitative effects of methylation on transcription factor–DNA binding affinity


Science Advances  17 Nov 2017:
Vol. 3, no. 11, eaao1799
DOI: 10.1126/sciadv.aao1799
  • Fig. 1 Overview of Methyl-Spec-seq.

    (A) Schematic representation of the general workflow of Methyl-Spec-seq (see Materials and Methods). Briefly, differentially barcoded DNA libraries with variable regions are mixed and used in protein-DNA binding reactions. The DNA libraries are either treated with M.SssI methyltransferase to incorporate methyl-CpGs or left untreated, and they can also contain synthetic 5′-methyl cytidine (mC). The letters “M” and “W” in red represent mC and mC on the complementary strand opposing a G, respectively. Following the binding reaction, the protein-DNA complex is separated from the unbound DNA on a 9% polyacrylamide gel. The bound and unbound fractions are then polymerase chain reaction (PCR)–amplified (eight cycles) using Illumina-specific primers (text S1), and the resulting indexed samples are sequenced to generate energy logos for the binding sites. (B) Randomized double-stranded DNA (dsDNA) library used to measure the binding specificity of ZFP57 and the effect of methylation on binding. The full-length DNA libraries are shown in text S1. The regions highlighted in blue are the unique barcodes that distinguish the libraries during sequencing, whereas the “N”s in bold represent variable regions within the libraries. (C) Relative binding energy for all 64 variants in R2 libraries with different types of methylation, ranked from low to high binding energy of the unmethylated DNA. The relative binding energies are expressed in units of kT, where k is the Boltzmann constant and T is the temperature used in the binding experiments. The 64 sequences of the R2 libraries are listed vertically, and the relative binding energies for each methylation status are plotted. (D) Meth-eLogo based on the regression of the ZFP57 reference site and all its single variants. The significant effect of methylation at positions 4 and 5, which is the binding site for finger 3 (F3), is also shown. The effect of CpG methylation (mCpG) on binding specificity was calculated from the ePWM listed in fig. S2.
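
As a rough illustration of how relative binding energies like those in (C) can be obtained from bound and unbound read counts, the sketch below assumes the usual Spec-seq relationship: the bound-to-unbound read ratio of a sequence is taken as proportional to its association constant, so its energy relative to a reference is the negative natural log of the ratio of ratios. The function name, the three-base sequences, and the counts are hypothetical toy values, not data from the article.

```python
import math

def relative_energies(bound, unbound, reference):
    """Return relative binding energies in kT for each sequence.

    bound / unbound: dicts mapping sequence -> read count (toy data here).
    reference: the sequence whose energy is defined as 0 kT.
    Assumes every sequence has nonzero counts in both fractions.
    """
    ref_ratio = bound[reference] / unbound[reference]
    energies = {}
    for seq in bound:
        ratio = bound[seq] / unbound[seq]
        # Lower energy means tighter binding, hence the minus sign.
        energies[seq] = -math.log(ratio / ref_ratio)
    return energies

# Hypothetical counts; "M" marks a methylated C as in the caption.
bound = {"ACG": 4000, "AMG": 8000, "ATG": 500}
unbound = {"ACG": 2000, "AMG": 1000, "ATG": 2000}
energies = relative_energies(bound, unbound, reference="ACG")
```

With these toy counts the methylated variant comes out at a negative energy (stronger binding than the reference) and the `ATG` variant at a positive one; sorting `energies` by value gives a ranking of the kind plotted in (C).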

  • Fig. 2 Overview of 2color-CFA.

    (A) General workflow of 2color-CFA (see Materials and Methods). Briefly, DNAs with or without mC are labeled with two different fluorophores, FAM and TAMRA, and mixed together. This mixture of DNAs is titrated with increasing protein concentration, and the fluorescence anisotropy of both fluorophores is measured and plotted. (B) Competitor oligos with different methylation states are labeled with FAM. The reference probe is duplex methylated and labeled with TAMRA. (C) FAM versus TAMRA anisotropy correspondence curves for different competitors and the reference sequence. The horizontal axis represents the FAM anisotropy of the protein-competitor complex, whereas the vertical axis represents the TAMRA anisotropy of the protein-reference complex. The inset shows the binding energy differences between the competitor sequences and the reference. Energy differences are computed from the natural log of the ratio in Eq. 3, obtained from the best linear fit to the data using only points with TAMRA anisotropy values <150.

  • Fig. 3 Methyl-Spec-seq analysis of CTCF.

    (A) Randomized dsDNA libraries for CTCF. The full-length libraries with the 5′ and 3′ flanking sites are shown in fig. S4. The 3′ internal barcodes are highlighted in blue. These DNAs were either methylated using CpG methyltransferase (M.SssI) or left untreated (Un) before being mixed and used in the binding assay (see Materials and Methods). (B) Comparison of binding energies between unmethylated (horizontal axis) and methylated (vertical axis) sites. The red circles denote CpG-containing sites in the libraries, whereas the blue circles represent sites that do not contain any CpG. The 0.25 kT energy deviation bounds are shown as dashed lines. (C) Energy logo based on the CTCF reference site and all of its single-nucleotide variants. The substantial effect of methylation at positions 2 and 3 is highlighted.
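
The comparison in (B) can be sketched as a simple filter: for each site, take the methylated minus unmethylated energy and check whether it falls outside the ±0.25 kT band. The code below is a minimal sketch under that reading of the dashed bounds; the function name, site labels, and energies are hypothetical.

```python
def methylation_shifts(unmethylated, methylated, threshold=0.25):
    """Split sites by whether methylation shifts the energy beyond the threshold.

    unmethylated / methylated: dicts mapping site -> relative energy in kT.
    Returns (shifted, unchanged), each mapping site -> energy difference.
    """
    shifted, unchanged = {}, {}
    for site, e_un in unmethylated.items():
        delta = methylated[site] - e_un
        # Sites outside the +/- threshold band count as methylation sensitive.
        target = shifted if abs(delta) > threshold else unchanged
        target[site] = delta
    return shifted, unchanged

# Toy energies in kT, not taken from the article.
unmethylated = {"site_cpg_1": 0.0, "site_cpg_2": 0.8, "site_no_cpg": 1.5}
methylated = {"site_cpg_1": 1.1, "site_cpg_2": 0.2, "site_no_cpg": 1.6}
shifted, unchanged = methylation_shifts(unmethylated, methylated)
```

A positive difference means methylation weakens binding and a negative one means it strengthens binding; sites without a CpG would be expected to land in `unchanged`.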

  • Fig. 4 Methyl-Spec-seq analysis of BATF1.

    (A) Cartoon of JunB and BATF1 proteins binding dsDNA. (B) DNA libraries with 3-bp randomized regions for AP1. See text S1 for the full-length DNAs with flanking sites. (C) Binding energy logos for AP1 binding to 7-bp sites. (D) Binding energy logos for AP1 binding to unmethylated 8-bp sites. (E) Binding energy logo for 8-bp sites, including the effects of methylation. All logos are based on consensus sites for the 7- and 8-bp sequences and their respective single-nucleotide variants. Only half-sites are shown, under the assumption of symmetric binding sites.

  • Fig. 5 Methyl-Spec-seq analysis of GLI1.

    (A) Energy logo based on the regression of GLI1 binding sites. The methylation effect at positions 10 to 14 (binding site for fingers 2 and 3) is included. See text S1 for a list of libraries, with differential methylation profiles, used for binding studies. (B) Comparative genomics for the protein patched homolog 1 (PTCH1) regulatory element bound by GLI1 and the adjacent bases that include a ZFP57 consensus sequence in placental mammals. (C) The effect of methylation on two CpG loci inside the PTCH1 element. (D) Alignment scoring of the PTCH1 adjacent regulatory element (B) and 50-bp flanking sites (human genome) against ZFP57 and GLI1 ePWMs, including the effect of methylation. See fig. S2 for the ePWMs. “-MW” refers to predictions on mCpG-containing DNA.
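
Alignment scoring of the kind described in (D) can be sketched as sliding an ePWM along a sequence and summing per-position energies, with fully methylated CpGs written as “MW” so that the matrix needs columns for six letters. The code below is a minimal forward-strand sketch; the helper names, the 3-bp matrix, and the sequence are hypothetical and are not the ePWMs of fig. S2.

```python
def methylate_cpg(seq):
    """Rewrite every CpG as MW (mC on both strands), per the M/W nomenclature."""
    return seq.replace("CG", "MW")

def scan(seq, epwm):
    """Return (start, energy) for every window; energy is the sum of column entries."""
    width = len(epwm)
    scores = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        scores.append((start, sum(col[base] for col, base in zip(epwm, window))))
    return scores

def best_site(seq, epwm):
    """Lowest-energy (strongest predicted) window on the forward strand."""
    return min(scan(seq, epwm), key=lambda item: item[1])

# Toy 3-position ePWM in kT over the letters A, C, G, T, M, W.
epwm = [
    {"A": 1.0, "C": 1.5, "G": 2.0, "T": 0.0, "M": 1.5, "W": 2.0},
    {"A": 2.0, "C": 0.0, "G": 1.5, "T": 1.0, "M": -1.0, "W": 1.5},
    {"A": 1.5, "C": 2.0, "G": 0.0, "T": 1.0, "M": 2.0, "W": -0.5},
]
seq = "AATCGAA"
best_unmethylated = best_site(seq, epwm)
best_methylated = best_site(methylate_cpg(seq), epwm)
```

In this toy matrix methylation lowers the energy of the best window, which is the kind of difference a “-MW” prediction is meant to capture. A fuller version would also scan the reverse complement, which requires swapping M and W along with the ordinary bases.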

  • Fig. 6 HOXB13-binding specificity and methylation sensitivity.

    (A) The randomized library design for mouse HOXB13, with a total diversity of 100. Two-base barcodes were used to differentiate each methylation state. (B) For unmethylated DNA, an ePWM was generated from the binding energies of TCG and all single-base variants. From that ePWM, the binding energy is predicted for all unmethylated sequences. The predicted and measured energies are plotted. (C) Energy logo based on the regression of primary motif site TCG and all its single variants, with different methylation states included. (D) Energy logo based on the regression of secondary motif site CCA/CAA and all single variants, including methylation states. (E) All sequences, for each possible methylation state, within a range of 1.6 kT (either positive or negative) of the reference unmethylated TCG.
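
The prediction step in (B) relies on an additive model: each position contributes independently, so the energy of any sequence is the sum of its per-position entries in the ePWM. The sketch below reads the matrix directly off a reference site and its single-base variants rather than fitting it by regression, which is a simplification; the function names and energies are hypothetical toy values.

```python
from itertools import product

BASES = "ACGT"

def build_epwm(reference, energies):
    """Build an additive energy matrix from a reference and its single variants.

    energies: dict mapping sequence -> relative energy in kT; it must contain
    the reference and every single-base variant of it.
    """
    epwm = []
    for pos in range(len(reference)):
        column = {}
        for base in BASES:
            variant = reference[:pos] + base + reference[pos + 1:]
            column[base] = energies[variant] - energies[reference]
        epwm.append(column)
    return epwm

def predict(seq, epwm):
    """Predicted energy of seq: the sum of independent position contributions."""
    return sum(col[base] for col, base in zip(epwm, seq))

# Toy energies (kT) for the reference TCG and its nine single-base variants.
energies = {
    "TCG": 0.0,
    "ACG": 1.0, "CCG": 1.2, "GCG": 1.5,
    "TAG": 2.0, "TGG": 1.8, "TTG": 0.9,
    "TCA": 0.7, "TCC": 1.6, "TCT": 1.1,
}
epwm = build_epwm("TCG", energies)
predictions = {"".join(s): predict(s, epwm) for s in product(BASES, repeat=3)}
```

Plotting `predictions` against measured energies for the same sequences gives a predicted versus measured comparison like the one in (B); points far from the diagonal indicate sequences where the additive assumption breaks down.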

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/3/11/eaao1799/DC1

    fig. S1. General illustration of the use of M and W nomenclature to represent methylated bases in a DNA sequence.

    fig. S2. Methyl-Spec-seq ePWMs.

    fig. S3. Replicates of FAM and TAMRA anisotropy signals that were used to calculate the effect of mC on the relative binding specificity of ZFP57.

    fig. S4. The relative binding energies of all 64 variants (AP1 libraries) with different methylation profiles, ranked from the strongest (lowest energy) to the weakest binder of the unmethylated library.

    fig. S5. Replicate experiments with HOXB13.

    fig. S6. EMSA sample images for mouse ZFP57 (F1 to F3) and CTCF (F1 to F9).

    fig. S7. EMSA sample images for Gli1, JunB/BATF, and HOXB13.

    fig. S8. Schematic maps of plasmids used for cloning and expression of proteins.

    text S1. DNA oligo sequences for primers and libraries.

    text S2. Instructions for software use.
