Research Article | Chemical Physics

Deep neural network processing of DEER data


Science Advances  24 Aug 2018:
Vol. 4, no. 8, eaat5218
DOI: 10.1126/sciadv.aat5218
  • Fig. 1 Standard Tikhonov regularization processing, illustrated using site pair V96C/I143C in the lumenal loop of a double mutant of LHCII, with iodoacetamido-PROXYL spin labels attached to the indicated cysteines (64).

    For the primary data (top left), the zero time (green vertical line) is determined using moment analysis in the vicinity of the intensity maximum. The optimal starting time for background fitting (blue vertical line) is determined by minimizing probability density at the maximum distance. Data have been cut by 400 ns at the end (red vertical line) to minimize the influence of the artifact arising from overlapping pump and observe pulse excitation bands. The stretched exponential background fit is shown as a solid red line (where fitted) and as a dotted red line (where extrapolated). The background-corrected data (form factor, black) are shown in the top right panel together with fits using the regularization parameter corresponding to the origin distance criterion (red) and maximum curvature criterion (green). These two choices are also indicated in the L-curve (bottom left). The bottom right panel shows distance distributions computed with these two regularization parameters in matching color. Pastel background shading indicates distance ranges where the shape of the distribution is expected to be reliable (green), where mean distances and widths are expected to be reliable (yellow), where only mean distances are expected to be reliable (orange), and where data should not be interpreted (red). These ranges are derived from the duration of the primary data (7).
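
    The processing chain in this figure can be summarized in a short numerical sketch: build the dipolar kernel, then scan the regularization parameter to trace the L-curve whose corner supplies the parameter choice. This is a minimal illustration under assumed grids and a synthetic test signal, not the DeerAnalysis implementation.

    ```matlab
    % Minimal Tikhonov + L-curve sketch for a background-corrected DEER
    % form factor; grids and the synthetic signal are assumptions.
    npts = 256;
    t = linspace(0, 2e-6, npts).';               % time grid, s
    r = linspace(1.5e-9, 8e-9, npts).';          % distance grid, m

    % Dipolar kernel: powder average of cos((1 - 3cos^2(theta))*w_dd*t),
    % with w_dd = 2*pi * 52.04 MHz * (nm/r)^3
    ct = linspace(0, 1, 201);                    % cos(theta) grid
    K  = zeros(npts);
    for j = 1:npts
        w_dd = 2*pi*52.04e6*(1e-9/r(j))^3;
        K(:,j) = trapz(ct, cos((1 - 3*ct.^2).*w_dd.*t), 2);
    end

    % Synthetic noisy form factor from a Gaussian distance distribution
    p_true = exp(-0.5*((r - 3.5e-9)/0.3e-9).^2);  p_true = p_true/sum(p_true);
    s = K*p_true + 0.01*randn(npts, 1);

    % L-curve: scan lambda, solve the non-negative Tikhonov problem on the
    % augmented system, record residual and penalty norms
    L = diff(eye(npts), 2);                      % second-derivative operator
    lambdas = logspace(-4, 2, 20);
    rho = zeros(size(lambdas));  eta = zeros(size(lambdas));
    for k = 1:numel(lambdas)
        p = lsqnonneg([K; lambdas(k)*L], [s; zeros(size(L,1), 1)]);
        rho(k) = norm(K*p - s);                  % residual norm
        eta(k) = norm(L*p);                      % penalty norm
    end
    loglog(rho, eta, 'o-');                      % corner gives the lambda choice
    ```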

  • Fig. 2 One of the millions of synthetic DEER data sets, generated using Spinach (25) and used for neural network training in this work.

    (Left) Randomly generated distance distribution. (Right) The corresponding DEER form factor (purple), a randomly generated noise track (yellow), a randomly generated intermolecular background signal (red, marked BG), and the resulting “experimental” DEER signal (blue). a.u., arbitrary units.
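
    A minimal sketch of how such a training pair can be generated: sample a random distance distribution, push it through the dipolar kernel to obtain the form factor, then multiply by a stretched-exponential background and add noise. Parameter ranges follow Table 1; the peak shape is simplified to a Gaussian (no skew), and none of the names below are the actual Spinach code.

    ```matlab
    % One synthetic training pair: distribution P and noisy trace V
    npts = 256;
    t = linspace(0, 3e-6, npts).';                    % time grid, s
    r = linspace(1.5e-9, 8e-9, npts).';               % distance grid, m

    % Random distance distribution with 1-3 Gaussian peaks
    P = zeros(npts, 1);
    for k = 1:randi([1 3])
        r0 = 2e-9 + 5e-9*rand;                        % peak position
        fw = r0*(0.05 + 0.45*rand);                   % FWHM as a fraction of r0
        P  = P + rand*exp(-4*log(2)*((r - r0)/fw).^2);
    end
    P = P/sum(P);

    % Dipolar kernel (powder average over cos(theta))
    ct = linspace(0, 1, 201);
    K  = zeros(npts);
    for j = 1:npts
        w_dd = 2*pi*52.04e6*(1e-9/r(j))^3;            % dipolar frequency, rad/s
        K(:,j) = trapz(ct, cos((1 - 3*ct.^2).*w_dd.*t), 2);
    end

    lam = 0.05 + 0.55*rand;                           % modulation depth
    F   = (1 - lam) + lam*(K*P);                      % DEER form factor
    d   = 2 + 1.5*rand;                               % background dimensionality
    kbg = 0.5e6*rand;                                 % background decay rate, Hz
    B   = exp(-(kbg*t).^(d/3));                       % stretched-exponential background
    V   = F.*B + (0.05 + 0.05*rand)*lam*randn(npts,1);% noisy "experimental" signal
    ```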

  • Fig. 3 Schematic diagrams (produced by MATLAB) of the three types of neural network topologies explored in this work, using four-layer networks as an example.

    The W blocks indicate multiplication by a weight matrix, and the b blocks indicate the addition of a bias vector. (Top) Fully connected full-width network. (Middle) Fully connected network with choke points. (Bottom) Functionally separated network with some layers explicitly dedicated to background rejection and others to interpretation; during the training process, the first output is the DEER form factor, and the second output is the distance probability density function.

  • Fig. 4 Distance distribution recovery performance illustration for a five-layer feedforward neural network, fully connected, with 256 neurons per layer.

    All inner layers have hyperbolic tangent transfer functions; the last layer has the strictly positive logistic sigmoid transfer function.
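
    In MATLAB's Deep Learning Toolbox, a network of this shape can be sketched as follows; the placeholder training data and the training options are assumptions, not the authors' exact settings.

    ```matlab
    % Five fully connected 256-wide layers, tanh inner transfer functions,
    % logistic sigmoid output; placeholder data stand in for the database.
    V_db = rand(256, 1000);                    % placeholder DEER traces (columns)
    P_db = rand(256, 1000);                    % placeholder distance distributions

    net = feedforwardnet([256 256 256 256]);   % 4 hidden layers; the 256-point
                                               % target makes the 5th (output)
                                               % layer 256 neurons wide
    for k = 1:net.numLayers - 1
        net.layers{k}.transferFcn = 'tansig';  % hyperbolic tangent inner layers
    end
    net.layers{end}.transferFcn = 'logsig';    % strictly positive output
    net.trainFcn = 'trainscg';                 % scaled conjugate gradient
    net = train(net, V_db, P_db);              % add 'useGPU','yes' for a GPU
    ```

    The six-layer form factor network of Fig. 5 differs only in depth and in using the hyperbolic tangent transfer function on the output layer as well.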

  • Fig. 5 DEER form factor recovery performance illustration for a six-layer feedforward neural network, fully connected, with 256 neurons per layer.

    All layers have hyperbolic tangent transfer functions.

  • Fig. 6 Singular values of the weight matrices in a six-layer feedforward neural network, fully connected, with 256 neurons per layer, and trained as described in the main text.

    All inner layers have hyperbolic tangent transfer functions; the last layer has the strictly positive logistic sigmoid transfer function.
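
    Given a trained network of this kind, the singular value spectra shown in the figure can be reproduced with a few lines; this assumes the feedforwardnet object "net" from the sketch after Fig. 4.

    ```matlab
    % Singular value spectra of the weight matrices, layer by layer.
    % Input weights live in net.IW, inter-layer weights in net.LW.
    figure; hold on;
    semilogy(svd(net.IW{1,1}));                % first-layer weight matrix
    for k = 2:net.numLayers
        semilogy(svd(net.LW{k,k-1}));          % subsequent weight matrices
    end
    set(gca, 'YScale', 'log');                 % keep log scale after hold
    xlabel('singular value index'); ylabel('singular value');
    ```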

  • Fig. 7 Performance of an ensemble of 100 five-layer neural networks on a previously unseen database.

    Each of the networks was started from a different random initial guess and trained on a different randomly generated database. Red dots indicate the good networks, those that are better than the median on both the mean relative error and the worst relative error. The blue asterisk marks the performance of the average output of the good networks.
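
    The selection rule can be written down directly. In this sketch, "nets" is a cell array of trained networks, V holds validation traces in its columns, Ptrue holds the reference distributions, and the relative error definition (2-norm of the residual over the 2-norm of the truth) is my assumption.

    ```matlab
    % Keep networks beating the ensemble median on both error statistics,
    % then average the outputs of the survivors.
    nnets = numel(nets);
    meanErr  = zeros(nnets, 1);
    worstErr = zeros(nnets, 1);
    for k = 1:nnets
        Pest = nets{k}(V);                          % outputs, one column per trace
        relErr = sqrt(sum((Pest - Ptrue).^2, 1)) ./ sqrt(sum(Ptrue.^2, 1));
        meanErr(k)  = mean(relErr);
        worstErr(k) = max(relErr);
    end
    good = (meanErr < median(meanErr)) & (worstErr < median(worstErr));
    Pens = zeros(size(Ptrue));
    for k = find(good).'
        Pens = Pens + nets{k}(V);                   % accumulate good outputs
    end
    Pens = Pens / nnz(good);                        % ensemble-averaged answer
    ```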

  • Fig. 8 Network ensemble performance illustration.

    Easy (left), tough (middle), and worst-case (right) agreement on the training set data. The variation in the outputs of different neural networks within the ensemble is a measure of the uncertainty in their output (63) when the training databases are comprehensive.
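
    A pointwise spread across the good networks gives the uncertainty band; this sketch assumes "nets" and "good" from the ensemble sketch after Fig. 7 and a single trace v (column vector).

    ```matlab
    % Ensemble spread as an uncertainty estimate for one trace v
    outs  = cell2mat(arrayfun(@(k) nets{k}(v), find(good).', ...
                     'UniformOutput', false));   % one column per good network
    Pmean = mean(outs, 2);                       % ensemble-average distribution
    Pband = std(outs, 0, 2);                     % pointwise spread = uncertainty
    ```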

  • Fig. 9 A demonstration that deep neural networks learn to be Fredholm solvers rather than model fitters.

    Presenting a data set with four distances to networks trained on a database with at most three distances yields the right answer with high confidence: all networks in the ensemble return four peaks.

  • Fig. 10 Distance distributions obtained by Tikhonov regularization (blue lines) and uncertainties estimated by the DeerAnalysis validation tool (pink areas) for the six experimental test cases.

    (A) Site pair V96C/I143C in the lumenal loop of a double mutant of LHCII, with iodoacetamido-PROXYL spin labels attached to the indicated cysteines (64); (B) site pair S3C/S34C in the N-terminal domain of a double mutant of the LHCII monomers, with iodoacetamido-PROXYL spin labels attached to the indicated cysteines (64); (C) end-labeled oligo(para-phenyleneethynylene)—a rigid linear molecule described as compound 3a in (37); (D) [2]catenane (a pair of large interlocked rings) with a nitroxide spin label on each ring described as sample II in (65); (E) pairs of nitroxide radicals tethered to the surface of gold nanoparticles, with the thiol tether attachment points diffusing on the surface of the nanoparticle, sample Au3 after solvolysis and heating in (66); (F) rigid molecular triangle labeled with nitroxide radicals on two corners out of three, sample B11inv in (16).

  • Fig. 11 DEERNet performance on sample I: A site pair V96C/I143C in the lumenal loop of a double mutant of LHCII, with iodoacetamido-PROXYL spin labels attached to the indicated cysteines (64).

    Residue 96 is located in the lumenal loop, and residue 143 is a structurally rigid “anchor” position in the protein core. In agreement with the results reported in the original paper, a bimodal distance distribution is measured—indicating flexibility in the lumenal loop. The low-confidence peak around 57 Å likely results from protein aggregation.

  • Fig. 12 DEERNet performance on sample II: A site pair S3C/S34C in the N-terminal domain of a double mutant of LHCII, with iodoacetamido-PROXYL spin labels attached to the indicated cysteines (64).

    The data stem from LHCII monomers. Residue 3 is located in the very flexible N-terminal region, while residue 34 is located in the structured part of the N-terminal domain.

  • Fig. 13 DEERNet performance on sample III: End-labeled oligo(para-phenyleneethynylene)—a rigid linear molecule described as compound 3a in (37).

    The maximum and the width of the distance distribution are in close agreement with the Tikhonov regularization results, whereas the expected skew of the distribution is not reproduced. Notably, the low-intensity artifacts that the Tikhonov method produces around the baseline are absent.

  • Fig. 14 DEERNet performance on sample IV: [2]catenane (a pair of large interlocked rings) with a nitroxide spin label on each ring.

    The distance distribution is in line with rough statistical estimates [Figure 5 in (65)], but there are fewer clumping artifacts compared to the output of the automatic Tikhonov regularization procedure. Within the Tikhonov framework, a manual regularization coefficient adjustment away from the corner of the L-curve is necessary to produce a distribution free of clumping artifacts.

  • Fig. 15 DEERNet performance on sample V: Pairs of nitroxide radicals tethered to the surface of gold nanoparticles, with the thiol tether attachment points diffusing on the surface of the nanoparticle (66).

    Note the markedly better performance relative to the Tikhonov method: clumping artifacts are completely absent, and the match to the analytical model is remarkable, down to the maximum exhibited by the broad feature around 35 Å.

  • Fig. 16 Tikhonov distance distribution analysis for pairs of nitroxide radicals tethered to the surface of gold nanoparticles, with the thiol tether attachment points diffusing on the surface of the nanoparticle [sample Au3 after solvolysis and heating in (66)].

    Green lines correspond to a model fit assuming a Gaussian distribution of distances and a homogeneous distribution of the biradicals on spherical nanoparticles with a Gaussian distribution of radii. Blue lines correspond to Tikhonov regularization with the regularization parameter in the L-curve corner as suggested by DeerAnalysis. Red lines correspond to Tikhonov regularization with a larger regularization parameter corresponding to the second L-curve corner. (A) Fits of the background-corrected DEER data (black). (B) Distance distributions. (C) L-curve and the two points selected for Tikhonov distance distribution analysis.

  • Fig. 17 DEERNet performance on sample VI: A rigid molecular triangle labeled with nitroxide radicals on two out of three corners (16).
  • Fig. 18 A demonstration of exchange coupling resilience.

    The networks were trained on a database in which each DEER trace has an exchange coupling randomly selected within the ±5-MHz interval (top row, J = –1.9 MHz; middle row, J = +2.9 MHz; bottom row, J = –3.6 MHz), with all other parameters as described in the “Training database generation” section. More than 99% of the training data set (including distributions with multiple distance peaks) produces results of the kind shown in the top and middle panels: fast exchange oscillations are rejected and correct distance distributions are produced. With very noisy data (bottom), the networks duly report being highly uncertain.
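
    Adding the exchange coupling to the training-set generator amounts to a one-line change in the kernel: under the usual weak-coupling point-dipole model, J offsets the dipolar modulation frequency by a constant. The sketch below is an assumed reading of that model, not the authors' code; grids and names are illustrative.

    ```matlab
    % Dipolar kernel with a random exchange coupling J in +/-5 MHz
    npts = 256;
    t  = linspace(0, 3e-6, npts).';              % time grid, s
    r  = linspace(1.5e-9, 8e-9, npts).';         % distance grid, m
    ct = linspace(0, 1, 201);                    % cos(theta) grid
    J  = (2*rand - 1)*5e6;                       % exchange coupling, Hz
    K  = zeros(npts);
    for j = 1:npts
        w_dd = 2*pi*52.04e6*(1e-9/r(j))^3;       % dipolar frequency, rad/s
        K(:,j) = trapz(ct, cos(((1 - 3*ct.^2)*w_dd + 2*pi*J).*t), 2);
    end
    ```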

  • Table 1 Training database generation parameters used in this work.

    Where a maximum value and a minimum value are given, the parameter is selected randomly within the indicated interval for each new entry in the database. Ranges in the suggested values indicate recommended intervals for the corresponding parameter; a sampling sketch follows the table.

    Parameter | Suggested values
    --- | ---
    Minimum distance in the distribution (Å) | 10–15
    Maximum distance in the distribution (Å) | 50–80
    DEER trace length (μs) | 2–5
    Minimum number of distance peaks | 1–2
    Maximum number of distance peaks | 2–3
    Data vector size | 256–1024
    RMS noise, fraction of the modulation depth | 0.05–0.10
    Minimum exchange coupling (MHz) | −5.0
    Maximum exchange coupling (MHz) | +5.0
    Minimum background dimensionality | 2
    Maximum background dimensionality | 3.5
    Minimum full width at half magnitude for distance peaks, fraction of the distance | 0.05–0.10
    Maximum full width at half magnitude for distance peaks, fraction of the distance | 0.20–0.50
    Maximum shape parameter (Eq. 11) | +3.0
    Minimum shape parameter (Eq. 11) | −3.0
    Minimum modulation depth | 0.05–0.10
    Maximum modulation depth | 0.50–0.60
    Minimum background decay rate (MHz) | 0.0
    Maximum background decay rate (MHz) | 0.5
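
    The two-level convention described in the caption can be made concrete with a short sketch: each bound is first fixed somewhere inside its suggested interval, and every database entry then draws its parameter uniformly between the fixed bounds. This is a minimal illustration for the peak-width rows of the table; all variable names are my own.

    ```matlab
    % Two-level sampling sketch (assumed reading of the Table 1 caption)
    wMin = 0.05 + 0.05*rand;                      % min FWHM fraction, from 0.05-0.10
    wMax = 0.20 + 0.30*rand;                      % max FWHM fraction, from 0.20-0.50
    nEntries = 100000;                            % database size used in Tables 2-4
    w = wMin + (wMax - wMin)*rand(nEntries, 1);   % per-entry peak widths
    ```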
  • Table 2 Performance statistics for a family of feedforward networks set up as a simple sequence of fully connected layers of the same width as the input vector.

    A schematic of the network topology is given in the top diagram of Fig. 3.

    Task | Network | Mean relative error | Relative error SD | Iteration time*, Tesla K40 (s)
    --- | --- | --- | --- | ---
    Distance distribution recovery | In-(256)2-Out | 0.090 | 0.231 | 0.32
    Distance distribution recovery | In-(256)3-Out | 0.077 | 0.208 | 0.44
    Distance distribution recovery | In-(256)4-Out | 0.070 | 0.195 | 0.74
    Distance distribution recovery | In-(256)5-Out | 0.069 | 0.194 | 0.99
    Distance distribution recovery | In-(256)6-Out | 0.069 | 0.192 | 1.19
    Form factor recovery | In-(256)2-Out | 0.0065 | 0.0143 | 0.31
    Form factor recovery | In-(256)3-Out | 0.0042 | 0.0094 | 0.51
    Form factor recovery | In-(256)4-Out | 0.0037 | 0.0084 | 0.75
    Form factor recovery | In-(256)5-Out | 0.0034 | 0.0080 | 0.98
    Form factor recovery | In-(256)6-Out | 0.0034 | 0.0080 | 1.18

    *Using a database with 100,000 DEER traces generated as described in the “Training database generation” section.

    • Table 3 Performance statistics for a family of feedforward networks set up as a simple sequence of fully connected layers with a choke point in the middle.

      A schematic of the network topology is given in the middle diagram of Fig. 3.

      Network | Mean relative error | Relative error SD | Iteration time*, Tesla K40 (s)
      --- | --- | --- | ---
      In-256-32-256-Out | 0.095 | 0.230 | 0.25
      In-256-64-256-Out | 0.086 | 0.217 | 0.29
      In-256-128-256-Out | 0.084 | 0.217 | 0.39
      In-256-256-256-Out | 0.077 | 0.208 | 0.51
      In-(256)2-32-(256)2-Out | 0.090 | 0.210 | 0.61
      In-(256)2-64-(256)2-Out | 0.074 | 0.201 | 0.65
      In-(256)2-128-(256)2-Out | 0.073 | 0.200 | 0.83
      In-(256)2-256-(256)2-Out | 0.069 | 0.194 | 0.99

      *Using a database with 100,000 DEER traces generated as described in the “Training database generation” section.

      • Table 4 Performance statistics for a family of tailored networks composed of a group of form factor extraction layers that form the input of the interpretation layers.

        A schematic of the network topology is given in the bottom diagram of Fig. 3. FF, form factor; Int, interpretation.

        Network topology | Int: mean relative error | Int: relative error SD | FF: mean relative error | FF: relative error SD
        --- | --- | --- | --- | ---
        In-FF[(256)1]-Int[(256)1]-Out | 0.276 | 0.467 | 0.355 | 0.383
        In-FF[(256)2]-Int[(256)2]-Out | 0.103 | 0.236 | 0.040 | 0.083
        In-FF[(256)3]-Int[(256)3]-Out | 0.094 | 0.225 | 0.021 | 0.046
        In-FF[(256)4]-Int[(256)4]-Out | 0.087 | 0.216 | 0.015 | 0.028
        In-FF[(256)5]-Int[(256)5]-Out | 0.081 | 0.196 | 0.012 | 0.023
        In-FF[(256)6]-Int[(256)6]-Out | 0.080 | 0.192 | 0.012 | 0.022

      Supplementary Materials

      • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/8/eaat5218/DC1

        Section S1. DEER kernel derivation

        Section S2. Performance illustrations for networks of different depth

        Section S3. Effects of transfer functions, choke points, and bias vectors

        Section S4. Behavior of Tikhonov regularization for exchange-coupled systems

        Section S5. Behavior of neural networks with the increasing level of noise

        Fig. S1. DEERNet performance illustration, distance distribution recovery: two-layer feedforward network, fully connected, with 256 neurons per layer.

        Fig. S2. DEERNet performance illustration, distance distribution recovery: three-layer feedforward network, fully connected, with 256 neurons per layer.

        Fig. S3. DEERNet performance illustration, distance distribution recovery: four-layer feedforward network, fully connected, with 256 neurons per layer.

        Fig. S4. DEERNet performance illustration, form factor recovery: two-layer feedforward network, fully connected, with 256 neurons per layer.

        Fig. S5. DEERNet performance illustration, form factor recovery: three-layer feedforward network, fully connected, with 256 neurons per layer.

        Fig. S6. DEERNet performance illustration, form factor recovery: four-layer feedforward network, fully connected, with 256 neurons per layer.

        Fig. S7. Tikhonov analysis of synthetic data produced as described in the main text and featuring a unimodal distance distribution in the presence of a fixed exchange coupling (cf. Fig. 17).

        Fig. S8. A randomly generated DEER data set with the noise SD set at 2.5% of the modulation depth and the resulting distance distribution reconstruction by DEERNet.

        Fig. S9. A randomly generated DEER data set with the noise SD set at 10% of the modulation depth and the resulting distance distribution reconstruction by DEERNet.

        Fig. S10. A randomly generated DEER data set with the noise SD set at 30% of the modulation depth and the resulting distance distribution reconstruction by DEERNet.

        Table S1. Distance distribution recovery performance statistics for feedforward networks with hyperbolic tangent sigmoid (tansig) and logistic sigmoid (logsig) transfer function at the last layer.

        Table S2. Performance statistics for a family of feedforward networks set up as a sequence of fully connected layers with a choke point in the position indicated.
