Research Article
APPLIED SCIENCES AND ENGINEERING

Machine learning in a data-limited regime: Augmenting experiments with synthetic data uncovers order in crumpled sheets


Science Advances  26 Apr 2019:
Vol. 5, no. 4, eaau6792
DOI: 10.1126/sciadv.aau6792
  • Fig. 1 Examples of crease networks.

    (A) A 10 cm by 10 cm sheet of Mylar that has undergone a succession of rigid flat folds. (B) A sheet of Mylar that has been crumpled. (C) A simulated rigid flat-folded sheet. The sheet has been folded 13 times. Ridges are colored red, and valleys are blue.

  • Fig. 2 A schematic of the processing pipeline.

    (A) From the height map, a mean curvature map is calculated and denoised with a Radon transform–based method. Valleys (black) and ridges (red) are separated. The binary image of the valleys (X) is the input to the neural network (N). The distance transform of the binary image of the ridges is the target (Y). Brighter colors represent regions closer to ridges. These color conventions are consistent across all figures in this paper. (B) Two samples of predictions on generated data. The true fold network is superimposed on the predicted distance map. The true ridges (red) coincide perfectly with the bright colors, demonstrating strong predictive power. Below the predictions, we show confusion matrices in which pixels are grouped by distance into the nearest third, the middle third, and the furthest third. (C) Two predictions, as well as their corresponding confusion matrices, using the network trained on generated data (without noise) and applied to experimental scans.

  • Fig. 3 Effect of noise type on prediction.

    (A to E) An example noised image (top), an example prediction (middle), and the corresponding confusion matrix (bottom) for different types of artificial noise. Noise types are described concisely in the title of each panel, and complete specifications are given in Materials and Methods. (F) The upper left value of the confusion matrix when each pixel of the near-perfect prediction from Fig. 2B was randomly toggled with probability P. (G) The network from (E) applied to an additional experimental scan (from the left panel of Fig. 2C). The average confusion matrix over all experimental scans is shown.

  • Fig. 4 Predictions on crumpling.

    (A) One sheet that was successively crumpled, shown after four and seven crumpling iterations. Color code follows Fig. 2. (B) Close-ups of selected smaller patches from the same image, broken down into prediction, prediction and target, and prediction and input.

  • Fig. 5 Effect of fraction generated data.

    (A) Three quantifications of the predictive power of the model when trained on varying amounts of generated data and a constant amount of crumpling data. Strong predictive power corresponds to low loss (red) and large Pearson correlation and classification accuracy (blue and green, respectively). (B) Deterioration (see main text) for each sheet in the validation set, as a function of the rescaled loss. Colors correspond to different perturbations and marker styles to cross-validation samples. All tested perturbations lead to worse predictive power (points above the gray reference line). The few points below the reference line occur at high crumple number and low absolute loss. (C) Histogram of all points in (B). Values to the right of the red line correspond to deterioration when using unphysical data. (D) Example target and predictions for the various models considered in previous panels.
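
The captions for Fig. 2A and Fig. 5A describe a distance-transform target and three ways of scoring a prediction against it. The following is a minimal sketch of those quantities on a toy image, not the authors' code. The function names are hypothetical, the loss is assumed to be a mean squared error, and the thirds are assumed to be computed separately for each map from its own distance quantiles; none of these choices is stated in the captions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def distance_target(ridges):
    """Distance from every pixel to the nearest ridge pixel (0 on a ridge)."""
    # distance_transform_edt measures the distance to the nearest zero
    # element, so ridge pixels must be the zeros of the array passed in.
    return distance_transform_edt(~np.asarray(ridges, dtype=bool))


def tercile_classes(dist):
    """Label pixels 0, 1, 2 for the nearest, middle, and furthest third.

    Assumption: thresholds are the 1/3 and 2/3 quantiles of this map.
    """
    lo, hi = np.quantile(dist, [1 / 3, 2 / 3])
    return np.digitize(dist, [lo, hi], right=True)


def confusion_matrix(true_cls, pred_cls, n=3):
    """Row-normalized confusion matrix (rows: true class, columns: predicted)."""
    cm = np.zeros((n, n))
    np.add.at(cm, (true_cls.ravel(), pred_cls.ravel()), 1)
    return cm / np.maximum(cm.sum(axis=1, keepdims=True), 1)


def compare(target, pred):
    """Return (MSE loss, Pearson correlation, tercile accuracy, confusion matrix)."""
    mse = float(np.mean((pred - target) ** 2))  # assumed form of the loss
    pearson = float(np.corrcoef(target.ravel(), pred.ravel())[0, 1])
    t, p = tercile_classes(target), tercile_classes(pred)
    accuracy = float(np.mean(t == p))
    return mse, pearson, accuracy, confusion_matrix(t, p)


# Toy ridge network: one horizontal and one vertical crease.
ridges = np.zeros((64, 64), dtype=bool)
ridges[20, :] = True
ridges[:, 45] = True
target = distance_target(ridges)

# Stand-in for a network output: the target plus Gaussian noise.
rng = np.random.default_rng(0)
pred = target + rng.normal(0.0, 2.0, size=target.shape)

mse, pearson, accuracy, cm = compare(target, pred)
print(f"MSE={mse:.3f}  Pearson={pearson:.3f}  accuracy={accuracy:.3f}")
print(np.round(cm, 2))
```

In this sketch the upper left entry of the matrix is the fraction of truly near-ridge pixels that the prediction also places in the nearest third, which is the kind of value Fig. 3F tracks as pixels are toggled.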

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/4/eaau6792/DC1

    Section S1. Radon transform–based detection method

    Section S2. In silico generation of flat-folding data

    Section S3. Prediction on 16 sheets

    Section S4. Probing the network: Ongoing work

    Section S5. Another approach to error quantification

    Section S6. Perturbing the in silico data

    Fig. S1. In silico–generated flat-folded crease networks.

    Fig. S2. Comparison between the preprocessed curvature map and the linearized version.

    Fig. S3. Prediction on a sheet that was crumpled 16 times.

    Fig. S4. Additional test results.

    Fig. S5. Prediction accuracy.

    Fig. S6. Examples of perturbed in silico data.

    Reference (39)
