Research Article
EVOLUTIONARY BIOLOGY

Deep learning on butterfly phenotypes tests evolution’s oldest mathematical model

Science Advances  14 Aug 2019:
Vol. 5, no. 8, eaaw4967
DOI: 10.1126/sciadv.aaw4967
  • Fig. 1 Phylogenetic relationships between subspecies of H. erato and H. melpomene.

    (A to C) Neighbor-joining consensus networks of phenotypic distance (eight sampled embedding axes, 100 replicates). (D to F) Tree space visualizations for phylogenies based on phenotype (orange and red, 32 embedding axes; red, hybrids excluded; cyan, 64 embedding axes), color pattern genes (green) (27), neutral genes (yellow), and randomized topologies (blue, 1000 replicates). (A and D) All subspecies. (B and E) Only H. erato. (C and F) Only H. melpomene. (A to C) Node label color indicates species (black, H. erato and gray, H. melpomene). Node colors show mimicry groups (tables S1 and S9). Node numbers show subspecies numbers (table S1).

  • Fig. 2 Principal component visualization of phenotypic variation among Heliconius butterflies.

    Principal component scores calculated for 2468 images of butterfly species H. erato and H. melpomene on the basis of image coordinates in a 64-dimensional phenotypic space, generated using a deep convolutional triplet network. Cumulative variance explained by displayed principal component axes: 1, 28%; 2, 50%; 3, 68%; and 4, 81%. (A) Butterfly subspecies 1 to 38 (Fig. 1 and table S7). (B) Twelve traditionally hypothesized (12) mimicry complexes (tables S1 and S9) of H. erato and H. melpomene subspecies (gray circles indicate nonmimics, not included in any of these mimicry complexes). (C) Six hierarchical clusters. (D) Two broad classes of type pattern for each subspecies (table S8), with orange rays (orange circles) or without rays (black circles). (E and F) Species, H. erato (black circles) and H. melpomene (gray circles).

  • Fig. 3 Average pairwise Euclidean phenotypic distances between subspecies of H. erato and H. melpomene.

    (A) Box plot of mean pairwise phenotypic distances (table S4) within subspecies (identity), between co-mimic subspecies (mimicry), and between all other subspecies (other). Sample sizes are 38, 19, and 684 subspecies pairs, respectively. (B) Species shown separately: H. erato (black labels, identity and other), H. melpomene (gray labels, identity and other), and interspecies co-mimics (mimicry). Sample sizes are 21, 17, 15, 209, and 133 pairs. Boxes show 25 to 75% quartiles; horizontal lines, medians; whiskers, inner fence within 1.5 × box height; circles, outliers within 3 × box height; asterisks, outliers beyond 3 × box height.

  • Fig. 4 Comparative analysis of the extent of phenotypic convergence in mimicry.

    Case study from comparative analyses of 12 subspecies (fig. S7). The locations of two focal co-mimics (H. erato cyrbia, dark blue circles; H. melpomene cythera, dark red circles) in phenotypic space are compared alongside their nearest conspecifics (H. erato venus, light blue circles; H. melpomene vulcanus, light red circles). Subspecies are illustrated by dorsal photographs of the butterfly closest to the mean location for the subspecies. Gray circles indicate images of other subspecies in the dataset. Axes show the squared distance from the mean location of the focal co-mimic, summed across all 64 spatial embedding axes. Distance between subspecies means on the y axis, H. erato venus to H. erato cyrbia = 0.26. Distance between subspecies means on the x axis, H. melpomene vulcanus to H. melpomene cythera = 0.41.

  • Fig. 5 Conceptual diagrams illustrating the evolutionary alternatives of mutual convergence, mutual divergence, and one-sided convergence (“advergence”).

    Dashed arrows indicate the direction of evolutionary change. Left (A, D, and G): Mutual convergence in focal taxa (gray circles) with reciprocal transfer of pattern features (e.g., forewing band shape versus wing color) between two clades (1 and 2, with black and gray outlines, respectively). Middle (B, E, and H): Reversed polarity with mutual divergence from the focal taxa. Right (C, F, and I): Advergence by one clade onto another (13). Asterisks indicate new derived patterns (feature combinations). When expressed in terms of the phenotypic distance from the focal taxa (G to I), mutual convergence (G) is characterized by a decreasing distance along the arrow of evolutionary change in both clades. Mutual divergence (H) is characterized by an increasing distance in both clades. Advergence (13) is characterized by a decreasing distance (and a greater distance traveled) in one clade (I).

  • Table 1 Hierarchical clusters of subspecies from H. erato and H. melpomene calculated from the phenotypic spatial embedding.

    Cluster membership is based on the modal specimen value for each subspecies after exclusion of hybrid specimens (table S2). Subspecies are illustrated by the dorsal photograph closest to the subspecies principal component analysis centroid (corresponding to Fig. 2A). N indicates subspecies number (table S1). Cluster numbers are colored by value to highlight divisions.
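
The distance summaries named in the captions for Fig. 3 (mean pairwise Euclidean distance between groups of images) and Fig. 4 (squared distance from the mean location of a focal subspecies, summed across embedding axes) can be sketched as follows. This is a minimal illustration with NumPy, not the authors' code. The function names, the random toy embeddings, and the choice to skip self-pairs within a group are all assumptions made for the example.

```python
import numpy as np


def mean_pairwise_distance(a, b, same_group=False):
    """Mean Euclidean distance over all pairs of rows (one from a, one from b).

    With same_group=True, a and b are the same set of points and only
    distinct pairs are averaged (an assumption; needs at least two rows).
    """
    diff = a[:, None, :] - b[None, :, :]          # shape (len(a), len(b), dims)
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    if same_group:
        upper = np.triu_indices(a.shape[0], k=1)  # each distinct pair once
        return float(dist[upper].mean())
    return float(dist.mean())


def squared_distance_from_mean(points, focal):
    """Squared distance of each point from the mean location of `focal`,
    summed across all embedding axes."""
    centroid = focal.mean(axis=0)
    return ((points - centroid) ** 2).sum(axis=1)


# Toy data only: random stand-ins for 64-axis embedding coordinates.
rng = np.random.default_rng(0)
subspecies_a = rng.normal(0.0, 0.1, size=(20, 64))
subspecies_b = rng.normal(0.05, 0.1, size=(25, 64))

print("within a:", mean_pairwise_distance(subspecies_a, subspecies_a, same_group=True))
print("a versus b:", mean_pairwise_distance(subspecies_a, subspecies_b))
print("first squared distances from mean of a:",
      squared_distance_from_mean(subspecies_b, subspecies_a)[:3])
```

Real coordinates of the kind listed in table S3 would replace the random arrays. How images are grouped and which pairs are averaged should follow the Supplementary Methods rather than this sketch.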


Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/8/eaaw4967/DC1

    Supplementary Methods

    Fig. S1. Diagram of the architecture of the deep learning network ButterflyNet used in this study.

    Fig. S2. Geographic localities for sampled butterfly specimens from the polymorphic mimicry complex of H. erato and H. melpomene.

    Fig. S3. Heatmap showing mean pairwise phenotypic and geographic distances between 38 subspecies of H. erato (black labels) and H. melpomene (gray labels).

    Fig. S4. Collections of specimen photographs used in this study, grouped by subspecies.

    Fig. S5. Average pairwise Euclidean geographic distances between subspecies of H. erato and H. melpomene.

    Fig. S6. Neighbor-joining trees of phenotypic distance between subspecies of H. erato and H. melpomene.

    Fig. S7. Comparative analyses of the extent of phenotypic convergence in mimicry.

    Fig. S8. Principal component visualization of Heliconius butterflies.

    Table S1. Traditionally hypothesized co-mimic subspecies of H. erato and H. melpomene.

    Table S2. Taxonomic and locality data recorded for historical specimens of H. erato and H. melpomene held in the collections of the NHM London.

    Table S3. Coordinates of butterfly images on 64 axes of a Euclidean phenotypic space constructed using a deep convolutional network with triplet training.

    Table S4. Mean pairwise Euclidean phenotypic distances between subspecies from H. erato and H. melpomene.

    Table S5. Mean pairwise squared Euclidean phenotypic distances between subspecies from H. erato and H. melpomene.

    Table S6. Mean pairwise Euclidean geographic distances between subspecies from H. erato and H. melpomene.

    Table S7. Number of sampled butterfly individuals for each subspecies.

    Table S8. Broad pattern class of the type specimen of each subspecies.

    Table S9. Traditionally hypothesized mimicry complexes of H. erato and H. melpomene subspecies.

    Table S10. Statistical comparisons of pairwise Robinson-Foulds distances between sets of phylogenetic trees.

    Table S11. Statistical comparisons with hybrids excluded of pairwise Robinson-Foulds distances between sets of phylogenetic trees.

    Supplementary Computer Code in ipynb Format

    Supplementary Computer Code in PDF Format

    Reference (41)
