Research Article | RESEARCH METHODS

Chimpanzee face recognition from videos in the wild using deep learning


Science Advances  04 Sep 2019:
Vol. 5, no. 9, eaaw0736
DOI: 10.1126/sciadv.aaw0736
  • Fig. 1 Fully unified pipeline for wild chimpanzee face tracking and recognition from raw video footage.

    The pipeline consists of the following stages: (A) Frames are extracted from raw video. (B) Faces are detected using a deep CNN single-shot detector (SSD) model. (C) Face tracking is implemented using a Kanade-Lucas-Tomasi (KLT) tracker (25), which groups detections into face tracks. (D) Facial identity and sex recognition are achieved by training deep CNN models. (E) The system requires only the raw video as input and produces labeled face tracks, with temporal and spatial information as metadata. (F) The output of the pipeline can then be used to support, for example, social network analysis. (Photo credit: Kyoto University, Primate Research Institute)
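Stage (C) can be illustrated with a minimal sketch of how per-frame detections might be grouped into tracks. Note that this is not the KLT tracker named in the caption: as a simpler stand-in, it links boxes in consecutive frames by bounding-box overlap (intersection over union). The function names, the box format `(x1, y1, x2, y2)`, and the `min_iou` threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames, min_iou=0.5):
    """Greedily group detections into tracks (overlap-based stand-in for KLT).

    frames: list of lists of boxes, one inner list per video frame.
    Returns a list of tracks, each a list of (frame_index, box) pairs.
    """
    tracks = []
    active = []  # indices of tracks that were extended in the previous frame
    for t, boxes in enumerate(frames):
        next_active = []
        unused = set(active)
        for box in boxes:
            best, best_score = None, 0.0
            for idx in sorted(unused):
                score = iou(tracks[idx][-1][1], box)
                if score >= min_iou and score > best_score:
                    best, best_score = idx, score
            if best is None:
                # No sufficiently overlapping track: start a new one.
                tracks.append([(t, box)])
                next_active.append(len(tracks) - 1)
            else:
                tracks[best].append((t, box))
                unused.discard(best)
                next_active.append(best)
        active = next_active
    return tracks


# Hypothetical detections for three frames.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 1, 11, 11), (51, 50, 61, 60)],
    [(2, 2, 12, 12)],
]
tracks = link_detections(frames)
```

Each resulting track keeps the frame index and box of every detection, which corresponds loosely to the temporal and spatial metadata described in stage (E).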

  • Fig. 2 Face recognition results demonstrating the CNN model’s robustness to variations in pose, lighting, scale, and age over time.

    (A) Example of a correctly labeled face track. The first two faces (nonfrontal) were initially labeled incorrectly by the model but were corrected automatically by recognition of the other faces in the track, demonstrating the benefit of our face track aggregation approach. (B) Examples of chimpanzee face detections and recognition results in frames extracted from raw video. Note how the system has achieved invariance to scale and is able to perform identification despite extreme poses and occlusions from vegetation and other individuals. (C) Examples of correctly identified faces for two individuals. The individuals age 12 years from left to right (top row: from 41 to 53 years; bottom row: from 2 to 14 years). Note how the model can recognize extreme profiles, as well as faces with motion blur and lighting variations. (Photo credit: Kyoto University, Primate Research Institute)
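The track-level correction described in (A) can be sketched as follows. This excerpt does not state how the per-face predictions are combined, so the sketch assumes the simplest option: average the per-face identity scores over the track and assign the highest-scoring identity to every face in it. The function name, the score format, and the identity labels `ind_A` and `ind_B` are hypothetical.

```python
def aggregate_track(frame_scores):
    """Assign one identity to a whole face track.

    frame_scores: list of dicts mapping identity -> score, one dict per face
    in the track. Averaging is an assumed aggregation rule.
    Returns (track_label, mean_scores).
    """
    if not frame_scores:
        raise ValueError("a track must contain at least one face")
    totals = {}
    for scores in frame_scores:
        for identity, s in scores.items():
            totals[identity] = totals.get(identity, 0.0) + s
    n = len(frame_scores)
    means = {identity: total / n for identity, total in totals.items()}
    label = max(means, key=means.get)
    return label, means


# Hypothetical track: the first two (nonfrontal) faces favor the wrong identity.
track = [
    {"ind_A": 0.40, "ind_B": 0.60},
    {"ind_A": 0.45, "ind_B": 0.55},
    {"ind_A": 0.90, "ind_B": 0.10},
    {"ind_A": 0.80, "ind_B": 0.20},
    {"ind_A": 0.85, "ind_B": 0.15},
]
per_face = [max(s, key=s.get) for s in track]  # labels before aggregation
track_label, mean_scores = aggregate_track(track)
```

In this made-up example, the first two faces would be labeled `ind_B` on their own, but the track as a whole is labeled `ind_A`, so the early errors are overridden by the more confident later frames.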

  • Fig. 3 Face detection and recognition results.

    (A) Histograms of detection numbers for individuals in the training and test years of the dataset (2000, 2004, 2006, 2008, 2012, and 2013). (B) Model output showing the number of individuals detected in each year and the proportion of individuals in different age categories, based on existing estimates of individual ages.

  • Fig. 4 Social networks of the Bossou community generated from co-occurrence matrices constructed using detections of the face recognition model.

    Each node represents an individual chimpanzee. Node size corresponds to the individual’s degree centrality—the total number of “edges” (connections) they have (the higher the degree centrality, the larger the node). Node colors correspond to subclusters of the community as identified independently in each year using the Louvain community detection algorithm (23). Individuals whose ID codes begin with the same letter belong to the same matriline; IDs in capital letters correspond to males, while IDs with only the first letter capitalized correspond to females (see table S1). Within these clusters, as predicted, mothers and young infants have the strongest co-occurrences, and kin cluster into the same subgroups.
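As a rough illustration of how a co-occurrence network and its degree centrality could be derived from per-frame detections, the sketch below counts how often each pair of individuals appears in the same frame and then counts each individual's edges. The function names, the `min_count` edge threshold, and the ID codes `Xa`, `XB`, and `Yc` are hypothetical placeholders, not Bossou individuals. Community detection with the Louvain algorithm is not shown here; it would normally be done with a graph library.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(frames):
    """Count, for every pair of individuals, the frames in which both appear.

    frames: iterable of lists of ID codes detected in each frame.
    Returns a Counter keyed by sorted (id_a, id_b) pairs.
    """
    counts = Counter()
    for ids in frames:
        for a, b in combinations(sorted(set(ids)), 2):
            counts[(a, b)] += 1
    return counts


def degree(counts, min_count=1):
    """Degree centrality as the number of edges per individual.

    An edge exists when a pair co-occurs in at least min_count frames
    (the threshold is an assumption for this sketch).
    """
    deg = Counter()
    for (a, b), c in counts.items():
        if c >= min_count:
            deg[a] += 1
            deg[b] += 1
    return deg


# Hypothetical per-frame detections.
frames = [["Xa", "XB"], ["Xa", "XB", "Yc"], ["Yc"], ["Xa", "XB"]]
counts = cooccurrence_counts(frames)
node_degree = degree(counts)
```

Raising `min_count` prunes weak edges, which is one way to keep incidental co-occurrences from inflating node sizes in a plot like this one.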

  • Fig. 5 Preliminary results from the face detector model tested on other primate species.

    Top row: P. troglodytes schweinfurthii, Pan paniscus, Gorilla beringei, Pongo pygmaeus, Hylobates muelleri, and Cebus imitator. Bottom row: Papio ursinus (x2), Chlorocebus pygerythrus (x2), Eulemur macaco, and Nycticebus coucang. Image sources: Chimpanzee: www.youtube.com/watch?v=c2u3NKXbGeo; Bonobo: www.youtube.com/watch?v=JF8v_HWvfLc&t=9s; Gorilla: www.youtube.com/watch?v=wDECqJsiGqw&t=28s; Orangutan: www.youtube.com/watch?v=Gj2W5BHu-SI; Gibbon: www.youtube.com/watch?v=C6HucIWKsVc; Capuchin: Lynn Lewis-Bevan (personal data); Baboon: Lucy Baehren (personal data); Vervet monkey: Lucy Baehren (personal data); Loris: www.youtube.com/watch?v=2Syd_BUbl5A&t=2s.

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/9/eaaw0736/DC1

    Fig. S1. Face detector results.

    Fig. S2. Screenshots of the web-based annotation interfaces.

    Fig. S3. Screenshots from the web-based experiment testing human annotator performance at identifying individual chimpanzees in cropped images.

    Fig. S4. Frame-level accuracy of model with variation in chimpanzee face resolution.

    Table S1. Name, ID code, sex, age, and years present for every chimpanzee at Bossou within the dataset analyzed.

    Table S2. Summary statistics of training and testing datasets for recognition model.

    Table S3. Identity and sex recognition results for accuracy on all faces and frontal faces only in the test set.

    Table S4. Confusion matrix for the 13 individuals in the test set.

    Table S5. Metrics of Bossou social networks derived from co-occurrences of detected individuals in video frames.

    Movie S1 (.mp4 format). Video demo of automated identity and sex recognition of wild chimpanzees at Bossou, achieved through our deep learning pipeline.

