Research ArticleHUMAN GENETICS

A GWAS in Latin Americans identifies novel face shape loci, implicating VPS13B and a Denisovan introgressed region in facial variation

See allHide authors and affiliations

Science Advances  05 Feb 2021:
Vol. 7, no. 6, eabc6160
DOI: 10.1126/sciadv.abc6160


To characterize the genetic basis of facial features in Latin Americans, we performed a genome-wide association study (GWAS) of more than 6000 individuals using 59 landmark-based measurements from two-dimensional profile photographs and ~9,000,000 genotyped or imputed single-nucleotide polymorphisms. We detected significant association of 32 traits with at least 1 (and up to 6) of 32 different genomic regions, more than doubling the number of robustly associated face morphology loci reported until now (from 11 to 23). These GWAS hits are strongly enriched in regulatory sequences active specifically during craniofacial development. The associated region in 1p12 includes a tract of archaic adaptive introgression, with a Denisovan haplotype common in Native Americans affecting particularly lip thickness. Among the nine previously unidentified face morphology loci we identified is the VPS13B gene region, and we show that variants in this region also affect midfacial morphology in mice.


Because of its evolutionary, biomedical, and forensic significance, there is an increasing interest in the characterization of the genetic basis of human facial variation. Thus far, nine genome-wide association (GWA) studies (GWASs) have reported ~50 genetic loci associated with facial features in humans (112), although only about a dozen of these loci show significant association signals across independent studies, representing the most robust findings so far. Other than stochastic variation, the limited overlap of findings from independent GWASs could relate to various methodological and biological factors, including a different genetic architecture of facial variation across human populations.

We recently initiated a research program focused on the genetic basis of physical appearance in a sample of ~7000 individuals recruited in Latin America the Consortium for the Analysis of the Diversity and Evolution of Latin America (CANDELA cohort) (13). Our study sample is mostly of mixed European-Native American ancestry and shows extensive phenotypic variation and high genetic diversity (13, 14). This sample has high power for the genetic analysis of physical appearance in a considerably understudied population (4, 1517). In a previous analysis of 14 facial features (characterized mainly using a discrete, categorical scoring approach), we reported genome-wide significant associations with single-nucleotide polymorphisms (SNPs) at six genomic loci (4). These loci harbor candidate genes known to be involved in mammalian craniofacial development/evolution, and four have now been replicated in other study samples, using a range of phenotyping approaches (10, 18).

From an evolutionary perspective, facial projection shows extensive variation across extinct and extant mammals, with aspects of brow-ridge, midface, and mandible projection being defining features of primate and hominid taxa (19). Such changes in facial projection have often been proposed to result from the action of selection, including various environmental adaptations, and possibly (particularly in hominids) social factors (20). In modern humans, the facial profile varies greatly between individuals and populations and has been shown to be highly informative in various perception domains, including gender and age attribution, individual recognition, and attractiveness (21).

Here, we present results stemming from GWA analyses of the facial profile in the CANDELA cohort. We evaluated 59 quantitative traits (i.e., landmark-based measurements) and found significant associations at 32 genomic regions (9 previously unidentified, 4 of which replicate in a European meta-analysis). We provide robust evidence of association for 23 loci that have been implicated by previous GWASs, more than doubling the number of well-validated face morphology loci. We find that associated SNPs are strongly enriched in enhancers active during the late organogenesis phase of human development, including many that are specifically active during craniofacial development. In the region of association in 1p12 (overlapping WARS2/TBX15), we find a Denisovan haplotype affecting face morphology, particularly lip thickness. Among the four previously unidentified face morphology loci that replicate in Europeans is VPS13B (associated with nose shape), and we demonstrate that variants of the homologous region in the mouse (Vps13b) also influence nasal morphology.


Study sample and phenotyping

We examined individuals from the CANDELA study cohort, collected in five Latin American countries (13). Two operators placed 19 landmarks and 22 semi-landmarks (fig. S1), primarily along the midfacial contour, on right profile two-dimensional (2D) photographs of 6169 subjects (3408 women and 2761 men). After Procrustes superposition, we defined 59 measurements (distances, ratios, and angles) based on the landmarks/semi-landmarks. These measurements quantify aspects of the morphology of the upper, middle, and lower face, with similar measurements having been examined in a range of morphological studies of the human face, including genetic, medical, anthropological, and forensic applications (Fig. 1 and table S1). We found these quantitative traits to be reliably assessed (see Methods and table S2), approximately normally distributed and with extensive variation in the CANDELA sample. The individuals studied were genotyped on Illumina’s OmniExpress BeadChip. After quality control filters, we retained 671,038 SNPs for further analyses.

Fig. 1 Face profile features showing genome-wide significant association.

Top: Drawings indicate the features for which the 32 traits listed below were measured in the CANDELA individuals (as defined in table S1). Bottom: Aggregate of the GWAS signals detected (across all traits). The chromosomal region (and nearest candidate gene) showing strongest association to a trait is indicated above each GWAS peak (bold type marks the previously unidentified face morphology regions identified here). Curved lines in the middle of the figure connect the previously unidentified face morphology region to their associated traits.

Other than the high correlation (r > 0.8) seen for similar traits from the same anatomical structure (e.g., nose roundedness and lip thickness), most traits were found to have a low to moderate correlation (table S3). On the basis of genome-wide SNP data, average admixture proportions in the CANDELA individuals examined were estimated as 51% European, 45% Native American, and 4% sub-Saharan African. Most traits showed a low to moderate (but generally significant) correlation with age, body mass index (BMI), sex, and continental genetic ancestry (table S4). The strongest effects for age, BMI, and sex were seen, respectively, for measures of lip thickness (r ~ 0.33), traits affected by neck thickness (r ~ 0.49), and brow-ridge protrusion (r ~ 0.51). The strongest effect of Native American/European ancestry estimates [themselves strongly negatively correlated (13)] was for a measure of nasion position (r ~ 0.38). In addition, several measures of nose/chin protrusion and lip thickness had correlations ~0.20 with Native American/European ancestry. Subcontinental genetic ancestry (North Europe v. Iberia; Central Andes v. Southern South America) also correlated significantly with certain traits, particularly of the midface (table S4), consistent with previous observations (14).

On the basis of a kinship matrix derived from the SNP data, we estimated narrow-sense trait heritability (h2) using LDAK5 (22). We found moderate (but mostly highly significant) heritability values for all traits, ranging from 0.20 to 0.61 (table S3), consistent with independent estimates (2, 18). The highest heritability (>0.60) was observed for measures of brow-ridge, nose, and lip protrusion. Genetic correlations among traits were approximately proportional to phenotypic correlations (except for traits with low h2), suggesting a similar role for genetic and environmental effects across traits (table S3).

GWA analyses

We performed imputation using 1000 Genomes Phase 3 data (23), which allowed us to include up to 8,703,729 autosomal SNPs in the GWA analyses. For association testing, we used multiple linear regression, as implemented in PLINK (24), with an additive genetic model adjusting for age, sex, BMI, landmarking operator, and the first six principal components (PCs) computed from the SNP data. A total of 32 genomic regions (including 2684 SNPs) showed genome-wide significant association (P value < 5 × 10−8) with at least one trait (and up to six) (Figs. 1 and 2, Table 1, and table S5). These regions often also showed association with additional traits at a genome-wide suggestive significance threshold (P value < 10−5; Fig. 2). Conversely, of the 59 traits examined, 32 showed genome-wide significant association with at least one genomic region (and up to five regions), as well as genome-wide suggestive association with regions significantly associated with other traits (Fig. 2). Index SNPs (i.e., the one with the smallest P value in a region) at loci showing strongest association with a trait explain 0.4 to 0.9% of trait variance (median = 0.52%), in line with previous estimates of genetic effects in GWASs of facial variation (1, 15, 18).

Fig. 2 Thirty-two chromosomal regions associated with face profile traits.

The 32 traits showing genome-wide significant association are listed at the top (Fig. 1 and table S1). For ease of visualization, association P values are represented by colored boxes as follows: red, genome-wide significant (<5 × 10−8); orange, genome-wide suggestive (<10−5) (P values are provided in table S5). Chromosomal regions in bold were previously unidentified (Fig. 1 and Table 1). Prior evidence: GWASa, associated with a similar face trait; GWASb, associated with a different facial trait; GWASc, associated with a facial development disorder [e.g., nonsyndromic cleft lip/palate (NSCL/P)]; HDD, human developmental defect reported (e.g., Mendelian disorder); MDD, model developmental defect (related phenotype involving a homologous region reported in animal models).

Table 1 Features of the nine chromosomal regions newly associated with face profile traits in the CANDELA sample.

Candidate genes where the index SNP is intragenic are underlined. Association P values in bold are genome-wide significant (P ≤ 5 × 10−8). Rep., European meta-analysis; AFR, sub-Saharan African; EUR, European (from 1000 genomes); NAM, Native American (14); CAN, CANDELA. “# SNPs” refers to the total number of genome-wide significant SNPs in that chromosomal region (P value ≤ 5 × 10−8).

View this table:

To account for the number of traits and SNPs tested, we also calculated adjusted genome-wide significance thresholds following a global false discovery rate (FDR) procedure (25), at two overall significance levels (α = 1% and α = 5%). The commonly used threshold of 5 × 10−8 was slightly more stringent than the calculated FDR thresholds (see Methods); therefore, the associations identified as significant above remain significant after FDR adjustment for multiple testing.

Integration of face profile GWAS hits with the literature

For 24 of the 32 regions showing significant trait associations, there is substantial literature supporting an involvement in craniofacial morphology (summarized in Fig. 2 and table S5). This evidence is of five broad types, with certain loci being implicated by more than one line of evidence: (i) Some hits represent direct replications of previous GWAS findings. That is, the same (or a similar) facial trait has been associated with the same genomic region in previous studies. The most prominent example is nasion position (measured in various ways), which has been repeatedly associated with SNPs in the PAX3 region on 2q35 (1, 2, 10, 15, 18). Here, we replicate this finding in that we observe an association of this region with measures sensitive to nasion position (table S1). (ii) Some of the genome regions showing association have been detected in previous face morphology GWAS but for traits other than those associated here, suggesting an effect of these regions on various aspects of facial morphology. A prime example is the WARS2/TBX15 gene region in 1p12. We initially reported association of SNPs in this region with rolling of the helix (in the outer ear) (15); this region was subsequently associated with midface morphology and frontal bossing (10, 18), and in the current study, we find it associated with two measures of lip thickness (traits 35 and 36; Fig. 2 and table S1). (iii) Certain genomic regions identified here have been implicated by previous GWASs of craniofacial anomalies, particularly nonsyndromic cleft lip/palate (NSCL/P). For instance, SNPs in 8q24.21, downstream of MYC, have been strongly associated with NSCL/P (26), and here we find this region associated with philtrum length, consistent with genes in the region being involved in the development of this part of the midface. (iv) For some of the associated chromosomal regions highlighted here, there is evidence in the literature of their involvement in complex craniofacial abnormalities, including Mendelian disorders. For instance, the 8q22.2 region associated here with columella size (trait 27) has been involved in Cohen syndrome (OMIM #216550). Last, (v) some of the regions associated here have been previously implicated in craniofacial development in animal models. For instance, the Tabby mouse mutant (15) presents a range of craniofacial abnormalities caused by a duplication involving the Tbx15 gene in the 1p12 homologous region, and here we found this locus to be associated with lower lip thickness and lip thickness ratio. In Supplementary Notes, we provide comments on the 23 genomic regions detected here for which GWAS evidence of involvement in with nonpathological craniofacial variation has been reported previously.

Regulatory annotation enrichment analysis

Index SNPs at the GWAS hits identified here occur primarily in noncoding regions (table S5). We therefore evaluated the potential enrichment of SNPs in these regions within enhancers identified by chromatin 25-state annotations in 150 human tissues and cell types across embryonic, fetal, and adult time points (including multiple stages of human craniofacial development) (27). The index SNPs we identified are not necessarily causal and are often in strong linkage disequilibirium (LD) with other SNPs within a region (e.g., Figs. 3 and 4). We therefore identified proxy variants in LD with the index variant (r2 > 0.8) up to 1 Mb away. We then used the program GREGOR (28) to evaluate the enrichment of these SNPs (relative to an equal number of randomly selected SNPs) in regulatory annotations for each of the tissues and cell types mentioned above. We found a ~2-fold and ~4-fold enrichment of SNPs at face GWAS hits in all craniofacial enhancers and in craniofacial-specific enhancers, respectively (Bonferroni-corrected P values < 10−50) (fig. S2). Craniofacial-specific enhancers have not been detected in any other human tissues or stages of development and represent a small fraction of all genome annotations (i.e., ~8% of all craniofacial enhancers) (27). The observed enrichment of SNPs in the craniofacial-specific enhancers suggests that these regions play a regulatory role specific to facial development rather than being broadly involved in skeletal biology or embryonic development. Contrasting samples from early and late embryonic stages, we observe that samples from the later stages have the highest (~2-fold; Bonferroni-corrected P value < 10−20) enrichment (fig. S2). This supports previous analyses suggesting that gene expression regulation in late embryonic and early fetal development has the greatest impact on nonpathological adult face morphology (18, 27). Below, we comment on specific regulatory elements present in selected genome regions associated with face morphology.

Fig. 3 Evidence of association, Denisovan introgression, and selection in the WARS2-TBX15 region.

(A) Association P values (right y-axis) with lip thickness ratio 2 and the location and frequency (left y-axis) of introgressed haplotypes inferred in the CANDELA data. Filled dots indicate the SNPs retained for introgression analysis. Dots are plotted accordingly to the significance of their association and have been colored to reflect r2 with the index SNP for association with that trait (rs3790553). The location of the index SNPs associated previously with ear morphology (15), chin dimples (7), and forehead shape (10) is indicated. SNPs associated with facial features in the European meta-analysis (18) are outside the region shown here. The green bins show the introgressed region, and their height displays the allelic frequency and their shade refers to the confidence level at which the tract was called (pale, 20%; dark, 90%). The dashed lines denote the bounds of the Denisovan introgression region previously estimated with the same approach used here (green) (31) or using a Conditional Random Field approach (purple) (32). (B) Standardized PBS scores obtained for genotyped SNPs when contrasting Native Americans (NAM) to CHB + CEU (red triangles) or Greenland Inuits (GI) to CHB + CEU (blue triangles). The location of the TBX15 and WARS2 genes is indicated by black arrows along with various annotations inferred from human embryonic tissue (27) (the color key for combined craniofacial annotations can be found in table S6).

Fig. 4 Features of four genomic regions newly associated with face features in the CANDELA sample and replicating in Europeans.

Regional association P values (index SNP is labeled) are shown at the top. Regulatory annotations from human embryonic tissue (27) (combined chromatin states at various Carnegie stages and craniofacial-specific enhancers and superenhancers; the color key for combined craniofacial annotations is in table S6) are provided below each association plot. A heatmap of pairwise LD (r2) in the region is shown at the bottom of each panel. Similar plots for the other previously unidentified regions, detected in the CANDELA sample but not replicating in the European meta-analysis, are shown in Supplementary Notes.

Association in 1p12 involves SNPs in a region with evidence of adaptive introgression from archaic humans

SNPs in 1p12 overlapping the WARS2/TBX15 gene region were associated here with two measures of lip thickness ratio [Figs. 1 and 2; lip thickness ratio referring to the thickness of the upper lip relative to total lip thickness: upper lip thickness/(upper + lower lip thickness); see table S1]. This genomic region has been previously reported to be associated with features of the outer ear (15) and face (7, 10, 18), but this is the first time it is associated with lip thickness. SNPs in 1p12 have also been implicated in GWASs of anthropometric traits and fat distribution (29). The WARS2/TBX15 gene region has been reported to show a strong signal of selection in Greenland Inuit, as evidenced by extreme Population Branch Statistics (PBS) values (30). This observation has been interpreted as resulting from adaptation to a low environmental temperature (30). Furthermore, the selection signal in this region was shown to overlap with a tract of introgression from archaic humans, most likely Denisovans (31), pointing to the possibility of adaptive introgression.

We evaluated evidence for Denisovan introgression in the WARS2/TBX15 gene region in the CANDELA data using a Hidden Markov Model approach (31). We inferred two overlapping introgressed tracts called at confidence levels 20 to 99%, with decreasing size (Fig. 3 and fig. S3). At 99% confidence, the largest and most frequent (34%) introgression tract inferred in the CANDELA data almost exactly matches the 28-kb tract previously inferred in Greenlandic Inuits and various Asian and Native American populations (31). This tract is included within a longer tract of introgression reported previously using an alternative (Conditional Random Field) approach (Fig. 3) (32). We also estimated PBS values in Native American populations related to the ancestral populations contributing to admixture in Latin Americans (14). As previously observed in Greenlandic Inuit (30), we found extreme PBS values in that region, peaking in the tract of Denisovan introgression (Fig. 3). Notably, the SNPs showing significant association with lip thickness in the CANDELA data overlap with the signal of selection and the tract of Denisovan introgression (Fig. 3). The index SNP (rs3790553) is included in the introgressed haplotype inferred at 50% confidence. By contrast, SNPs previously associated with ear morphology (15) and facial features (10, 18) are located over 90 kb away from the tract of Denisovan introgression and have an association r2 with this tract < 0.23 (Fig. 3). Furthermore, the derived allele at rs3790553 is carried by 91% of the introgressed haplotypes, while the ancestral allele of that SNP is carried by 99.7% of the chromosomes in which the introgression tract was not inferred. This level of association was not seen for any of the index SNPs reported in previous GWAS implicating the WARS2/TBX15 region. We examined the effect of the Denisovan haplotype on the profile traits evaluated in the CANDELA sample and found five significant associations (P values < 10−3; fig. S4). Among these, a positive effect is seen on lip thickness ratio and upper lip protrusion, while a negative effect is seen on lower lip thickness and protrusion. There is also a positive effect (albeit not significant) on upper lip thickness. These opposite effects of the Denisovan haplotype on the upper v. lower lips effectively add up in the lip thickness ratio measures, consistent with them showing strongest associations.

Features of loci not significantly associated in previous face morphology GWASs

As shown in Fig. 2, among the 32 genomic regions showing association, 9 have not been genome-wide significant in previous GWASs of nonpathological facial features. We sought replication of the index SNPs from these nine regions in a recent facial morphology GWA meta-analysis that included data for over 10,000 Europeans (18). Traits examined in the meta-analysis consist of pairwise distances between 13 landmarks placed on 3D images. Although we placed in the CANDELA 2D photos the same nine right-profile landmarks used in the European meta-analysis, only 3 of the 59 measurements we obtained correspond to distances included in the European meta-analysis. Considering this limited overlap, we examined association P values in the full set of 36 right-profile distances measured in the European meta-analysis (Table 1). One of our nine index SNPs is not polymorphic in Europeans; therefore, we could not evaluate replication for this SNP. Of the eight index SNPs that are polymorphic in Europeans, four showed replication P values exceeding a Bonferroni-corrected threshold for significance for at least one facial distance (Table 1). Although most of the distances with significant association in the European meta-analysis do not correspond to those showing association in the CANDELA data, they often share landmarks. For example, SNP rs75976055 is associated here with a measure of columella size incorporating the Subnasale landmark (table S1), and the strongest replication P value (5 × 10−5) was observed in the European meta-analysis for the distance between the Subnasale and Labial inferius landmarks.

Figure 4 shows regional association plots for the four previously unidentified facial morphology loci with evidence of replication in the European meta-analysis. In 3q13.32, SNPs in the IGSF11 (immunoglobulin superfamily member 11) region show significant association with nose roundness (and suggestive association with columella inclination). The index SNP in this region (rs7633584) is located within an intron of IGSF11. IGSF11 is a cell adhesion molecule playing a role in osteoclast differentiation and bone resorption (33), but no association with morphological features has been reported previously.

In 7p21.1, an SNP significantly associated with columella inclination is intronic to HDAC9 (Histone deacetylase 9). HDAC9 is an enzyme involved in gene expression regulation through histone acetylation. SNPs in this gene region have been suggestively associated with the vertical position of the sublabial sulcus, relative to the central midface (5). The region of association identified here overlaps with two craniofacial-specific enhancers (Fig. 4). Although HDAC9 does not appear to be expressed in developing mouse craniofacial tissues, many enhancers located within HDAC9 have been shown to form long-range interactions with the neighboring TWIST1 gene during the development of limbs and pharyngeal arches in mice (34). TWIST1 is a transcription factor important in limb and craniofacial development (34), and mutations of TWIST1 have been shown to result in Saethre-Chotzen syndrome, defined by its facial and cranial gestalt (OMIM #101400). These observations suggest that the effect of SNPs showing association with profile traits in the CANDELA individuals could be mediated through regulatory elements within HDAC9 controlling TWIST1 expression during craniofacial development.

SNPs in 7q31.31 associated with jaw protrusion overlap CPED1 (Cadherin-like and PC-esterase domain containing 1), a gene of unknown function. The index SNP in this region (rs6950680) is intronic to CPED1, with a number of regulatory elements active in embryonic craniofacial tissue being present within CEPD1 (Fig. 4). Variants in this region have been associated with height (35) and bone mineral density (36). Of potential relevance, the WINT16 gene is located less than 30 kb downstream of CEPD1. WINT signaling, in concert with other morphogenetic signaling pathways, plays critical roles in embryonic development, including midfacial morphogenesis, and has been implicated in orofacial clefting (37).

Last, in 8q22.2, we find that 597 SNPs overlapping VPS13B (vacuolar protein sorting 13 homolog B) are significantly associated with columella size. Mutations in VPS13B have been reported to cause Cohen syndrome (OMIM #216550), a recessive multisystem disorder with characteristic facial dysmorphism that can include a high nasal bridge, prominent nose, short philtrum, and malar hypoplasia. Below, we describe mouse follow-up analyses we carried out for this gene region.

The five loci with no evidence of replication in the European meta-analysis include genes and regulatory features of potential relevance for craniofacial development. These are discussed in Supplementary Notes.

Mouse follow-up of the VPS13B hit

Since nose development and structure in humans and mice share features and as soft tissues constituting the nose are tightly integrated with the nasal aperture, we conducted geometric morphometric analyses of the facial region of mice skulls to evaluate the potential impact of Vps13b variants. We first carried out geometric morphometric analyses in a recently described mouse Vps13b-KO model of Cohen syndrome (38), compared to matched wild-type (WT) controls (Supplementary Notes). We found that Vps13b genotype has a significant phenotypic effect and explains 11% of the total Procrustes variance in facial and palatal shape (Supplementary Notes and table S3-2). The magnitude of this effect is two times stronger in males (Procrustes distance = 0.030) than in females (Procrustes distance = 0.017). The changes mainly involve the nasal region, resulting in an elongation and widening of the nasal tip and the dorsal aspect of the maxillae (Fig. 5A and movie S1). Concomitant with these changes, there is a narrowed and elevated palate.

Fig. 5 Vps13b and mouse craniofacial shape.

(A) Shape changes in the Vps13b KO mutant male mice relative to WT. Effect is predicted according to the expected marginal means at the populational average centroid size. Darker colors represent strong positive (blue) or negative (brown) deviation from the WT, given in Procrustes distance. Positive distance represents expansion from the WT, whereas negative distance represents contraction from the WT. Mesh warping is amplified by a factor of 2 to facilitate the visualization (see also movie S1). (B) Regional association plot for the VPS13 homologous region in chromosome 15 among outbred mice. Significantly associated SNPs are shown as green dots (FDR < 5%). Ensembl annotations for genes (brown) and enhancers active in embryonic facial prominence (orange) are shown at the top. (C) Phenotypic effect associated with allele dosage at the Vps13b index SNP among outbred mice. Skull zones with an expansion/contraction relative to the mean shape are shown in brown/blue (see also movie S2). To facilitate visualization, shape variation has been magnified by ×20.

In addition to characterizing craniofacial variation in the Vps13b-KO mice, we reexamined data from a GWAS on cranioskeletal variation in outbred mice (39). The published analysis did not report a significant association of univariate craniofacial phenotypes with SNPs in the Vps13b region (39). However, reanalyzing these data using a multivariate mixed model, we found that 22 SNPs in a 1.82-Mb region of chromosome 15 overlapping Vps13b (Fig. 5B) are significantly associated with craniofacial variation (FDR < 5%). SNPs in this region affect the protrusion/shrinkage of the nasal region, in a similar manner as seen for the Vps13b-KO mice (Fig. 5C and movie S2).


Craniofacial form is largely determined during embryonic development through the concerted actions of cranial neural crest cells, mesodermally derived mesenchyme, and ectoderm. The cranial neural crest cell population is the major driver of facial outgrowth and differentiates into the cartilage and bony elements of the face (among other tissues) that principally determine facial structure (40). There has been considerable interest in understanding the role that cranial neural crest genes play in species-specific differences in facial structure, much of which appears to be determined by genetic variation in enhancers that regulate key genes in this cell type (41). Furthermore, neural crest cell–dependent facial form is dictated early in development by signals received from the developing forebrain as well as the overlying ectoderm (42), implying that any a priori assumptions about tissue-specific functions of genes controlling variation in facial form should be avoided. GWASs provide a hypothesis-free approach to pinpointing important genetic variation underlying population-level craniofacial shape differences.

Our previous study on the CANDELA cohort focused on 14 categorical traits scored onto a few discrete categories. Here, we consider a much larger set of phenotypes (a total of 59), and we assessed these quantitatively, based on landmarks and semi-landmarks placed on face profile pictures. Although examined on the same study cohort, the different phenotyping approach used here adds substantial power to the GWA analyses. Statistically, quantitative traits are more informative than categorical ones. Focusing the landmarking on the profile also has several advantages. The landmarks are relatively prominent and easy to assess, and the process is less susceptible to external factors, such as head rotation. In addition to the main landmarks, we also use a set of semi-landmarks to capture in fuller detail variation along the facial contour.

The number of previously reported face GWAS hits replicated here, despite major differences across studies in phenotyping approaches and phenotype definitions, is noticeable: a total of 23, including 7 of the regions that have been repeatedly associated with facial features in previous studies. The fact that we detect significant genetic effects at genome regions previously associated with facial features different from those examined here is consistent with the view that many loci involved in craniofacial development have various effects on facial morphology and that certain facial traits can be affected by overlapping sets of genes (10). Although stochastic effects and differences in facial phenotype characterization across studies could explain the lack of replication of some of the loci associated previously, the genetic architecture of facial morphology in human populations probably also plays a role. Evidence for the impact of genetic structure on facial variation across human populations was observed here in the form of an effect of continental and subcontinental genetic ancestry on profile features, pointing to the existence of trait alleles with differentiated frequencies between continents, as well as between different areas within continents. A prominent example of such continental differentiation in trait allele frequency is the EDAR region associated here (and in the literature) with measures of facial flatness and jaw protrusion (4). The associated EDAR variants are essentially absent in Europe, and therefore, this effect is not detectable in Europeans (4). Similarly, SNPs associated at one of the previously unidentified genome regions identified here are not polymorphic in Europeans, and therefore, we could not test for replication in the European meta-analysis. This variable genetic architecture of facial features across human populations emphasizes the need for further studies in non-Europeans for a fuller characterization of the genetic basis of facial variation in modern humans.

Natural selection appears to have played a major role in the evolution of craniofacial morphology in many mammalian species, particularly in relation to various environmental adaptations, including diet (19). Natural selection has also long been proposed to have played a role on the evolution of the human face (20). However, thus far, there is little evidence for genomic signatures of selection at human face morphology loci (18). From this perspective, it is interesting that we find an overlap of the signal of association of lip thickness ratio at the WARS2/TBX15 region with a signal of selection, also coinciding with a tract of Denisovan introgression (30, 31). There is increasing evidence for adaptive archaic introgression in modern humans, particularly for pigmentation and immune response genes, and for the phenotypic effects of these introgressed tracts being mediated through their regulatory impact on the genome (43). It has thus been suggested that the strong signal of selection seen in the WARS2/TBX15 region could have been driven by adaptation to a cold environment (30). Archaic humans are thought to have been adapted to cold conditions by the time they encountered modern humans migrating from less extreme latitudes, and this could have favored adaptive introgression around WARS2/TBX15 (31). Consistent with the possibility of adaptive pressure to a cold environment, association studies have implicated the WARS2/TBX15 region in body fat distribution, height, and waist-hip ratio (29). However, the SNPs associated with these body features do not overlap with the tract of Denisovan introgression (31). Similarly, SNPs in 1p12 previously associated with ear and face features are located some distance away from the tract of introgression (10, 15, 18) and are not in LD with it (Fig. 3). The Denisovan haplotype has a low frequency in Europeans and a high frequency in East Asians and is nearly fixed in Native Americans (31), and we observe it at a frequency of ~50% in the CANDELA data (in line with the European/Native American admixture of these samples), thus facilitating the detection of any associated phenotypic effects. It is possible that the facial trait association that we see in the CANDELA data for SNPs in 1p12 could relate to an impact of variants in this region on facial fat distribution (perhaps as part of a broader effect on body fat distribution) in East Asians and Native Americans and that this could have been relevant in adaptation to a cold environment. The WARS2/TBX15 region includes a bivalent transcription factor as well as a superenhancer (Fig. 3), hallmarks of critical patterning transcription factors responsible for specifying tissue/cell-type identity (44), in this case active during early human craniofacial development. It will therefore be important to directly examine the role of variants in the WARS2/TBX15 regions (including the Denisovan haplotype) on gene expression regulation and to evaluate the impact of any such changes in gene expression on body fat distribution.

Of the previously unidentified face morphology loci we detected, evidence is most compelling for VPS13B, mutations of which result in Cohen syndrome. Consistent with the effect on columella size that we detect in the CANDELA individuals, patients with Cohen syndrome typically exhibit a beak-shaped, prominent nose (OMIM #216550). In agreement with these phenotypic effects in humans, here we show that mice deficient in Vps13b present a cranioskeletal dysmorphology that includes anteriorly lengthened nasal bones. Other cranioskeletal changes with similarities across humans and mice have been reported for rare mutations at a number of loci, for instance, in Williams syndrome (45). These homologous changes probably reflect genetic effects on related cranial neural crest–derived bony elements of the skull (45). Furthermore, our confirmation that common variants of Vps13b affect midfacial morphology mice establishes a specific role for this gene in craniofacial development, beyond the various phenotypic effects associated with VPS13B mutations and Cohen syndrome. VPS13B is a widely expressed transmembrane protein, and on the basis of its homology to the yeast VPS13 protein, it is thought to function in vesicle-mediated transport and sorting of proteins within the cell. By forming a complex with the small guanosine triphosphatase RAB6 at the Golgi, VPS13B strongly colocalizes with the cis-Golgi matrix protein GM130 and is required to maintain Golgi integrity (46). VPS13B also plays a role in Golgi protein glycosylation (47) and has been identified as a tethering factor functioning in transport from early endosomes to recycling endosomes (47). Facial dysmorphism is a common feature of disorders of glycosylation (48) and likely reflects the importance of glycosylation as a modulator of many extracellular signaling pathways that are highly active during embryogenesis.

The observation of an effect of Vps13b on morphology in outbred mice suggests that variation in this gene could play a role in natural craniofacial variation in Mus as well as Homo. In this context, it seems relevant that the VPS13B region has been highlighted in genome scans for selective sweeps both in dog breeds (49) and in comparisons of Neanderthal and modern human sequences (50). This suggests that certain gene regions affecting facial variation within species could also have played a role in craniofacial differentiation across species. This appears to be the case for the SUPT3H/RUNX2 region. We initially detected an association of SNPs in this gene region with nose bridge breadth (15). Subsequently, an effect for SNPs in SUPT3H/RUNX2 on nose morphology (10) and chin dimples (7) has been reported, and here we find an association of this region with forehead protrusion. The ratio of glutamines to alanines in a functional domain of RUNX2 was shown to correlate with facial length in Carnivora (51), and this glutamine/alanine ratio has recently been shown to also correlate with facial length in anthropoid primates (52). It will therefore be of interest to more broadly examine the extent to which the loci being implicated in human facial variation could play a role in natural variation in other species, as well as in the differentiation of facial features between taxa. Of particular interest for our species is the role that these genes could have played in the differentiation of anatomically modern humans from more archaic forms, which might also provide insights into the evolutionary forces involved.


Study subjects

We used CANDELA consortium (13) data from 6282 volunteers recruited in five Latin American countries (Brazil, 630; Chile, 1794; Colombia, 1419; Mexico, 1265; Peru, 1174). All volunteers provided written informed consent. Ethics approvals were obtained from Universidad Nacional Autónoma de México (México), Universidad de Antioquia (Colombia), Universidad Perúana Cayetano Heredia (Perú), Universidad de Tarapacá (Chile), Universidade Federal do Rio Grande do Sul (Brazil), and University College London (UK).


On each of the 6282 right side face profile 2D digital photographs (13), we manually placed 19 landmarks [commonly used in face morphology studies (53)] and 22 semi-landmarks (fig. S1) using TpsDig (54). Then, we slid the semi-landmarks using TpsUtil and TpsRelw (54) and performed a full Procrustes superimposition using MorphoJ (55). From those 41 Procrustes-adjusted landmarks and semi-landmarks, we obtained 59 facial measurements using MATLAB 9.6.0 (table S1). The measurements include distances, angles, and ratios, on the overall face and on specific regions of the upper, middle, or lower face, including features of the nose, lips, and eyes. The measurement protocol used was devised building on our previous analyses (4) and seeks to include features reported to be variable in human populations (56). While GWAS of facial landmarks commonly analyze all pairwise distances (13, 6, 18), or linear combinations of landmark coordinates such as PCs (10), the measurements obtained here are akin to those commonly used in certain anthropological, medical, or forensic applications (57). Broadly, 35 features were measured according to one or more definitions (up to five for jaw protrusion) and covering either the whole profile (face flatness) or a specific region of the face: forehead (4 measurements), ear (inclination and size), eye (4 measurements), nose (18 measurements), lips and philtrum (15 measurements), jaw (8 measurements), and chin (3 measurements). We assessed the reliability of our phenotyping protocol by calculating concordance correlation coefficient (CCC) for all landmarks, semi-landmarks, and measurements (58). For four pairs of series, the CCC was computed as 2×S12S12+S22+(Y1Y2)2, where S12 is the covariance between the two series of that pair and Si2 and Yi are, respectively, the variance and the average of the ith series of that pair. The pairs of series were INTER (inter-observer: two operators landmarked the same 200 photographs), INTRA 1 (intra-observer 1: operator P.J. landmarked twice the same 50 photographs 2 weeks apart), INTRA 2 (intra-observer 2: operator B.B. landmarked twice the same 100 photographs 2 weeks apart), and ALT (alternative picture: operator B.B. landmarked two different photographs of the same 100 individuals 2 weeks apart). All CCC results are given in table S2.

DNA genotyping and quality control

DNA samples from participants were genotyped on the Illumina HumanOmniExpress chip including 730,525 SNPs. PLINK v1.9 (24) was used to exclude SNPs and individuals with >5% missing data, markers with minor-allele frequency < 1%, monozygous twins or sample duplicates [PLINK Identity-By-Descent (IBD) estimate > 0.95], and those who failed the X or Y chromosome sex concordance check (sex estimated from X chromosome heterozygosity or Y chromosome call rate not matching recorded sex information). After applying these filters, 636,195 autosomal SNPs and 7010 individuals (1853 from Colombia, 693 from Brazil, 1891 from Chile, 1288 from Mexico, and 1285 from Peru) were retained for further analysis. Because of the admixed nature of the study sample, there is an inflation in Hardy-Weinberg P values (4). We therefore did not exclude markers based on Hardy-Weinberg deviation but performed stringent quality controls at software and biological levels (4). For example, we checked for batch effects by genotyping a control sample on each plate and comparing its genotype calls across batches; the consistency rate (i.e., matched proportion of genotypes) was ≥0.999 in all cases after SNP-level QC. We also assessed the accuracy of our genotypes independently by comparing against genotype calls obtained on a subset of samples by high-coverage sequencing, and consistency was again ≥0.999.

SNP genotype imputation

The chip genotype data of the CANDELA sample were phased using SHAPEIT2 (59). IMPUTE2 (60) was then used to impute genotypes at untyped SNPs using variant positions from the 1000 Genomes Phase 3 data, which include haplotype information for 2504 individuals across the world at 77,818,332 autosomal variant positions. Positions that are monomorphic in 1000 Genomes Latin American samples [CLM (Colombian in Medellin), MXL (Mexican in Los Angeles), PEL (Peruvian in Lima), and PUR (Puerto Rican from Puerto Rico)] were excluded, leading to 11,218,392 autosomal biallelic variants being imputed in our dataset. Of these, we removed 64,681 SNPs based on the following criteria that may indicate poor imputation and/or genotyping: (i) imputation quality scores < 0.4, (ii) low concordance value (<0.7) with chip genotypes, or (iii) a large gap between info [a cross-validation metric implemented in IMPUTE2 (60), indicating imputation certainty] and concordance values (info_type0-concord_type0 > 0.1). In the latter two cases, these SNPs were also excluded from the chip dataset. The IMPUTE2 (60) genotype probabilities at each locus were converted into most probable genotypes using PLINK (24) (at the default setting of >0.9 certainty). SNPs with a proportion of samples with uncalled genotypes > 5% and minor-allele frequency < 1% were excluded. The final imputed dataset contained genotypes for 8,780,546 autosomal biallelic SNPs (638,266 genotyped SNPs and 8,142,280 imputed SNPs).

We have estimated the imputation accuracy of the final dataset using the median info score. The overall median info score was 99.2%. It is slightly higher (99.5%) for “European” variants (defined as variants with MAF (Minor Allele Frequency) > 20% in highly European individuals and MAF < 5% in highly Native American individuals) as well as for “Native American” variants (99.3%). For the sake of comparison, the rare variants (MAF < 5% in CANDELA; ~30% of the imputed dataset) have a median info score of 97.4%. We have therefore concluded that imputation was as reliable for Native American variants as for European variants and was high overall for the entire imputed dataset including rare variants as observed from the median values. (As SNPs with MAF < 1% in the AMR reference population or in the CANDELA samples are excluded, we consider a threshold of MAF < 5% for “rare” SNPs.)

In addition, we sought to verify the quality of imputed genotypes by comparing with an independently sequenced dataset. A set of Native American samples collected for a previous study (14) was sequenced at high coverage and variants were filtered. We calculated the concordance for these samples as the proportion of imputed genotypes that match the sequence data exactly. The overall median concordance was high (99.4%) and remained high in subcategories: 98.9% for SNPs common in Native American samples (MAF > 5% in that population) and 99.8% for rare SNPs in Native American samples (MAF < 5%).

Sample filtering

For all traits together, we initially performed the following data quality control on volunteers with available genotype data regardless of the studied trait. We discarded from the dataset any volunteer (i) with missing age information, (ii) with discordant information between recorded age and age computed based on birth date, (iii) older than 45 years, or (iv) younger than 18 years. Also, as the African ancestry is underrepresented in our population, we discarded from the dataset 23 individuals whose African continental ancestry share (computed using Admixture) (61) was higher than either their Native American or European shares.

The filtering of individuals was then performed for each trait independently to maximize the number of samples per trait (a global filtering, e.g., on phenotype outliers, would discard 942 individuals from the dataset). To that end, we first removed from the dataset any individual presenting an outlier phenotype (trait value lower or greater than the trait value average for that sex plus or minus three times the SD).

Then, among the remaining individuals for that trait, we identified highly related pairs and removed one member of each pair. The threshold for highly related pairs was set to an IBD probability estimate equal to 0.1, to exclude third-degree relatives and higher. The threshold, however, needed adjustment because of inflation in IBD estimates due to the admixed nature of the samples. In our previous study (4), we observed that highly Native samples have higher IBD values due to drift and inbreeding in the founder populations: Among unrelated CANDELA samples with >95% European ancestry, median IBD is <0.01, but among samples with >95% Native ancestry, median IBD value is ~0.2. Therefore, the actual IBD threshold was adjusted according to the ethnic composition of the participants, so that it is always 0.1 above the expected IBD level between unrelated samples of that ethnic composition. For example, that threshold was relaxed to 0.30 (0.1 + 0.2) for the specific case of a large (~240 individuals) genetic group mainly composed of Peruvian and Chilean samples and of very high Native American ancestry (average Native American ancestry is equal to 94%).

Once highly related pairs are identified, we recursively removed individuals involved in multiple pairs, starting with those involved in many pairs, and update the count until each individual is included in at most one pair. For these remaining pairs, we removed the sample with greater age deviation from the average sampling age.

Genetic PCs were computed on the resulting set of individuals for each trait. Among these individuals, genetic outliers were identified by computing, for each sample, the distance to the nearest neighbor based on the 10 leading PCs [obtained with PLINK (24)], with the ith PC weighted by λiλ10. Samples whose distance to nearest-neighbor distance was greater than the average distance plus three times the SD were discarded as genetic outliers.

Such sample filtering was applied independently for each of the 59 phenotypes, yielding similar numbers of samples qualified for downstream analyses (from 5686 to 5731; median, 5719). Some of these analyses use the set of all samples that were qualified at least for one trait (5764 samples).

Statistical genetic analyses

Primary GWA was implemented using PLINK v1.9 (24). For each phenotypic trait independently, we performed a multiple linear regression analysis with sex, age, landmarking operator, BMI, and the six leading genetic PCs (computed on the samples qualified for this trait) as covariates. The association analyses were tested on the imputed data. Q-Q plots for all traits showed no sign of inflation, the genomic control factor λ being ~1 in all cases, indicating that our study performs appropriate subject QC, such as exclusion of relatives and outliers, and appropriately controls for population stratification.

To account for multiple testing across all the traits, we estimated the FDR threshold with the Benjamini-Hochberg procedure (25). Applying the FDR procedure on all 59 traits and all tested SNPs (i.e., a total of 507,440,483 tests), the adjusted genome-wide significance threshold was 6.72 × 10−7 at α = 5% and 7.08 × 10−8 at α = 1%. Both P value thresholds were larger than the commonly used GWAS genome-wide significance threshold of 5 × 10−8, indicating that this threshold of 5 × 10−8 is more stringent than the FDR significance thresholds adjusted for all traits, and therefore, its use controls for testing multiple traits. Therefore, we used the commonly threshold of 5 × 10−8 across all traits.

To estimate narrow-sense heritability (h2) and genetic correlations between traits, we have computed a genomic relationship matrix gathering the 5764 samples qualified for at least one trait, on 634,753 genotyped SNPs (MAF > 1%, autosomes only) using LDAK5 (22) with default parameters (option --power set to −0.25). For each of the 59 traits, we further used the submatrix gathering all samples qualified for this specific trait. We have obtained h2 by partitioning the phenotypic variance into additive genetic (σg2) and residual (σe2) variances using REML in LDAK5 (22) on a linear mixed model y ~ sex + age + BMI + rater + g + e, where sex, age, BMI, and rater are fixed effects and g and e are random (respectively additive genetic and the residual) effects. Testing h2 = 0 was done by fitting the same model with GCTA (Genome-wide Complex Trait Analysis software) (62). We have used the same tools on a bivariate model (with the same terms) to estimate the genetic correlation (ρij=σij/σgi2σgj2) for each pair of traits.

Cranioskeletal analysis of Vps13btm1.Ics KO mice

Mice were bred and housed at the animal facility of the University of Burgundy, and animal procedures were approved by the local ethics committee under the reference number 821231010. A total of 51 mice, balanced between genders and genotypes, were euthanized at 18.7 ± 1.2 weeks old. Skulls were cleaned with dermestid beetles and then scanned at 29 μm using a Skyscan 1174 microCT. Brain removal before scanning damaged some specimens, and they were not included further in the study. Prior analyses reveal three additional gross outliers that were removed from the data. A final sample size of 14 females (9 WT and 5 KO) and 23 males (12 WT and 11 KO) was retained for the study.

Mesh surfaces of each skull were obtained in Avizo 2019 using a Marching cube algorithm. A set of 11 paired landmarks and 4 unpaired landmarks on the face and the palate was digitized on these meshes using the R package digit3DLand version 0.1.5. Seven additional curve paired semi-landmarks and five unpaired curve semi-landmarks were also digitized (Supplementary Notes and fig. S3-1). Sliding of these curve semi-landmarks on the 3D surface of the skulls was performed using the R package Morpho 2.6. Before sliding, meshes were decimated to 500,000 faces and smoothed using a surface preserving Laplace algorithm in the R package Rvcg version 0.98. The final set of 44 landmarks and semi-landmarks was superimposed using a full generalized Procrustes analysis with bilateral object symmetry (63) in the Morpho package.

A linear model accounting for centroid size as well as genotype by sex effects was fitted on the symmetric shapes. Mutant mice were generally smaller and lighter than their WT counterparts (Supplementary Notes). Therefore, skull centroid size was included as a covariate in the modeling of shape variation to control for size-related effects. Since sample size was small compared to the number of variables, phenotypic effects were evaluated with Procrustes analysis of variance (ANOVA) (64) computed with the R package geomorph version 3.1.3 and processed with 1000 residual permutations. Shape changes were visualized as shape differences between expected marginal genotype means predicted at the population average centroid size for WT and KO and visualized with a twofold amplification.

Analysis of genetic association in outbred mice

We reanalyzed the data of Pallares et al. (39), which include genotypes for 70,000 SNPs obtained from 692 mice and raw 3D coordinate data for 44 landmarks (17 pairs of symmetric landmarks and 10 landmarks on the median plane). Craniofacial variation was modeled only on the basis of the 67 non-null PCs obtained from the 44 × 3 symmetric tangent coordinates (which include 51 dimensions redundant due to symmetry) (63).

Association between SNP genotype and the set of 67 PCs representing skull shape was evaluated with a multivariate mixed model, incorporating centroid size as covariate. Population structure was modeled from the genomic relatedness matrix (65). The variance components estimated from the model were used to correct both genotypes and phenotypes (66). Association was tested on the basis of Pillai trace statistics obtained from the multivariate regression between the corrected genotypes and PC scores. FDR was computed on the basis of 100 permutations of corrected PC scores following the approach of Nicod et al. (66) and used to identify SNPs exceeding an FDR threshold of 5%.

Regulatory annotation enrichment analysis

We extracted chromatin enhancer annotations (ChromHMM states 13 to 18) identified in 127 non-craniofacial tissues analyzed by the Roadmap Epigenomics Consortium (67), as well as those obtained in 24 primary human embryonic craniofacial tissues, 4.5 to 8 post-conception weeks, and two samples derived from cultured HEPM (human embryonic palatal mesenchyme) cells (27). We also extracted the superenhancer annotations reported for craniofacial tissue using the ROSE (Ranking Of Super Enhancer) approach (27, 44). The SNPs included in the GWAS were filtered for association P values < 5 × 10−8, and their enrichment in enhancer annotations was evaluated using the approach implemented in GREGOR (28). Testing of each associated SNP included a minimum of 500 neighboring SNPs with r2 ≥ 0.8, in windows up to 1 Mb [based on 1000 Genome Project European data (23)].

Selection signal near WARS2-TBX15

To test for signals of selection at the WARS2-TBX15 region in Native Americans, we computed PBS scores (68) using CANDELA individuals with more than 95% Native American ancestry (127 samples). As PBS in an FST-based statistic, we inferred allele frequencies by accounting for non-Native American ancestry. To do so, we used the RFMix software (69) to assign local ancestry at each SNP. Three continental populations consisting of phased haplotypes for 175 Native Americans, 107 IBS (Iberian Population in Spain; 1000 Genomes Project Phase 3), and 101 YRI (Yoruba in Ibadan, Nigeria; 1000 Genomes Project Phase 3) were used as reference samples. We ran RFMix (69) with the phase correction feature enabled and performed two rounds of the EM (expectation maximization) algorithm. We used the default window size of 0.2 cM, the default number of trees (100), and the default model of admixture occurring eight generations ago. Genetic distances were obtained from the HapMap Phase II genetic map. For each SNP, we then computed Native American allele frequency fNat asfNat=fNatCANDELAfEurEurfAfrAfr1EurAfrwhere fNatCANDELA is the allele frequency for CANDELA individuals with >95% Native American ancestry, fEur is the allele frequency for European individuals, and fAfr is the allele frequency of sub-Saharan African individuals. The ancestry proportion for each reference populations was estimated asAnc=i2NP(Anc)i/2Nwhere i is a haplotype at each site and P(Anc) is the posterior probability of assignment of a given ancestry. Using these estimated allele frequencies, we computed PBS scores for Native Americans using CEU (Northern Europeans from Utah, USA; 1000 Genomes Project Phase 3) and CHB (Han Chinese in Beijing, China; 1000 Genomes Project Phase 3) as reference populations. SNPs not polymorphic in at least two populations were excluded.

Detection of Denisovan introgression near WARS2-TBX15 and association with face profile traits

The imputed genotypes of the 7026 CANDELA samples were first phased in a 3-Mb window around the WARS2-TBX15 region (chromosome 1, from 118 to 121 Mb) using SHAPEIT2 (59) with default parameters and no reference panel. We have retained 2383 biallelic SNPs among the 9582 variants in the central megabase of that window, complying with the following rules: (i) those with read depth greater than or equal to 20 in Homo denisovensis (downloaded from, (ii) those with filter PASS in the VCFs of Denisova and 1000 Genomes sequences, (iii) those for which ancestral allele information was consistent throughout VCFs (Variant Call Formats), (iv) those for which ancestral allele is not different from the two reported alleles in modern or archaic samples, (v) those that have the same polymorphism reported in modern and both archaic samples, and (vi) those that were located within the 119 to 120 Mb. This selection contained the leading SNP for our GWAS hit in that region (rs3790553) as well as three other close SNPs that were found to be significantly associated with facial traits in former studies: rs1766786 [associated with chin dimples in Pickrell et al. (7)], rs72691108 [associated with frontal bossing phenotype in Claes et al. (10)], and rs17023457 [associated with two ear phenotypes—folding of antihelix and antitragus size—in Adhikari et al. (15)]. We ran the same hidden Markov model as in a previous study of Greenlandic Inuit (admixtureHMM) (31) on the 14,052 CANDELA chromosomes after coding them according to the human-chimp ancestral allele (0/1 for ancestral/derived allele, following the authors’ procedure). The purpose of admixtureHMM is to detect segments of the test chromosomes that are more likely to originate from archaic than modern humans. We used the allele frequencies of the Denisovan sample and, as proxy for modern humans, those of the 108 YRI samples from the 1000 Genomes Phase 3. We ran the admixtureHMM with default parameters, except for confidence level for which we have tested four values (20, 50, 90, and 99%). The putative introgression tracts that were called at each of these thresholds and with a minor allele frequency ≥ 1% were further considered for association analyses. We have then tested for association between the genotypes (0, 1, or 2—the number of copies of an introgression tract) and the 59 traits using the same linear model as for GWA tests (see the “Statistical genetic analyses” section), hence accounting for sex, age, BMI, landmarking operator, and six genetic PCs.


For HaploReg, see For NCBI, see For UCSC (University of California Santa Cruz Genome Browser), see For Ensembl, see For GTEx, see For GWAS catalog, see For Denisovan sample, see

For the R package FastMan used to draw the Manhattan plot in Fig. 1, see For the R package digit3DLand version 0.1.5, see For the R package Morpho 2.6, see For the R package Rvcg version 0.98, see For the R package geomorph version 3.1.3, see


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank the volunteers for their enthusiastic support for this research. We also thank A. Alvarado, M. Ballesteros Romero, R. Cebrecos, M. Á. Contreras Sieck, F. de Ávila Becerril, J. De la Piedra, M. Teresa Del Solar, W. Flores, M. Granados Riveros, R. Paim, R. Gunski, J. Felisberto Menezes Cavalheiro, E. Correa de Souza Jr., W. Hart, I. Jafet Moreno, P. León-Mimila, F. Quispealaya, D. Rogel Diaz, R. Rojas, and V. Sarabia for assistance with volunteer recruitment, sample processing, and data entry. We are very grateful to the institutions that allowed the use of their facilities for the assessment of volunteers, including Escuela Nacional de Antropología e Historia and Universidad Nacional Autónoma de México (México), Universidade Federal do Rio Grande do Sul (Brazil), 13° Companhia de Comunicações Mecanizada do Exército Brasileiro (Brazil), Pontificia Universidad Católica del Perú, and Universidad de Lima and Universidad Nacional Mayor de San Marcos (Perú). We also acknowledge the Centre de Calcul Intensif d’Aix-Marseille for granting access to high-performance computing resources and the Gismo analytical platform of the University of Burgundy for scanning the mouse skulls. We thank M. Fumagalli and Q. Li for comments on the manuscript and D. Speed for advice on heritability calculations. We acknowledge M. A. Ikram, T. E. C. Nijsten, M. A. de Jong, S. Boehringer, M. K. Lee, E. Feingold, M. L. Marazita, L. Paternoster, H. Thompson, G. C. Sharp, S. Lewis, S. Richmond, A. Zhurov, and L. Pallares for facilitating access to published datasets. We are grateful to E. Bellini for the face drawing incorporated in Fig. 1. We thank L. Pallares and A. Palmer for sharing the GWA mouse data. Funding: Work leading to this publication was funded by grants from the National Natural Science Foundation of China (#31771393), the Scientific and Technology Committee of Shanghai Municipality (18490750300), the Ministry of Science and Technology of China (2020YFE0201600), the Shanghai Municipal Science and Technology Major Project (2017SHZDZX01) and the 111 Project (B13016), the Leverhulme Trust (F/07 134/DF), BBSRC (BB/I021213/1), the Excellence Initiative of Aix-Marseille University–A*MIDEX (a French “Investissements d’Avenir” program), Universidad de Antioquia (CODI sostenibilidad de grupos 2013-2014 and MASO 2013-2014), Conselho Nacional de Desenvolvimento Científico e Tecnológico, Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (Apoio a Núcleos de Excelência Program), Fundação de Aperfeiçoamento de Pessoal de Nível Superior, the National Institute of Dental and Craniofacial Research (R01-DE027023, U01-DE020078, R01-DE016148, and X01-HG007821), and a Santander Research and Scholarship Award. B.B. is supported by a doctoral scholarship from Ecole Doctorale 251 Aix-Marseille Université. Author contributions: B.B., C.C.S.d.C., C.G., C.J., C.M., E.W., F.R., G.B., G.P., H.V.-R., J.C., J.G.-V., L.P., L.S.-F., M.-C.B., M.D., M.F.-G., M.H., P.E.-M., P.J., R.B., R.G.-J., S.C.-Q., T.H., V.A.-A., V.G., V.V., V.R., and W.A. contributed to volunteer recruitment or data collection. B.B., C.V.-G., J.M.-R., K.A., N.N., P.F., S.P., and Z.X. performed analyses. B.Y., D.B., K.A., L.D., M.K., N.N., R.G.-J., and T.C. provided guidance on aspects of study design. A.R.-L., C.C., C.T.-R., D.B., L.F., F.L., S.M.W., J. R.S., E.S., L.J.H., P.G.H., T.D.S., and M.K. obtained funding or granted access to resources. A.R.-L. and D.B. designed the project. A.R.-L., B.B., K.A., J.C., N.N. and P.F. wrote the paper with input from coauthors. A.R.-L. coordinated the study. Competing interests: The authors declare that they have no competing interests. Data and materials availability: Raw genotype or phenotype data cannot be made available due to restrictions imposed by the ethics approval. Summary statistics obtained here have been deposited at GWAS central and can be accessed with the link: (available from the Spring 2021 release).

Stay Connected to Science Advances

Navigate This Article