Research Article | GENETICS

Genome-wide association study in almost 195,000 individuals identifies 50 previously unidentified genetic loci for eye color


Science Advances  10 Mar 2021:
Vol. 7, no. 11, eabd1239
DOI: 10.1126/sciadv.abd1239

Abstract

Human eye color is highly heritable, but its genetic architecture is not yet fully understood. We report the results of the largest genome-wide association study for eye color to date, involving up to 192,986 European participants from 10 populations. We identify 124 independent associations arising from 61 discrete genomic regions, including 50 previously unidentified. We find evidence for genes involved in melanin pigmentation, but we also find associations with genes involved in iris morphology and structure. Further analyses in 1636 Asian participants from two populations suggest that iris pigmentation variation in Asians is genetically similar to Europeans, albeit with smaller effect sizes. Our findings collectively explain 53.2% (95% confidence interval, 45.4 to 61.0%) of eye color variation using common single-nucleotide polymorphisms. Overall, our study outcomes demonstrate that the genetic complexity of human eye color considerably exceeds previous knowledge and expectations, highlighting eye color as a genetically highly complex human trait.

INTRODUCTION

Eye color is primarily determined by melanin abundance within the iris pigment epithelium, which is greater in brown than in blue eyes (1), and both the density and distribution of stromal melanocyte cells (2). Ratios of the two forms of melanin, eumelanin and pheomelanin, within the iris as well as light absorption and scattering by extracellular components (Tyndall scattering) are additional factors that give irises their color (3). Absolute melanin quantity and the eumelanin:pheomelanin ratio are higher in brown irises (4), while blue or green irises have very little of both pigments and relatively more pheomelanin.

European populations, or those with partial European origin, display the largest diversity of iris color, varying from the lightest blue to darkest brown. The prevalence of blue eyes correlates with geographic latitude across Europe and neighboring areas (5), likely as a result of human migration, sexual selection, and possibly natural selection (6–8). Similarly, eye color variation with varying degrees of brown irises is seen in Asian populations (9), although with a much reduced range compared to brown eye color variation in Europeans.

Iris color is highly heritable (10). Previous genome-wide association studies (GWASs) have identified various single-nucleotide polymorphisms (SNPs) in and around 10 genes significantly associated with eye color (11–14), highlighting the polygenic nature of a trait that, in the past, was assumed to be genetically simple (15). The strongest genetic influence on eye color is exerted by the neighboring HERC2 and OCA2 genes (13, 14, 16, 17), where an intronic SNP in HERC2 was shown to act as a long-distance enhancer that interacts with the OCA2 promoter, functioning as a molecular switch between light and dark pigmentation (18). Previously available genetic knowledge allows accurate prediction of blue and brown eye color, for instance, with a DNA test system based on six SNPs from six genes, including HERC2 and OCA2 (19), that has been used in anthropological (20) and forensic (21, 22) applications. However, nonblue and nonbrown eye color can be genetically predicted with considerably lower accuracy (19, 21, 22), likely because of unknown predictive SNPs and responsible genes. Moreover, the phenotypic variance of eye color not previously explained by GWASs ranged between 26% for a blue versus a brown scale (10) and 50% for a scale using three categories (11) across studies. This illustrates the likely presence of yet unknown genes responsible for the noted missing genetic prediction accuracy (19, 21, 22) and missing heritability of human eye color (23).

To overcome these limitations and better understand the genetics of human eye color, we carried out the largest eye color GWAS to date. Our study involved 157,485 individuals of European ancestry in the discovery stage and an additional 35,501 ancestral European individuals in the replication stage, as well as 1636 Asians (of Han Chinese and Indian ancestry) additionally used for replication purposes, in total 194,622 individuals of different ancestries.

RESULTS

Linear regression results were obtained for 11,532,091 SNPs from 157,485 individuals of European ancestry in the discovery dataset from the personal genetics company 23andMe Inc. Devlin’s genomic inflation factor (24) was λ = 1.13, while a linkage disequilibrium (LD) score regression intercept of 1.095 with an [(intercept − 1)/(mean(χ²) − 1)] ratio of 0.19 was obtained, which is consistent with expectations based on the large sample size and polygenic architecture (25). In total, 12,192 SNPs were associated at genome-wide significance (P < 5 × 10−8) with eye color. These SNPs clustered in 52 distinct genomic regions, 50 across the autosomes and 2 on chromosome X (Table 1 and Fig. 1). While confirming the 11 genomic regions previously associated with human eye color (11, 26, 27), the remaining 41 genomic regions identified represent novel discoveries.

Table 1 Strongest associated SNPs from the 52 genomic regions independently associated with eye color in the European discovery cohort (N = 157,485).

A full list of 115 independently associated SNPs from conditional analysis at these loci can be found in table ST1. SNPs with novel associations for eye color identified here are highlighted in bold. SNPs that were associated with other pigmentation traits, but not iris color, in previous studies are indicated by an asterisk (*). Chr is the chromosome for the given SNP, genome position (Pos) refers to HG build 37, RS is the rsid for the given SNP, Freq is the allele frequency of the stated reference allele, Beta is the effect size for the stated reference allele, SE is the respective standard error for the beta, Alt allele is the alternative allele for the given SNP, and N is the sample size tested for the respective SNP.

Fig. 1 Manhattan plot of eye color GWAS results in the European discovery cohort (N = 157,485).

All results with P values <5 × 10−8 are indicated in red.

As expected, many of the strongest associations were observed for SNPs within HERC2 (rs1129038, P < 10−330), the strongest eye color–associated region known previously, and genes involved in five of the seven known types of oculocutaneous albinism (OCA): OCA1 (TYR, rs1126809, P = 1.8 × 10−255), OCA2 (OCA2, rs1800407, P < 10−330), OCA3 (TYRP1, rs13297008; P = 5.0 × 10−250), OCA4 (SLC45A2, rs16891982; P = 2.0 × 10−255), and OCA6 (SLC24A5, rs1426654; P = 2.7 × 10−26). HERC2, TYR, OCA2, TYRP1, and SLC45A2 have previously been associated with eye color (13, 14, 16, 17, 19). For SLC24A5, previous studies found association with hair and skin pigmentation (27) and with eye color in a South Asian population (28), but not with eye color in Europeans, as we show here. No significant eye color associations were found for either the OCA5 (the 4q24 region) or the OCA7 (C10ORF11) locus, the other two loci involved in OCA, in our large discovery sample; however, C10ORF11 has recently been associated with human eyebrow color (29). The remaining five previously reported loci (11) were also strongly associated with eye color in the present study: LYST (rs2385028, P = 2.4 × 10−12), IRF4 (rs12203592, P = 1.6 × 10−321), SLC24A4 (rs17184180, P = 1.3 × 10−284), TSPAN10 (previously reported as NPLOC4, rs6420484; P = 7.5 × 10−90), and TTC3 (rs2835660, P = 1.6 × 10−17).

Among the 41 novel genetic loci identified in our discovery analysis, we detected significant eye color association for SNPs within TPCN2 (rs72928978, P = 6.42 × 10−32) and MITF (rs75114713, P = 8.11 × 10−24). These two genetic loci and five others (near the DTL, AP3M2, SOX5, DCT, and SIK1 genes; Table 1) were associated with hair and skin pigmentation in previous GWASs (highlighted with an asterisk in Table 1) (27, 30–32), while their eye color association is reported here. A comparison of our results with those from the GWAS Catalog (33) revealed that 34 (83%) of the 41 novel eye color–associated loci are unique to eye color and not shared with other pigmentation traits (Table 1); this figure is 36 (69%) when considering all 52 eye color–associated loci identified here, including the 11 previously known loci. The unique novel eye color loci include SNPs in TRAF3IP1 (rs74409360, P = 8.1 × 10−20) and SEMA3A (rs6944702, P = 1.7 × 10−10). TRAF3IP1 has been previously associated with iris furrows, while SEMA3A was associated with iris crypt variation (3). These findings suggest that eye color, at least as captured by the phenotyping used here, may be mediated in part through structural effects within the iris. Significant association was also observed for SNPs within HIVEP3 (rs6696511, P = 1.3 × 10−45), which was previously associated with refractive error and myopia (34). Novel eye color associations within PPARGC1A (rs4521336, P = 1.3 × 10−9) and within MAP2K6 (rs3809761, P = 5.1 × 10−9) likely arise from their participation in pigmentation pathways. PGC1-α, the protein product of the PPARGC1A gene, activates the MITF promoter (35), acting as a regulator in the pigment pathway of the tanning response, while overexpression of MAP2K6 increases melanocyte dendricity (36), allowing greater transportation of melanosomes. We also report here eye color associations for genetic loci located on the X chromosome. DNA variants clustering around the SSX1 (rs78542430, P = 5.93 × 10−53) and TMEM255A (rs5957354, P = 5.43 × 10−23) genes were significantly associated with eye color. Little is published about the TMEM255A and SSX1 genes and their potential functional roles relating to the eye, and further studies are required to understand the role of these genes in iris pigmentation. Notably, a recently published large GWAS on hair color in Europeans was the first to discover X-chromosomal loci involved in human pigmentation (31), albeit at a different gene (COL4A6, 12 Mbp upstream from the nearest associated locus in our study) from those we found to be associated with eye color here. SNPs in or near the MC1R gene, known to be involved in light skin and red hair color, did not show eye color association in our study.

Next, we conducted a conditional analysis to identify SNPs associated with eye color after adjusting for the effect of the main association signal at a respective locus. This analysis highlighted 115 conditionally associated SNPs distributed across the novel and previously known 52 genomic regions (table ST1; a secondary list of 489 high-quality control (QC) scoring, significantly associated SNPs is provided in table ST2) that improve the phenotypic variability explained by genetic factors compared to the lead SNPs. Of the 115 independently associated SNPs, 9 were missense mutations (table ST3) (37), which may suggest their causal role.

Using additional independent data from 35,501 individuals of European ancestry enrolled in nine different studies collected by the International Visible Trait Genetics (VisiGen) Consortium and its study partners, we sought to replicate the findings from the discovery analysis. For this, we tested the strongest associated SNP for each of the 52 regions (lead SNPs) identified in the discovery GWAS. To minimize data fragmentation and therefore maximize statistical power, results of regression analyses from each of the nine VisiGen studies were meta-analyzed together, and the outcome of this meta-analysis was used to replicate the discovery stage findings. Because of the differences in sample size, genotyping platforms, and imputation methods across the VisiGen studies, several SNPs highlighted by the discovery GWAS were not available or did not pass quality control in the replication dataset, particularly variants of low minor allele frequency (MAF). Some VisiGen studies had no availability of X chromosome data. Therefore, replication was attempted for 48 lead SNPs from 50 autosomal regions highlighted in the discovery analysis (table ST4). Overall, for 47 (98%) of the 48 lead SNPs, the direction of effect was the same in the replication analysis as previously seen in the discovery analysis (Fig. 2 and table ST4). For 27 (56%) lead SNPs, we obtained significant replication after applying a strict Bonferroni adjustment for multiple testing (0.05/48 = 1.04 × 10−3), including 18 (47%) of the 38 novel lead SNPs. An additional nine (24%) of the novel lead SNPs were associated at nominal (P < 0.05) significance. There was minimal heterogeneity between cohorts for these SNPs; a histogram of the heterogeneity I-scores is provided in fig. S1.
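The replication criteria described above (Bonferroni-adjusted significance, nominal significance, and concordance of effect direction) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline; the function names and the category labels are assumptions introduced here.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def replication_status(p_value: float, n_tests: int, alpha: float = 0.05) -> str:
    """Classify one lead SNP by its replication P value (hypothetical labels)."""
    if p_value < bonferroni_threshold(alpha, n_tests):
        return "bonferroni"       # significant after multiple-testing correction
    if p_value < alpha:
        return "nominal"          # significant only at the uncorrected level
    return "not significant"


def same_direction(beta_discovery: float, beta_replication: float) -> bool:
    """True when the discovery and replication effects have the same sign."""
    return beta_discovery * beta_replication > 0


# 48 lead SNPs were tested in the replication stage.
threshold = bonferroni_threshold(0.05, 48)
print(f"{threshold:.2e}")  # 1.04e-03
```

The three labels correspond to the thresholds used for the color coding of Fig. 2.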

Fig. 2 Comparison of SNP beta and Z scores between the European discovery (23andMe, N = 157,485) and the European replication (VisiGen, N = 35,501) cohorts.

Color codes represent significance in the replication cohort: red = P < 1.04 × 10−3; green = P < 0.05; blue = P > 0.05.

Results from both the discovery and replication studies were then meta-analyzed including a total of 192,986 Europeans from 10 populations. Devlin’s genomic inflation factor was relatively unchanged in this analysis (λ = 1.14). With the added statistical power, we found genome-wide significant associations for SNPs in an additional nine separate genomic regions (table ST5). All nine genetic loci were novel, not previously associated with human eye color. Three of these, PDE4D (rs62370541, P = 5.8 × 10−9), JAZF1 (rs849142, P = 3.0 × 10−9), and SOX6 (rs2351061, P = 4.0 × 10−8), have recently been associated with hair color (31). PDE4D inhibitors have been shown to increase melanin pigment in the skin of mouse models (38), as PDE4D is a target of the melanocyte-stimulating hormone/cyclic adenosine monophosphate (AMP)/melanocyte-inducing transcription factor (MITF) pathway (38).

Next, we compared results obtained from European subjects with association observations from a meta-analysis of 1636 Asian individuals (959 Han Chinese and 677 Indians from Singapore), for which quantitative measurements of the iris color were available (see the Supplementary Methods for information about quantitative phenotyping). Reliable SNP data were available for 44 of the lead autosomal SNPs from the 52 genetic loci identified by conditional analysis in the European discovery cohort. The remaining eight lead SNPs had a MAF smaller than 1% in this cohort and thus were excluded on account of the much smaller sample size. Thirty-one SNPs (70%) had the same direction of effect as the European analysis, and 5 (11%) of these 44 SNPs were significantly associated with eye color in Asians, after adjustment for multiple testing (Bonferroni-adjusted P value: 0.05/44 = 1.14 × 10−3). This included SNPs from two of the newly identified genes GPR157 (rs6693258, P = 3.6 × 10−5) and SIK1 (rs622330, P = 1.7 × 10−4) (table ST6), as well as from the previously known HERC2 gene (rs1129038, P = 2.2 × 10−4), which is the most strongly eye color–associated SNP in Europeans (17). For this marker, we observed considerable MAF differences between Europeans and Asians (T allele: 0.05 in Asians and 0.73 in Europeans). The strongest association in the Singaporean cohorts, however, was found within SLC24A5 (rs1426654, P = 1.9 × 10−7) representing the OCA type 6 (OCA6) locus. Notably, OCA6 was first reported in a Chinese family (39). However, the HERC2 and SLC24A5 SNPs were the only two variants to display heterogeneity between the two Singaporean cohorts (table ST6), with association primarily being driven by the cohort of South Asian ancestry. This is likely due to South Asian and European populations being more closely related than East Asian and European populations. The MAFs for rs1426654 were considerably different between Asians (0.80 overall, 0.99 in East Asians, and 0.52 in South Asians) and Europeans (0.10) for the G allele, explaining why this SNP is often used as a DNA marker for biogeographic ancestry (40). Previously, SLC24A5 rs1426654 not only explained a considerable proportion of the variation in skin pigmentation between different continental populations (41) but also was associated with skin and eye color variation within a South Asian population (28, 42). Despite such drastic differences in allele frequency for some SNP alleles, the shared eye color effect between populations from two different continents demonstrates the value of our multiethnic study.

DNA variants identified via GWAS are markers of statistical association and not necessarily causative. Because of the differences in LD across continental populations, association may not be universally replicable. Therefore, we next assessed the presence of association not just for the lead SNPs in each region but also by testing other SNPs located within the 52 genomic regions identified in the European discovery analysis. Several SNPs within these regions (table ST7) showed evidence for eye color association in both European and Asian populations, with genome-wide significance in Europeans and suggestive levels of genome-wide association (P < 1 × 10−5) in Asians, despite a much smaller sample size. These SNPs also showed no significant heterogeneity between the two Asian cohorts. It is therefore possible that, against the background of much stronger effects of European-only alleles or due to population-specific differences in MAF, some polymorphisms contributing to European eye color variation are also relevant for eye color variability in non-European populations, such as variation of brown eye color in Asians tested here.

In Europeans, the 112 autosomal SNPs identified through conditional analysis (all autosomal SNPs shown in table ST1) explained 99.96% (SE = 6.5%, P = 4.8 × 10−279) of the liability scale for blue eyes (against brown eyes) and 38.5% (SE = 5.7%, P = 2.2 × 10−130) for intermediate eyes in the TwinsUK cohort, which was one of the VisiGen cohorts used for replication. Using the same linear scale as the GWAS analysis, these autosomal SNPs explained 53.2% (SE = 4.0%, P = 1.2 × 10−322) of the total phenotypic variation in eye color in TwinsUK.
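As a quick arithmetic check, the 95% confidence interval quoted in the Abstract (45.4 to 61.0%) is what a normal-approximation (Wald) interval gives for an estimate of 53.2% with SE = 4.0%. The sketch below assumes that construction; the text does not state how the interval was computed, and the function name is introduced here for illustration.

```python
from statistics import NormalDist


def wald_interval(estimate: float, se: float, level: float = 0.95):
    """Normal-approximation confidence interval: estimate +/- z * SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return estimate - z * se, estimate + z * se


low, high = wald_interval(53.2, 4.0)
print(f"{low:.1f} to {high:.1f}")  # 45.4 to 61.0
```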

Last, we performed in silico analyses to explore the putative function of the genetic loci our study highlighted with significant eye color association using the conditional SNPs. Gene set enrichment analysis identified multiple pathways with significant enrichment (table ST8). As expected, this included several pigmentation process pathways, with “Developmental Pigmentation” the most significant (P = 7 × 10−6), followed by “Frizzled Binding” (P = 2.0 × 10−4) and “Melanin Metabolic Process” (P = 2.6 × 10−4). We also examined the potential effects of the identified eye color–associated SNPs on gene expression using data from the GTEx Consortium (43). Despite the lack of iris tissue in the GTEx repository, many SNPs showed significant eQTL (expression quantitative trait loci) effects in multiple cell types and tissues (44), as seen for the associated SNPs across 38 (79%) of the 48 tissues in the GTEx dataset (table ST9). Most of the strongest effects were seen in nerve and sun-exposed skin tissue (P = 8.47 × 10−62 and P = 4.84 × 10−49, respectively) for rs2835660, where the C allele is significantly associated with a decrease in TTC3 expression. TTC3 is in proximity to DSCR9, a gene whose polymorphisms were previously associated with eye color (11). These results implicate TTC3 as a more likely candidate gene influencing eye color at this genetic locus than DSCR9 and suggest that its effect on eye color is likely mediated through variation in gene expression. The lack of iris tissue information in GTEx likely explains the absence of stronger eQTL effects, such as for SNPs with regulatory effects over gene transcription (16).

DISCUSSION

We report the results of the largest GWAS for human eye color to date. In addition to confirming the association of SNPs in 11 previously known eye color genes (11, 13, 14, 17, 28), the identification of 50 novel eye color–associated genetic loci helps explain previously missing heritability of eye color variability in European populations. Moreover, because of the multiethnic design of our study, we demonstrate that several of the genetic loci discovered in Europeans also have an effect on eye color in Asians.

Eight of the genes in or near the loci newly associated with eye color in our study were previously reported for genetic associations with other pigmentation traits, such as hair and skin color, for instance, TPCN2, MITF, and DCT (27, 30, 32, 45). The commonality of associated DNA variants across the three pigmentation traits helps explain why the different pigmentation traits frequently (but not completely) intercorrelate in European populations. While many significant genetic associations are shared between iris color and other pigmentation traits, there are also notable differences. Although DNA variants within the MC1R gene are strongly associated with light skin and red hair color (27), no detectable association with eye color was found in our large GWAS, in line with previous, smaller GWASs of more limited statistical power (11, 12, 14). Similarly, other DNA variants strongly associated with skin and hair color within genes, such as SILV, ASIP, and POMC (30), showed no statistically significant effect on eye color in this study or in previous studies. Moreover, we also identified 34 genetic loci that were significantly associated with eye color, but for which there is no report of significant association with hair and/or skin color. This is remarkable as the statistical power of the recent GWASs on hair color (31, 46) and sun sensitivity (32) was similar to that of our current eye color GWAS. Significant associations for SNPs in or near genes involved in iris structure, such as TRAF3IP1 and SEMA3A, suggest that they exert their effects through changes in Tyndall scattering, rather than through alterations of melanin metabolism. Overall, our results demonstrate that although many genes overlap between eye, hair, and skin color, the different human pigmentation traits are not completely determined by the same genes.

The major strengths of our study compared with previous eye color GWASs arise from the larger sample size, which translated into increased statistical power and also the ability to lower the threshold of MAF for which sufficient power to detect association is available. Rare SNPs are often a source of considerable phenotypic variation (47). For instance, seven (6%) of the independently associated SNPs identified by conditional analysis in the discovery cohort had a MAF between 0.1 and 1%. Despite their low frequency, however, five (71%) of these rare SNPs were in the same region as other, more common conditional SNPs that did replicate. The remaining two loci (DAB2 and an intronic region on chromosome 4) that were not formally replicated should therefore be considered only as strong candidates with respect to their association with eye color, pending independent validation in future studies.

Another strength of this work is the inclusion of European and non-European populations. Non-European populations are underrepresented in the GWAS literature in general, including in pigmentation GWASs, but their study is important for the understanding of the genetic basis of human phenotypes (48). Although eye color variation is typically attributed to individuals of (at least partial) European descent, or those originating from areas near Europe, more subtle variation in brown eyes is also observed in Asian populations without European admixture (9). Our results from the Asian cohorts showed remarkable consistency in the genetic architecture of eye color among individuals of different continental ancestries, with Asian replication for the two major European genes OCA2 and HERC2. Moreover, our findings also suggest that while a single regulatory variant in HERC2 is responsible for most blue/brown variation in Europeans (16), many additional DNA variants across both OCA2 and HERC2 seem to have independent effects. This hypothesis is further supported by our conditional analysis in the European discovery cohort, identifying independent associations spanning ~14 Mbp across both genes rather than a concentrated cluster centered at HERC2 rs1129038. This is remarkable given the large eye color variation from the lightest blue to the darkest brown in Europeans, compared with the more limited variation within brown eye color in Asians.

In conclusion, our work has identified numerous novel genetic loci associated with human eye color in Europeans, of which a subset also shows effects in Asians, despite the much reduced phenotypic eye color variation in Asians compared with Europeans. The genetic loci we identified explain the majority (53.2%) of eye color phenotypic variation (classified using a three-category scale) in Europeans and a large proportion of the previously noted missing heritability of eye color. Our findings clearly demonstrate that eye color is a genetically highly complex human trait, similar to hair (31) and skin color (32), as highlighted recently in large European GWASs. The large number of novel eye color–associated genetic loci identified here provides a valuable resource for future functional studies, aiming to understand the molecular mechanisms that explain their eye color association, and for future genetic prediction studies, aiming to improve DNA-based eye color prediction in anthropological and forensic applications.

METHODS

We performed a GWAS using information from two sources: 157,485 research participants of European ancestry recruited among the customer base of the personal genomics company 23andMe Inc. (Sunnyvale, CA, USA) (49) and a meta-analysis of 35,501 European individuals from nine populations collected by members of the International Visible Trait Genetics (VisiGen) Consortium (45) and their study partners. As a secondary analysis and for comparative purposes, we included an additional set of 1636 individuals of Asian ancestry from two populations (959 Han Chinese and 677 Indians from Singapore).

Populations and participants

23andMe.
Participants

Participants included for this analysis were 23andMe consumers who consented to participate in research and were determined to have greater than 97% European ancestry using genotype clustering. A segmented identity-by-descent algorithm (50) was then applied to obtain the maximal subset (157,485) of unrelated individuals to be used as our discovery cohort (full details on ancestry and relatedness calculations are provided in the Supplementary Methods, and principal component plots are provided in fig. S2).

Genotyping

Participants were genotyped on one of 23andMe’s own custom-designed SNP arrays. The V4 platform is a fully customized array containing a subset of SNPs from previous versions, while the V3 is a customized platform based on the OmniExpress+ BeadChip, and the V1 and V2 platforms are customized variants of the Illumina HumanHap550+ BeadChip (full descriptions of platforms and genotyping procedure are provided in the Supplementary Methods). Imputation of additional variants was completed using 1000 Genomes phase 1 as reference haplotypes (51).

Phenotyping

Phenotyping for iris color in this cohort was derived from questionnaires. Participants were asked to self-categorize eye color into one of seven distinct groups ranging from blue to dark brown based on color matching (full details are provided in the Supplementary Methods). This categorization allowed greater phenotypic resolution than previous three-color categories while maintaining groups distinctive enough to reduce misclassification. These groups were converted into a numerical scale ranging from 0 (blue) to 6 (dark brown) and used as the outcome variable for linear regression–based GWAS analysis.

VisiGen consortium and study partners.
Participants

Specific recruitment criteria varied for each cohort (Supplementary Materials), with participants from the United Kingdom, The Netherlands, Italy, and European descendants in the United States and Australia.

Principal components analyses examined the genetic ancestry for each cohort and confirmed each was of nonadmixed European ancestry (specific details for each cohort are provided in the Supplementary Methods).

Phenotyping

Eye color categorization varied between cohorts, with most of the European cohorts adopting a 3- or 5-point categorical scale (details of each cohort’s classification are provided in the Supplementary Methods).

Singaporean cohorts.
Participants

Full GWAS summary statistics were available for 959 participants of Chinese descent recruited at the Singapore Polyclinic and for 677 participants in the Singapore Indian Eye Study (SINDI).

Phenotyping

Phenotypes in these cohorts were ascertained using image analysis of digital photographs, better suited to capture variation within these populations (full details are provided in the Supplementary Methods).

Statistical analyses.
Association tests

We performed an independent linear regression for each cohort under the assumption of an additive model for allelic effects, with adjustments made for age, sex, and the first five principal components. Adjustment was also made for the genotyping platform (V1 to V4) in the 23andMe cohort, and additional measures controlling for relatedness were implemented for the VisiGen family cohorts (full details for each cohort are provided in the Supplementary Methods). The use of a linear scale may be limited by the assumption that there is an equal difference between each group; however, as eye color phenotypically follows a linear scale across the color spectrum, this choice of statistical model is appropriate and in accordance with previous studies on eye color (11, 27, 49).
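The additive model can be illustrated with a minimal ordinary least-squares fit in which the eye color score is regressed on allele dosage (0, 1, or 2 copies) plus covariates. This is a sketch on made-up, noise-free toy data, not the software used in the study; it includes only age and sex as covariates (principal components and platform are omitted for brevity), and all names and values are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def additive_model_fit(dosages, covariates, phenotype):
    """Least-squares fit of phenotype ~ intercept + dosage + covariates.

    Returns [intercept, per-allele effect, covariate effects...].
    """
    design = [[1.0, g, *c] for g, c in zip(dosages, covariates)]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, phenotype)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data (hypothetical): allele dosage, (age, sex), and an eye color score
# generated exactly as 1.0 + 0.8 * dosage + 0.01 * age + 0.2 * sex.
dosages = [0, 1, 2, 1, 0, 2, 1, 0]
covariates = [(34, 0), (51, 1), (47, 1), (29, 0), (62, 1), (40, 0), (55, 0), (38, 1)]
phenotype = [1.0 + 0.8 * g + 0.01 * age + 0.2 * sex
             for g, (age, sex) in zip(dosages, covariates)]

coefficients = additive_model_fit(dosages, covariates, phenotype)
print(round(coefficients[1], 6))  # per-allele effect (Beta), here 0.8 by construction
```

The second coefficient plays the role of the per-allele Beta reported in Table 1; a real analysis would also estimate its standard error and P value.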

Meta-analyses

The QC’ed summary statistics from the VisiGen and Singaporean cohorts were pooled into two separate meta-analyses: the first was conducted using data from the nine VisiGen European cohorts, and the second using data from the two Singaporean cohorts. Both meta-analyses were performed using METAL (52) applying a weighted Z-score approach, given the differing phenotypic scales used across individual cohorts, with genomic control adjustments (24) made for the VisiGen meta-analysis (see the Supplementary Methods). A final meta-analysis was then conducted between the 23andMe cohort and the European VisiGen cohorts following the same procedures with a weighted Z score approach and genomic control adjustments.
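The sample-size-weighted Z-score scheme combines per-cohort Z scores using weights proportional to the square root of each cohort's sample size, which is why it tolerates cohorts measured on different phenotypic scales. A minimal sketch follows; the Z scores in the example are invented, the function name is an assumption, and the genomic control adjustment mentioned above is not included.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort Z scores with weights w_i = sqrt(N_i).

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)); returns (Z, two-sided P value).
    """
    weights = [sqrt(n) for n in sample_sizes]
    numerator = sum(w * z_i for w, z_i in zip(weights, z_scores))
    combined = numerator / sqrt(sum(w * w for w in weights))
    p_value = 2 * NormalDist().cdf(-abs(combined))
    return combined, p_value


# Hypothetical Z scores for one SNP in the discovery and replication cohorts.
z_combined, p_combined = weighted_z_meta([5.0, 2.0], [157_485, 35_501])
print(z_combined, p_combined)
```

Because only the sign and magnitude of each Z score enter the calculation, effect sizes on a common scale are not recovered from this scheme.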

Definition of genomic region and significance

We adopted the customary definition for a GWAS significance threshold of P < 5 × 10−8. For this work, an “associated region” was a genomic region demarcated by consecutive significantly associated genomic markers, separated by a nonassociated region greater than 1 million base pairs.
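This region definition amounts to a single pass over the significant SNPs sorted by position: a new region starts whenever the gap to the previous significant SNP on the same chromosome exceeds 1 Mbp. The sketch below illustrates the idea; the input layout, function name, and toy positions are assumptions, not the authors' code.

```python
def associated_regions(snps, p_threshold=5e-8, max_gap=1_000_000):
    """Group genome-wide significant SNPs into associated regions.

    snps: iterable of (chromosome, position, p_value) tuples.
    Returns a list of (chromosome, start, end, n_significant_snps) tuples.
    """
    significant = sorted((chrom, pos) for chrom, pos, p in snps if p < p_threshold)
    regions = []
    for chrom, pos in significant:
        last = regions[-1] if regions else None
        if last and last[0] == chrom and pos - last[2] <= max_gap:
            # Within 1 Mbp of the previous significant SNP: extend the region.
            regions[-1] = (chrom, last[1], pos, last[3] + 1)
        else:
            # Different chromosome or a gap above 1 Mbp: start a new region.
            regions.append((chrom, pos, pos, 1))
    return regions


# Toy input with made-up positions and P values.
toy = [
    ("15", 28_000_000, 1e-300),
    ("15", 28_400_000, 1e-20),
    ("15", 28_200_000, 0.2),      # not significant, ignored
    ("15", 30_000_000, 1e-9),     # more than 1 Mbp from the previous hit
    ("X", 48_000_000, 1e-50),
]
print(associated_regions(toy))
# [('15', 28000000, 28400000, 2), ('15', 30000000, 30000000, 1), ('X', 48000000, 48000000, 1)]
```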

LD score regression

LD Hub (53) was used to perform LD score regression (25) on the summary statistics from our GWASs to test for possible inflation. This method is more suited to our study than the Devlin (24) genomic inflation lambda because of high trait polygenicity and the large cohort sizes (25).
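For orientation, the two inflation measures can be written out explicitly. The genomic inflation factor is the median observed χ² statistic divided by the median of a χ² distribution with 1 degree of freedom (about 0.455), and the LD score regression ratio is (intercept − 1)/(mean(χ²) − 1). The sketch below is illustrative only and uses names introduced here; the last function shows that the reported intercept (1.095) and ratio (0.19) together imply a mean χ² of roughly 1.5, subject to rounding of the published values.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained as the squared 75th percentile of the standard normal distribution.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation_factor(chi2_stats):
    """Lambda: median observed chi-square over its expectation under the null."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def ldsc_ratio(intercept, mean_chi2):
    """Share of the mean chi-square inflation not attributed to polygenicity."""
    return (intercept - 1) / (mean_chi2 - 1)


def mean_chi2_from_ratio(intercept, ratio):
    """Invert the ratio formula to recover the implied mean chi-square."""
    return 1 + (intercept - 1) / ratio


print(round(CHI2_1DF_MEDIAN, 3))                      # 0.455
print(round(mean_chi2_from_ratio(1.095, 0.19), 2))    # 1.5
```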

Conditional analysis

Conditional analysis was conducted using summary statistics with genome-wide complex trait analysis (GCTA) (54) following the same procedure described by Yang et al. (55). Genotypic data from unrelated participants in TwinsUK were used as a reference for LD structure, SNPs more than 1000 kb apart were assumed to be in complete linkage equilibrium, and a collinearity threshold was set at R² > 0.9.

Variant effect prediction

The predicted effects of variants were obtained using the Ensembl Variant Effect Predictor (VEP) (37).

Heritability explained by the independent SNPs

Population liability scale heritability (56) explained by associated SNPs identified in the 23andMe discovery cohort was calculated in an unrelated sample from the TwinsUK cohort by restricted maximum likelihood analysis (54), under the assumption that the population prevalence (taken from the 23andMe cohort) is 30.3% for blue eyes and 38.3% for intermediate colors.
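To show how a population prevalence enters a liability-scale estimate, the sketch below applies the standard transformation for a binary trait in an unascertained sample, in which observed-scale heritability is multiplied by K(1 − K)/z², where K is the prevalence and z is the standard normal density at the liability threshold. This is a simplification (for example, blue versus non-blue eyes); the three-category calculation used in this study is not reproduced here, and the function name is hypothetical.

```python
from statistics import NormalDist

def observed_to_liability(h2_observed, prevalence):
    """Convert observed-scale heritability of a binary trait to the liability scale.

    Assumes an unascertained population sample, so that the sample
    prevalence equals the population prevalence.
    """
    std_normal = NormalDist()
    # Liability threshold above which individuals show the trait.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    density = std_normal.pdf(threshold)
    return h2_observed * prevalence * (1.0 - prevalence) / density ** 2
```

With a prevalence of 30.3%, as assumed above for blue eyes, the scaling factor is greater than 1, so the liability-scale estimate exceeds the observed-scale one.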

Pathway analysis

Genetic pathway analysis for iris color was performed with MAGENTA (57) using the 23andMe discovery SNP association results as input (full details are provided in the Supplementary Methods). Gene sets defined by Gene Ontology (58) were obtained from the Molecular Signatures Database (MSigDB v6.1) (59) for this analysis.

Members of consortia and affiliations

23andMe research team. 23andMe Inc., Sunnyvale, California, USA—M. Agee, A. Auton, R. K. Bell, K. Bryc, S. L. Elson, P. Fontanillas, K. E. Huber, A. Kleinman, N. K. Litterman, M. H. McIntyre, J. L. Mountain, E. S. Noblin, C. A. M. Northover, S. J. Pitts, O. V. Sazonova, J. F. Shelton, S. Shringarpure, C. Tian, J. Y. Tung, and V. Vacic.

International Visible Trait Genetics Consortium. Department of Twins Research and Genetic Epidemiology, King’s College London, London, United Kingdom—P. G. Hysi and T. D. Spector.

Department of Genetic Identification, Erasmus MC University Medical Center Rotterdam, Rotterdam, The Netherlands—M. Kayser and F. Liu.

QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia—D. L. Duffy and N. G. Martin.

University of Queensland Diamantina Institute, University of Queensland, Brisbane, Queensland, Australia—D. M. Evans.

SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/7/11/eabd1239/DC1

https://creativecommons.org/licenses/by/4.0/

This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

REFERENCES AND NOTES

Acknowledgments: 23andMe—We thank the 23andMe research participants who made this study possible. ALSPAC—We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses. We are grateful to all study participants and general practitioners for their contributions and to P. Veraart for help in genealogy, J. Vergeer for the supervision of the laboratory work, and P. Snijders for help in data collection, as well as K. Estrada, Y. Aulchenko, and C. Medina-Gomez for establishing the imputed dataset used. QIMR—We thank the twins and their families for their participation. We also thank K. McAloney (sample collection), L. Bowdler (DNA processing), and D. Smyth and H. Beeby (IT support). RAINE—We are grateful to the Raine Study participants and their families, and we thank the Raine Study and Lions Eye Institute research staff for cohort coordination and data collection. We thank all study participants and P. Arp, M. Jhamai, M. Verkerk, L. Herrera, M. Peters, C. Medina-Gomez, and F. Rivadeneira for help in creating the GWAS database, as well as K. Estrada, Y. Aulchenko, and C. Medina-Gomez for establishing the imputed dataset used. Singapore Polyclinic—We thank all the participants and researchers who contributed to our study. Singapore SINDI—We thank all the participants and researchers who contributed to our study.

Funding: This work was supported by a Medical Research Council program grant (MC_UU_12013/4 to D.M.E). The U.K. Medical Research Council and the Wellcome Trust (grant refs.: 217065/Z/19/Z) and the University of Bristol provide core support for ALSPAC. D.M.E. was supported by an Australian Natural Medical Research Council Senior Research Fellowship (1137714). This publication is the work of the authors, and D.M.E. will serve as guarantor for the contents of this paper. GWAS data were generated by Sample Logistics and Genotyping Facilities at the Wellcome Trust Sanger Institute and LabCorp (Laboratory Corporation of America) using support from 23andMe. G.D.S. works in the Medical Research Council Integrative Epidemiology Unit at the University of Bristol (MC_UU_00011/1). ERF—The ERF Study was supported by the joint grant from the Netherlands Organization for Scientific Research (NWO, 91203014), the Center of Medical Systems Biology (CMSB), the Hersenstichting Nederland, the Internationale Stichting Alzheimer Onderzoek (ISAO), the Alzheimer Association project number 04516, the Hersenstichting Nederland project number 12F04(2).76, and the Interuniversity Attraction Poles (IUAP) program. As a part of EUROSPAN (European Special Populations Research Network), ERF was supported by the European Commission FP6 STRP grant number 018947 (LSHG-CT-2006-01947) and also received funding from the European Community’s Seventh Framework Program (FP7/2007-2013)/grant agreement HEALTH-F4-2007-201413 by the European Commission under the program “Quality of Life and Management of the Living Resources” of Fifth Framework Program (number QLG2-CT-2002-01254). High-throughput analysis of the ERF data was supported by a joint grant from the Netherlands Organization for Scientific Research and the Russian Foundation for Basic Research (NWO-RFBR 047.017.043). Harvard HPFS—This research was supported by funds from the NIH UM1 CA167552 grant and the Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA. INGI-FVG—The research was supported by funds from Compagnia di San Paolo, Torino, Italy, and Fondazione Cariplo, Italy, and Ministry of Health, Ricerca Finalizzata 2008 and CCM 2010, and Telethon, Italy. INGI-VB—The research was supported by the Italian Ministry of Health (RF 2010 to PG), the FVG Region, and the Fondo Trieste. Funding was provided by the Australian National Health and Medical Research Council (241944, 339462, 389927, 389875, 389891, 389892, 389938, 442915, 442981, 496739, 552485, and 552498), the Australian Research Council (A7960034, A79906588, A79801419, DP0770096, DP0212016, and DP0343921), the FP-5 GenomEUtwin Project (QLG2-CT-2002-01254), and the U.S. NIH (NIH grants AA07535, AA10248, AA13320, AA13321, AA13326, AA14041, and MH66206). Statistical analyses were carried out on the Genetic Cluster Computer, which was financially supported by the Netherlands Scientific Organization (NWO 480-05-003). S.E.M. and D.L.D. were supported by the National Health and Medical Research Council (NHMRC) Fellowship Scheme. The core management of the Raine Study was funded by The University of Western Australia, Curtin University, Telethon Kids Institute, Women and Infants Research Foundation, Edith Cowan University, Murdoch University, The University of Notre Dame Australia, and the Raine Medical Research Foundation. The eye data collection of the Gen2-20 year follow-up of the Raine Study was funded by NHMRC grant 1021105, the Ophthalmic Research Institute of Australia (ORIA), the Alcon Research Institute, and the Lions Eye Institute and the Australian Foundation for the Prevention of Blindness. The NHMRC project number 1022134 (2012–2014) funded the serum 25(OH)D assays that were conducted by RDDT in Melbourne, Victoria, Australia. RS—The generation and management of GWAS genotype data for the Rotterdam Study were supported by the Netherlands Organization of Scientific Research NWO Investments (number 175.010.2005.011, 911-03-012). This study was funded by the Research Institute for Diseases in the Elderly (014-93-015; RIDE2), the Netherlands Genomics Initiative (NGI)/Netherlands Organization for Scientific Research (NWO) project number 050-060-810. The Rotterdam Study was supported by the Erasmus MC and Erasmus University Rotterdam; the Netherlands Organization for Scientific Research (NWO); the Netherlands Organization for Health Research and Development (ZonMw); the Research Institute for Diseases in the Elderly (RIDE); the Netherlands Genomics Initiative (NGI); the Ministry of Education, Culture and Science; the Ministry of Health, Welfare and Sport; the European Commission (DG XII); and the Municipality of Rotterdam. The generation and management of GWAS genotype data for the Rotterdam Study were executed by the Human Genotyping Facility of the Genetic Laboratory of the Department of Internal Medicine, Erasmus MC. TwinsUK—The TwinsUK study was funded by the Wellcome Trust and the European Community’s Seventh Framework Programme (FP7/2007–2013). The study also received support from the Fight for Sight and the National Institute for Health Research (NIHR)–funded BioResource, Clinical Research Facility, and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London. SNP genotyping was performed by The Wellcome Trust Sanger Institute and National Eye Institute via NIH/CIDR.

Author contributions: P.G.H., A.V., M.K., and T.D.S. designed the study. M.S. conducted the meta-analyses and follow-up analyses. M.S., P.G.H., C.J.H., M.K., and T.D.S. wrote and prepared the manuscript. All authors were involved in the GWAS analyses for their respective cohorts and in reviewing/editing the manuscript.

Competing interests: 23andMe is a for-profit genomics company. M.K. is a coinventor of a patent related to this work filed by the Erasmus University Medical Center Rotterdam (EP2195448A1, “Method to predict iris color”) but receives no license fees or royalties from this. The other authors declare that they have no competing interests. D.A.H. is an employee of and holds stock options in 23andMe Inc.

Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Summary statistics from the cohorts that participated in the meta-analysis are available from the GWAS Catalog public repository (www.ebi.ac.uk/gwas/downloads/summary-statistics). These freely downloadable summary statistics are calculated using all cohorts described in this manuscript, except for the 23andMe participants. This is due to a nonnegotiable clause in the 23andMe data transfer agreement, intended to protect the privacy of the 23andMe research participants. However, these data may be obtained by making a request to the 23andMe research collaboration portal (https://research.23andme.com/dataset-access/). Additional data related to this paper may be requested from the authors.
