Research ArticleANTHROPOLOGY

Ancient DNA sheds light on the genetic origins of early Iron Age Philistines

See allHide authors and affiliations

Science Advances  03 Jul 2019:
Vol. 5, no. 7, eaax0061
DOI: 10.1126/sciadv.aax0061


The ancient Mediterranean port city of Ashkelon, identified as “Philistine” during the Iron Age, underwent a marked cultural change between the Late Bronze and the early Iron Age. It has been long debated whether this change was driven by a substantial movement of people, possibly linked to a larger migration of the so-called “Sea Peoples.” Here, we report genome-wide data of 10 Bronze and Iron Age individuals from Ashkelon. We find that the early Iron Age population was genetically distinct due to a European-related admixture. This genetic signal is no longer detectible in the later Iron Age population. Our results support that a migration event occurred during the Bronze to Iron Age transition in Ashkelon but did not leave a long-lasting genetic signature.


Within the history of the Eastern Mediterranean, the end of the Bronze Age and the beginning of the Iron Age were marked by exceptional cultural disarray that followed the demise of prosperous economies and cultures in Greece, Egypt, the Levant, and Anatolia (1). During the 12th century BCE, coincident with these events, substantive cultural changes appeared in the archeological record of Ashkelon, Ashdod, and Ekron, three of the five core cities mentioned as “Philistine” in the Hebrew bible (24). These settlements were distinct from neighboring sites in architectural traditions and material culture (24). Resemblances between the new cultural traits and 13th century patterns found in the Aegean have led some scholars to explain this so-called “Philistine phenomenon” by a migration from an Aegean-related source, potentially associated with the “Sea Peoples,” a population that is thought to have settled in other parts of the coastal Eastern Mediterranean (2, 3). This hypothesis has been challenged by those arguing that this cultural change was driven by a diffusion of knowledge or internal development of ideas (58) rather than by a large-scale movement of people. Even for those who do accept the idea of large-scale mobility, the homeland of the new arrivals is disputed with suggested alternatives including Cyprus or Cilicia (4), a mixture of non-Aegean east Mediterranean peoples (8), and mixed heterogeneous maritime groups, akin to pirates (9). Proposed links go as far as northern Italy where depopulation events have been suggested to trigger population movements throughout the Mediterranean (10).

Recent ancient DNA (aDNA) studies have reported a high degree of genetic continuity in the Levant during the late Pleistocene and early Holocene that was followed by increasing population admixtures with Anatolian- and Iranian-related populations at least up to the Middle Bronze Age (1114). Genome-wide data from Late Bronze and Iron Age populations have, so far, not been available for this region.

Here, we report genome-wide data from human remains excavated at the ancient seaport of Ashkelon, forming a genetic time series encompassing the Bronze to Iron Age transition (Fig. 1, A and B). We find that all three Ashkelon populations derive most of their ancestry from the local Levantine gene pool. The early Iron Age population was distinct in its high genetic affinity to European-derived populations and in the high variation of that affinity, suggesting that a gene flow from a European-related gene pool entered Ashkelon either at the end of the Bronze Age or at the beginning of the Iron Age. Of the available contemporaneous populations, we model the southern European gene pool as the best proxy for this incoming gene flow. Last, we observe that the excess European affinity of the early Iron Age individuals does not persist in the later Iron Age population, suggesting that it had a limited genetic impact on the long-term population structure of the people in Ashkelon.

Fig. 1 Overview of locations and ages of analyzed individuals.

(A) Locations of newly reported and other selected published genomes. The newly reported Ashkelon populations are annotated in the upper corner. (B) The range of average ages of the ancient groups is given in thousands of years (ka) BCE.


Genome-wide data across the Bronze/Iron Age transition in Ashkelon

We extracted and sequenced DNA from 108 skeletal elements excavated in Ashkelon. In line with the low DNA preservation previously reported for the southern Levant (1114), only 10 yielded sufficient amounts of human DNA (data file S1). Sequencing libraries for these 10 individuals were enriched for human DNA using an established in-solution DNA capture targeting 1.24 million genome-wide single-nucleotide polymorphisms (SNPs) (“1240 k capture”) (1517). The earliest three individuals were excavated from a Bronze Age necropolis (1820) and dated by the archeological context to the Middle Bronze IIC–Late Bronze Age II (labeled ASH_LBA; two of the individuals were directly carbon-dated to 1746 to 1542 cal BCE). Four early Iron Age infants were recovered from burials beneath late 12th century Iron Age I houses (labeled ASH_IA1; directly carbon-dated to 1379 to 1126 cal BCE), and three later Iron Age individuals were recovered from a cemetery adjacent to the city wall of ancient Ashkelon, which was estimated to have been used between the 10th and the 9th century BCE (labeled ASH_IA2; one individual directly carbon-dated to 1257 to 1042 cal BCE) (Fig. 1, A and B, Table 1, table S1, and text S1). While the chronological ranges estimated by the archeological context are approximately within the range of the radiocarbon assays conducted directly on the petrous bones sampled for DNA (table S1), we caution that they should be taken as a rough estimate, since the presence of marine carbon in the diet could make the calibrated radiocarbon dates older than the true age.

Table 1 An overview of the Ashkelon genomes reported in this study.

For each individual, the analysis group is given (ASH_LBA, Ashkelon Late Bronze Age; ASH_IA1, Ashkelon Iron Age 1; ASH_IA2, Ashkelon Iron Age 2). 14C dating results are given in cal BCE in two-sigma range (NA, not available). Detailed dating information is provided in text S1 and table S1. The proportion of human DNA and the mean coverage on 1240 K target sites in the “1240 K” enriched libraries are given. The assigned genetic sex is listed (F, Female; M, Male). Uniparental haplogroups (mt, mitochondrial; Ychr, Y chromosome) are listed.

View this table:

To control for the quality of the dataset, we estimated exogenous DNA levels and relatedness. Mitochondrial contamination for all 10 individuals ranged between 0 and 9%, with only two of them above 5% (table S2). For males (n = 4), nuclear contamination was estimated to a maximum of 4.3% (table S2). None of the individuals were either first- or second-degree related to each other (fig. S1). We performed population genetic analysis on a merged dataset, including the genotype data of the newly reported individuals from this study and previously published datasets from 638 ancient individuals and 4943 individuals belonging to 298 present-day populations. (11, 12, 21). (An overview of the key ancient individuals analyzed is given in data file S2).

Persistence of the local gene pool in the Bronze Age Levant

To understand the genetic profile of the ancient Ashkelon individuals, we began by projecting them onto the first two axes of variation (PC1 and PC2) of present-day west Eurasians, inferred from principal component analysis (PCA) (Fig. 2A and fig. S2) (12). The ASH_LBA individuals overlap with the cline of present-day Near Easterners and are close to earlier Bronze Age Levantines and Anatolians [Early Bronze Age individuals from ‘Ain-Ghazal, Jordan labeled “Jordan_EBA” (12); Middle Bronze Age individuals from Sidon, Lebanon labeled “Lebanon_MBA” (13); and Early to Late Bronze Age individuals from Central Anatolia labeled “Anatolia_EBA” (21) and “Anatolia_MLBA” (21), respectively]. Compared to earlier Levantines [Levantine early farmers from present-day Jordan and Israel labeled “Levant_N” (12) and Chalcolithic individuals from Peqi’in, Israel labeled “Levant_ChL” (11)], the Bronze Age individuals including ASH_LBA are all shifted along PC2 toward ancient Iranian and Caucasus individuals [e.g., the Caucasus Mesolithic hunter-gatherers labeled “CHG” (22); early farmers excavated from present-day Iran labeled “Iran_N” (12); and the Chalcolithic individuals excavated from present-day western Iran labeled “Iran_ChL” (12)], in agreement with previous observations (1113). Unsupervised genetic clustering using ADMIXTURE (23) shows a similar pattern [with K = 9 clusters; Fig. 2B and fig. S3]; ASH_LBA is assigned the same major ancestral component as all earlier Levantine populations (shown in orange). Consistent with their PCA positions, a second component (shown in green) that is maximized in Iran_N is, on average, higher in ASH_LBA and in each of the earlier Bronze Age Levantines compared to all earlier Levantines (20 to 30% and 3 to 8%, respectively).

Fig. 2 PCA and ADMIXTURE analysis.

(A) Ancient genomes (marked with color-filled symbols) projected onto the principal components inferred from present-day west Eurasians (gray circles) (fig. S2). The newly reported Ashkelon populations are annotated in the upper corner. (B) ADMIXTURE analysis. A selected set of ancient individuals (as well as present-day Sardinians) is plotted (K = 9 was chosen since it is the cluster number that maximizes components correlated to the most differentiated populations in the west Eurasian PCA).

To formally test the qualitative observations based on the PCA and ADMIXTURE analyses, we used f4-statistics (24). Consistent with the ASH_LBA positions on PC2, the f4-statistic of the form f4 (ASH_LBA, Levant_N/Levant_ChL; test, Mbuti) (figs. S4 and S5) estimated excess allele sharing between ASH_LBA and ancient Iranian/Caucasus-related populations (such as CHG, Iran_N, and Iran_ChL) compared to the Neolithic/Chalcolithic Levantines (Levant_N and Levant_ChL), confirming previous reports of post-Neolithic gene flows from prehistoric populations related to Iran or the Caucasus into the Levant (1113). Accordingly, modeling ASH_LBA as a two-way admixture using qpAdm (12, 17) produced a fitting model (χ2P = 0.445), in which ASH_LBA derives around 60% of their ancestry from Levant_ChL (60.0 ± 6.5%; estimate ±1 SE) and the rest from Iran_ChL. The spatiotemporal origins of this post-Neolithic gene flow remain broadly defined since alternative two-way models also fit, albeit with smaller P values. In these models, Anatolia_EBA/Anatolia_BA/Armenia_ChL are used as the second source, replacing Iran_ChL (χ2P = 0.053 to 0.060; table S3).

We then examined whether there are noticeable genetic differences between ASH_LBA and the earlier Bronze Age Levant populations by measuring f4 (ASH_LBA, Jordan_EBA/Lebanon_MBA; test, Mbuti (figs. S6 and S7). We find that ASH_LBA and both earlier Bronze Age populations are symmetrically related to all test populations within our data’s resolution (|f4| < 3 SE). However, we observe a marginal excess of affinity between ASH_LBA and populations genetically related to ancient Iran and the Caucasus (such as CHG and Steppe_Eneolithic) when compared to Jordan_EBA (fig. S7). Within our current resolution, we cannot distinguish whether this affinity is due to a small-scale gene flow from an Iranian/Caucasus-related population entering the Levant between the Early and Late Bronze Age or whether it reflects a certain population structure during this period. Nonetheless, the apparent symmetry suggests a high degree of genetic continuity throughout at least a millennium between culturally and geographically distinct Bronze Age Levantine groups occupying a region that stretches from the inland southern Levant where present-day Jordan is located and along the coastal regions of present-day Israel and Lebanon.

Genetic discontinuity between the Bronze Age and the early Iron Age people of Ashkelon

In comparison to ASH_LBA, the four ASH_IA1 individuals from the following Iron Age I period are, on average, shifted along PC1 toward the European cline and are more spread out along PC1, overlapping with ASH_LBA on one extreme and with the Greek Late Bronze Age “S_Greece_LBA” (25) on the other (Fig. 2A). Similarly, genetic clustering assigns ASH_IA1 with an average of 14% contribution from a cluster maximized in the Mesolithic European hunter-gatherers labeled “WHG” (shown in blue in Fig. 2B) (15, 22, 26). This component is inferred only in small proportions in earlier Bronze Age Levantine populations (2 to 9%). In light of these observations, we formally tested the variation within each of the three Ashkelon populations by computing the variance of the square rooted statistic f4 (Ashkelon individual 1, Ashkelon individual 2; test, Mbuti) (fig. S8). Consistent with the wide spread of ASH_IA1 in the west Eurasian PCA, the variance is significantly higher in ASH_IA1 than in either ASH_LBA or ASH_IA2 (P < 2.2 × 10−16 for both by a Fligner-Killeen test) (table S4 and fig. S8), suggesting that the ASH_IA1 individuals were more heterogeneous in their genetic affinities to global populations in respect to both ASH_LBA and ASH_IA2.

We next formally quantified the genetic differences between ASH_IA1 and the earlier ASH_LBA by calculating f4 (ASH_IA1, ASH_LBA; test, Mbuti) (Fig. 3A). In agreement with the PCA and ADMIXTURE results, only European hunter-gatherers (including WHG) and populations sharing a history of genetic admixture with European hunter-gatherers (e.g., as European Neolithic and post-Neolithic populations) produced significantly positive f4-statistics (Z ≥ 3), suggesting that, compared to ASH_LBA, ASH_IA1 has additional European-related ancestry. To estimate the levels of the European-related ancestry in all Ashkelon populations, we compared qpAdm models of two-way mixtures (Levant_ChL and Iran_ChL; i.e., 0% contribution from WHG) to three-way ones, in which we add WHG as the third source (Fig. 3C and table S5). ASH_LBA and ASH_IA2 fit well with the two-way model (χ2P = 0.445 and χ2P = 0.313, respectively; table S5), whereas the three-way one infers small nonsignificant proportions (−2.3 ± 2.2 and 2.0 ± 2.2% for ASH_LBA and ASH_IA2, respectively; table S5). In contrast, for ASH_IA1, the two-way model is inadequate (χ2P = 3.80 × 106; table S5) and the three-way one fits (χ2P = 0.765). Thus, of the three, additional WHG-like ancestry is only necessary to model ASH_IA1.

Fig. 3 European-related admixture detected in ASH_IA1.

(A) ASH_IA1 shares access affinity with European-related populations compared to ASH_LBA. We plot the top and bottom 40 values of f4 (ASH_IA1, ASH_IA2; X, Mbuti) on the map. Circles mark the ancient populations and triangles the present-day ones. Z-scores calculated by 5-centimorgan block jackknifing are represented by the size of the symbols. “X” share more alleles with ASH_IA1 when values are positive and with ASH_IA2 when negative. The five groups with the most positive values are annotated on the map (Z > 2.3). (B) We plot the ancestral proportions of the Ashkelon individuals inferred by qpAdm using Iran_ChL, Levant_ChL, and WHG as sources ±1 SEs. P values are annotated under each model. In cases when the three-way model failed (χ2P < 0.05), we plot the fitting two-way model. The WHG ancestry is necessary only in ASH_IA1.

We find that the PC1 coordinates positively correlate with the proportion of WHG ancestry modeled in the Ashkelon individuals (fig. S9 and table S6), suggesting that WHG reasonably tag a European-related ancestral component within the ASH_IA1 individuals. However, these Mesolithic individuals are unlikely to be a good proxy for the true source in the much later early Iron Age. To examine more proximate sources, we compiled a set of chronologically and geographically relevant candidate populations, including populations that shared higher affinity with ASH_IA1 compared to ASH_LBA in the above f4-statistic. Subsequently, we modeled ASH_IA1 as two-way and three-way mixtures of ASH_LBA and combinations of the candidate populations (table S7). Of the 51 tested models, we find four plausible ones (χ2P > 0.05), all are two-way mixtures. The best supported one (χ2P = 0.675) infers that ASH_IA1 derives around 43% of ancestry from the Greek Bronze Age “Crete_Odigitria_BA” (43.1 ± 19.2%) and the rest from the ASH_LBA population. ASH_IA1 could also be modeled with either the modern “Sardinian” (35.2 ± 17.4%; χ2P = 0.070), the Bronze Age “Iberia_BA” (21.8 ± 21.1%; χ2P = 0.205), or the Bronze Age “Steppe_MLBA” (15.7 ± 9.1%; χ2P = 0.050) as the second source population to ASH_LBA. To check whether these results are due to the low coverage of ASH_LBA, we repeated this analysis, but this time, we modeled ASH_IA1 as a three-way mixture of each of the candidate populations, Levant_ChL and Iran_ChL. The two latter populations have higher genome coverage and can model ASH_LBA well in combination (table S3). In this analysis, only the models including “Sardinian,” “Crete_Odigitria_BA,” or “Iberia_BA” as the candidate population provided a good fit (χ2P = 0.715, 49.3 ± 8.5%; χ2P = 0.972, 38.0 ± 22.0%; and χ2P = 0.964, 25.8 ± 9.3%, respectively). We note that, because of geographical and temporal sampling gaps, populations that potentially contributed the “European-related” admixture in ASH_IA1 could be missing from the dataset. Therefore, better proxies might be found in the future when more data is available. Nonetheless, the tested candidate populations from Anatolia, Egypt, and the Levant that did not produce well-fitting models can be excluded as potential sources of the admixture observed in ASH_IA1.

The transient impact of the “European-related” gene flow on the Ashkelon gene pool

The ASH_IA2 individuals are intermediate along PC1 between the ASH_LBA ones and the earlier Bronze Age Levantines (Jordan_EBA/Lebanon_MBA) in the west Eurasian PCA (Fig. 2A). Notably, despite being chronologically closer to ASH_IA1, the ASH_IA2 individuals position closer, on average, to the earlier Bronze Age individuals. Such a reduced affinity to the European populations is also apparent in the genetic clustering results, showing that the European hunter-gatherer–related ancestry contributes considerably less to ASH_IA2 than to ASH_IA1 (8 and 14%, respectively; Fig. 2B). To test for differences in affinities between ASH_IA2 and ASH_IA1, we measured the f4-statistic of the form f4 (ASH_IA2, ASH_IA1; test, Mbuti). This statistic produced no significantly positive results, and all significantly negative statistics (Z ≤ −3) are with test populations from either Europe or Anatolia (fig. S10). This, again, highlights the increased European-related affinity of ASH_IA1.

The transient excess of European-related genetic affinity in ASH_IA1 can be explained by two scenarios. The early Iron Age European-related genetic component could have been diluted by either the local Ashkelon population to the undetectable level at the time of the later Iron Age individuals or by a gene flow from a population outside of Ashkelon introduced during the final stages of the early Iron Age or the beginning of the later Iron Age. Considering the symmetry measured between ASH_IA2 and ASH_LBA in the f4-statistic of the form f4 (ASH_IA2, ASH_LBA; test, Mbuti) (|Z| < 2.8; fig. S11), the replacing population is likely to have stemmed from the Late Bronze Age Levantine gene pool, whether or not from Ashkelon. By modeling ASH_IA2 as a mixture of ASH_IA1 and earlier Bronze Age Levantines/Late Period Egyptian, we infer a range of 7 to 38% of contribution from ASH_IA1, although no contribution cannot be rejected because of the limited resolution to differentiate between Bronze Age and early Iron Age ancestries in this model (table S8).


By investigating genome-wide data from Ashkelon, we address long-pending historical questions regarding the demographic developments underlying the Late Bronze Age to Iron Age cultural transformation. On a larger regional scale, these data begin to fill a temporal gap in the genetic map of the southern Levant, revealing persistence of the local Levantine gene pool throughout the Bronze Age for over a millennium. At the same time, by the “zoomed-in” comparative analysis of the Ashkelon genetic time transect, we find that the unique cultural features in the early Iron Age are mirrored by the distinct genetic composition we detect in ASH_IA1. Our analysis suggests that this genetic distinction is due to a European-related gene flow introduced in Ashkelon during either the end of the Bronze Age or the beginning of the Iron Age. This timing is in accord with estimates of the Philistines arrival to the coast of the Levant, based on archeological and textual records (24). We find that, within no more than two centuries, this genetic footprint introduced during the early Iron Age is no longer detectable and seems to be diluted by a local Levantine-related gene pool.

The relatively rapid disappearance of this signal stresses the value of temporally dense genetic sampling for addressing historical questions. Transient gene flows, such as the one detected here, might be overlooked because of a lack of representative samples, potentially leading to erroneous conclusions. In geographic regions unfavorable to DNA preservation, obtaining such datasets requires exhaustive sampling and the utilization and further development of advanced technologies such as DNA enrichment techniques (1517) and targeted sampling strategies (27).

We do not rule out that some gene flow occurred during the Bronze Age as low significance of the f4-statistics might be due to the limited statistical power of our data stemming from either insufficient coverage or a lack of appropriate contemporaneous proxy populations. Thus, additional sampling is needed to further investigate the question of the genetic diversity within the Levantine Bronze Age populations and to characterize the spatiotemporal extent of potential incoming gene flows. Similarly, a larger sample size might help to accurately infer the extent and magnitude of the early Iron Age gene flows and to identify more precisely the populations introducing the European-related component to Ashkelon. While our modeling suggests a southern European gene pool as a plausible source, future sampling in regions such as Cyprus, Sardinia, and the Aegean, as well as in the southern Levant, could better resolve this question.


Extraction of aDNA and preparation for next-generation sequencing

One hundred eight human skeletal elements excavated at the archeological site of Ashkelon, Israel by the Leon Levy Expedition during 1997 to 2016 were sampled and screened for the presence of human aDNA (28 from the Bronze Age, 8 from the Iron Age I, and 72 from the Iron Age II). All preamplification procedures were performed in the dedicated aDNA facilities of the Max Planck Institute for the Science of Human History (MPI-SHH), Jena, Germany. Sampling of the petrous parts of temporal bones was done by drilling the inner ear part (27). Teeth were sampled by drilling the dental pulp. DNA was extracted from around 100 mg of pulverized bone following an established protocol (28). A double-stranded and dual-indexed Illumina DNA library was prepared from a 20-μl aliquot of each extract using published protocols (29, 30). To reduce potential errors in the obtained DNA sequences, we partially removed DNA damage resulting from cytosine deamination using uracil-DNA glycosylase and endonuclease VIII as previously reported (31). In this procedure, the damage is retained in the terminal positions of the DNA fragments and can be leveraged as a measure of aDNA authentication. Amplification of the indexed libraries was carried out using Herculase II Fusion DNA polymerase as in the manufacturer’s protocol. All libraries were directly shotgun single-end sequenced on an Illumina HiSeq 4000 platform (1 × 75 + 8 + 8 cycles).

DNA enrichment

A subset of libraries that exceeded 0.2% human DNA and 10% deamination damage at the terminal end of DNA fragments were subsequently used for two previously published hybridization-based in-solution DNA-enrichment assays: (i) The 1240 k capture (1517, 32), which targets 1,237,207 genome-wide nuclear SNPs. The targeted SNP panel combines the sets reported by Haak et al. (17) and by Fu et al. (16) and is further described by Mathieson et al. (15); and (ii) The “Mitochondrial capture,” which targets the whole human mitochondrial genome (16). Both captures were carried out as described in the SI (supplemental information) text sections 3.2-3.3 of Fu et al. (16) with modified hybridization conditions of 65°C for about 24 hours. Enriched libraries were single-end sequenced on the same platform as the initial shotgun ones.

Postsequencing data processing

The binning of the sequenced reads (demultiplexing) allowed a maximum of one mismatch in each index. The demultiplexed libraries were processed and mapped using the EAGER (v 1.92.54) pipeline (33). The adapter sequences were clipped, and reads were filtered for the ones longer than 30 base pairs using AdapterRemoval (v2.2.0) (34). Adapter-clipped reads were aligned to the UCSC genome browser human genome reference hg19 using BWA aln/samse alignment software (v0.7.12) (35) with a lenient stringency parameter (“-n 0.01”) and by retaining only reads with Phred-scaled mapping quality scores ≥ 30. Duplicate reads were removed with DeDup v0.12.2 (33). To eliminate genotyping bases with retained damage, the two terminal positions in each read were clipped, and subsequently, a pseudo-diploid genotype was reconstructed for each individual using pileupCaller by performing a random choice between high-quality bases (Phred-scaled base quality score ≥ 30) aligning to each targeted SNP position (


The newly generated ancient data were merged with a previously described dataset (12) and with additional published datasets (11, 1314, 21, 25, 36). The merged dataset includes 638 published ancient genomes (1114, 21, 25, 36) and 4943 ones belonging to 298 worldwide present-day populations (12, 37) that were genotyped on the Affymetrix Axiom Genome-Wide Human Origins 1 array (38) (“HO dataset”) with a total of 593,124 SNP sites in the merged dataset.

aDNA authentication and quality control

We used multiple measures to authenticate the generated aDNA data. (i) To control for potential laboratory contamination, blank extractions and library preparations were included and analyzed for each sample batch (data file S3). (ii) Levels of DNA deamination damage in the mapped reads were estimated using mapDamage (v2.0) (39) and compared to the expected values in ancient skeletal element. (iii) We measured human mitochondrial DNA contamination using schmutzi (40). (iv) The genetic sex was inferred for each individual by calculating the ratio of the X and Y chromosome coverage (normalized by the autosomal average coverage). We next used ANGSD (v0.910) (41), which compares mismatch rates between polymorphic sites on the X chromosome to estimate nuclear contamination in males.

To avoid bias associated with including related individuals into analyzed populations, we calculated the pairwise allele mismatch rates of all newly reported individuals following a method described elsewhere (fig. S1) (42).

Principal component analysis

We constructed the principal components of 55 present-day west Eurasian groups and projected the ancient individuals onto the first two components (fig. S2) using the smartpca software from the EIGENSOFT package (v6.0.1) (43) with the lsqproject option.

ADMIXTURE analysis

We carried out a maximum likelihood unsupervised clustering of 5581 ancient and present-day individuals using ADMIXTURE (v1.3.0) (23) with a cluster number (k) ranging between 2 and 20. We used the --indep-pairwise option in PLINK (v1.90) (44, 45) to prune for linkage disequilibrium by identifying SNP pairs with genotype r2 ≥ 0.2 within a 200-SNP sliding window (advancing by 25 SNPs each time) and retaining only one randomly chosen SNP (--indep-pairwise 200 25 0.2). We performed five replicates for each k with random seeds and chose the highest likelihood replicate (fig. S3). Fivefold cross-validation (CV) errors were calculated for each run.


To test for gene flow between populations, we estimated their respective allele frequency correlations using f-statistics. Both f4- and f3-statistics were computed using the qpDstat program (v701) of the ADMIXTOOLS package (v4.1) (24) with default parameters.

To test the symmetry between populations X and Y, the f4-statistic of the form f4 (X, Y; test, outgroup) was used. We report the f-statistics in which at least 20,000 SNP positions were overlapping between the four tested populations.

Modeling ancestry proportions

We used the qpWave (v400) and qpAdm (v632) programs of ADMIXTOOLS (12, 17) to test and model admixture proportions from potential source populations (reference populations) in the tested populations. We used a basic set of seven outgroups including present-day populations (Han, Onge, Mbuti, Mala, Mixe) (37) that represent a global genetic variation and published ancient populations such as Natufian (12), which represents a Levantine gene pool outside of modern genetic variation and the European Upper Palaeolithic individual “Villabruna” (36).

The distribution of the squared f4-statistics of the form f4 (Ashkelon individual 1, Ashkelon individual 2; test, Mbuti) was estimated for χ2 distribution with 1 df using Q-Q plot for each analyzed group (ASH_LBA, ASH_IA1, and ASH_IA2) (fig. S8). The variance for each group was estimated by a Fligner-Killeen test, which is robust against departures from normality that can be caused by small sample size.

Mitochondrial DNA analysis

We constructed the whole mitochondrial genome consensus sequence for each individual using the log2fasta program of schmutzi (40) with a quality cutoff of 10. The consensus sequences were inputted to HaploFind (46) and HaploGrep (47) for mitochondrial haplogroup assignment.

Y-chromosome analysis

We assigned the Y-chromosome haplogroup for each male individual using yHaplo (48). The genotyping was carried out by a random draw of a single-base mapping to one of the 13,508 International Society of Genetic Genealogy (ISOGG) consortium SNP positions. Strand-ambiguous SNPs were discarded, and quality filters were as those for the HO dataset. In addition, the automatically called derived alleles, which support the haplogroup assignment, were manually confirmed.

Phenotypic traits analyses

We inspected the allele distribution in SNP positions associated with a selection of biological traits, including lactase persistence (49, 50), Malaria resistance (51, 52), glucose-6-phosphate dehydrogenase deficiency (53, 54), and skin pigmentation (26, 5556). The allele distribution for the SNP positions listed in table S9 was tabulated for each individual using SAMtools mpileup (v1.3).

Carbon dating

The petrous parts of the temporal bone of individuals ASH029, ASH033, ASH2-3, ASH066, ASH067, ASH068, and ASH008 were each sampled and directly radiocarbon-dated at the Curt-Engelhorn-Zentrum Archaeometry gGmbH, Mannheim, Germany (table S1). Collagen was extracted from the bone samples, purified by ultrafiltration (>30 kDa fraction), freeze-dried, and combusted to CO2 in an elemental analyzer. CO2 was converted catalytically to graphite. The dating was performed using the Mini radiocarbon dating system - accelerator mass spectrometry (MICADAS-AMS) of the Klaus-Tschira-Archäometrie-Zentrum. The resulting 14C ages were normalized to d13C = −25% (57) and calibrated using the dataset INTCAL13 (58) and the software SwissCal 1.0 (59).

Correction (26 November 2019): Figures 1-2 and supplementary tables S7-S8 were updated to correctly represent an individual formerly labeled as “Egypt_New_Kingdom”.


Supplementary material for this article is available at

Text S1. Description of individuals excavated in Ashkelon and archeological information

Fig. S1. Pairwise mismatch rate.

Fig. S2. PCA plot of present-day west Eurasian populations.

Fig. S3. ADMIXTURE CV errors for each cluster number (K).

Fig. S4. The Bronze Age Ashkelon population shares a higher genetic affinity with Caucasus/Iranian-related populations when compared to the Neolithic Levantines.

Fig. S5. The Bronze Age Ashkelon population shares a marginal higher genetic affinity with populations related to ancient Caucasus/Iran when compared to Chalcolithic Levantines.

Fig. S6. The Bronze Age Ashkelon population is symmetrically related to Lebanon_MBA.

Fig. S7. The Bronze Age Ashkelon population is mostly symmetrically related to Jordan_EBA but shares a slightly higher genetic affinity with populations related to ancient Caucasus/Iran.

Fig. S8. Variation in differential affinities between the Ashkelon groups.

Fig. S9. The proportion of Mesolithic European WHG ancestry in ASH_IA1 individuals correlates with their position in the west Eurasian PCA.

Fig. S10. ASH_IA1 shares more alleles with European-related populations compared to ASH_IA2.

Fig. S11. ASH_IA2 is symmetrically related to ASH_LBA.

Fig. S12. Map of the Bronze Age necropolis.

Fig. S13. Relative stratigraphy of grid 38.

Fig. S14. Relative stratigraphy of N5. Square 14.

Fig. S15. Relative stratigraphy of N5. Square 24.

Table S1. 14C radiocarbon dating performed for this study.

Table S2. Nuclear and mitochondrial contamination estimates.

Table S3. qpADM admixture models of the ASH_LBA group.

Table S4. Fligner-Killeen test based on [f4 (Ashkelon ind 1, Ashkelon ind 2; test, Mbuti)]2 statistics.

Table S5. Ashkelon individuals and groups modeled using qpADM by “Levant_ChL,” “Iran_ChL,” and WHG as sources.

Table S6. Bronze Age and Iron Age individuals and groups modeled by the three maximized populations on the west Eurasian PCA (using qpADM).

Table S7. qpADM admixture models of the ASH_IA1 group.

Table S8. qpADM admixture models of the ASH_IA2 group.

Table S9. Phenotypic analysis.

Table S10. Archeological context.

Data file S1. An overview of skeletal material screened for aDNA in this study.

Data file S2. An overview of the main analysis groups used in this study.

Data file S3. Sequencing statistics of negative controls.

References (2866)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank the late L.E. Stager who codirected the excavations in Ashkelon and has devoted much of his scientific carrier to understanding the origins of the Philistines. We thank M. McCormick for initial conversations with the late L.E. Stager that sparked this investigation. We thank G. Brandt, A. Wissgott, F. Aron, C. Freund, R. Stahl, and I. Kucukkalipci (MPI-SHH) for support in laboratory work. We thank A. Mötsch and S. Eisenmann for support in organization and sample management. We thank M. Oreilly for graphic support. We thank M. Faerman, N. L. Shattah, P. Smith, S. Fox, R. Kalisher, and K. Marklein for the anthropological identification of skeletal remains used in this study. We thank the members of the population genetics group and the Max Planck–Harvard Research center for the Archaeoscience of the ancient Mediterranean (MHAAM) group in the Department of Archaeogenetics, MPI-SHH for their input and support. Funding: This work was supported by the Max Planck Society and the MHAAM. Excavations at Ashkelon were funded by grants from L. Levy, S. White, and the Leon Levy Foundation under license from the Israel Antiquity Authority and the Israel Nature and Parks Authority. Author contributions: J.K. and D.M.M. conceived the study. J.K. supervised the genetic work. D.M.M. and A.J.A. provided archeological material. D.M.M., P.W.S., and A.J.A. advised on the archeological background and interpretation. D.M.M. and A.J.A. wrote the archeological and sample background section. M.F., M.B., A.M., and R.A.B. performed the laboratory work. M.F. performed the data analyses, with C.J. providing guidance. M.F., D.M.M., C.J., and J.K. wrote the manuscript with input from all coauthors. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and the Supplementary Materials. The genomic alignment data (BAM format) is available through the European Nucleotide Archive (ENA) under the accession number (Study PRJEB31035), and the Eigenstrat format 1240 K pulldown genotype data for ancient individuals newly reported in this study are available at the Edmond data repository of the Max Planck Society ( The Ashkelon Iron Age skeletal material has been accessioned to the National Natural History Collections (NNHC) of the Hebrew University.

Stay Connected to Science Advances

Navigate This Article