Research Article: Anthropology

The genetic prehistory of the Andean highlands 7000 years BP through European contact

Science Advances  08 Nov 2018:
Vol. 4, no. 11, eaau4921
DOI: 10.1126/sciadv.aau4921


The peopling of the Andean highlands above 2500 m in elevation was a complex process that included cultural, biological, and genetic adaptations. Here, we present a time series of ancient whole genomes from the Andes of Peru, dating back to 7000 calendar years before the present (BP), and compare them to 42 new genome-wide genetic variation datasets from both highland and lowland populations. We infer three significant features: a split between low- and high-elevation populations that occurred between 9200 and 8200 BP; a population collapse after European contact that is significantly more severe in South American lowlanders than in highland populations; and evidence for positive selection at genetic loci related to starch digestion and plausibly pathogen resistance after European contact. We do not find selective sweep signals related to known components of the human hypoxia response, which may suggest more complex modes of genetic adaptation to high altitude.


The Andean highlands of South America have long been considered a natural laboratory for the study of genetic adaptation of humans (1), yet the genetics of Andean highland populations remain poorly understood. People likely entered the highlands shortly after their arrival on the continent (2, 3), and while some have argued that humans lived permanently in the Central Andean highlands by 12,000 years before the present (BP) (4), other research indicates that permanent occupation began between 9500 and 9000 BP (5, 6). Regardless of when the peopling of high-elevation environments began, selective pressure on the human genome was likely strong, not only because of challenging environmental factors but also because social processes such as intensification of subsistence resources and residential sedentism (7) promoted the development of agricultural economies, social inequality, and relatively high population densities across much of the highlands. European contact initiated an array of economic, social, and pathogenic changes (8). Although it is known that the peoples of the Andean highlands experienced population contraction after contact (8), its extent is debated from both archeological and ethnohistorical perspectives, especially concerning the size of the indigenous populations at initial contact (9).

To address hypotheses regarding the population history of Andean highlanders and their genetic adaptations, we collected a time series of ancient whole genomes from individuals in the Lake Titicaca region of Peru (fig. S1 and Table 1). The series represents three different cultural periods, which include individuals from (i) Rio Uncallane, a series of cave crevice tombs dating to ~1800 BP and used by fully sedentary agriculturists, (ii) Kaillachuro, a ~3800-year-old site marked by the transition from mobile foraging to agropastoralism and residential sedentism, and (iii) Soro Mik’aya Patjxa (SMP), an 8000- to 6500-year-old site inhabited by residentially mobile hunter-gatherers (10). We then compare the genomes of these ancient individuals to 25 new genomes and 39 new genome-wide single-nucleotide polymorphism (SNP) datasets generated from two modern indigenous populations: the Aymara of highland Bolivia and the Huilliche-Pehuenche of coastal lowland Chile. The Aymara are an agropastoral people who have occupied the Titicaca basin for at least 2000 years (11). The Huilliche-Pehuenche are traditionally hunter-gatherers from the southern coastal forest of Chile (12).

Table 1 Ancient sample sequencing results.

mtDNA, mitochondrial DNA; NA, not applicable.

To explore the population history of the Andean highlands, we first assess the genetic affinities of the prehistoric individuals and compare them to modern South Americans and other ancient Native Americans. Second, we construct a demographic model that estimates the timing of the lowland-highland population split as well as the population collapse following European contact. Last, we explore evidence for genetic changes driven by selective pressures associated with the permanent occupation of the highlands, the intensification of tuber usage, and the impact of European-borne diseases.


Samples and sequencing

We performed high-throughput shotgun sequencing of extracted DNA samples from seven individuals from three archeological periods with proportions of endogenous DNA ranging from 2.60 to 44.8%. The samples range in age from 6800 to 1400 calendar years BP (table S1). The samples from the Rio Uncallane site showed the highest endogenous content; five were chosen for frequency-based analyses and sequenced to an average read depth of 4.74× (Table 1). The oldest sample, SMP5, is from the SMP site. It was directly dated to ~6800 years BP, exhibited the highest endogenous DNA proportion among burials from the same site (3.15%), and was sequenced to an average read depth of 3.92×. The sample from the Kaillachuro site, at about 3800 years BP (culturally dated), was sequenced to an average read depth of 1.44×. All samples exhibited low contamination rates (1 to 4%) and DNA damage patterns consistent with degraded DNA (table S2 and fig. S2).

We also generated new data from two contemporary populations: 24 Aymara individuals from Ventilla, Bolivia, which is near Lake Titicaca (fig. S1), were sequenced to an average read depth of 4.36× (table S3), and an additional individual was sequenced to a read depth of 31.20× (see the Supplementary Materials). The modern Huilliche-Pehuenche (13) (n = 39) from the south of Chile (fig. S1) were genotyped on an Axiom LAT1 Array and imputed using the 1000 Genomes Project phase 3 (14) and the Aymara whole-genome sequence data as a reference panel. Analysis using the program ADMIXTURE (15) revealed that all the modern samples had trivial amounts of nonindigenous ancestry (less than 5%).

Genetic relationship between ancient and modern individuals

We used outgroup ƒ3 statistics to assess the shared genetic ancestry among the ancient individuals and 156 worldwide populations (16). C/T and G/A polymorphic sites were removed from the dataset to guard against the most common forms of postmortem DNA damage (17). Outgroup ƒ3 statistics of a worldwide dataset demonstrate that all seven ancient individuals from all three time periods display greater affinity with Native American groups than with other worldwide populations (Fig. 1). Ranked outgroup ƒ3 statistics suggest that the ancient individuals from all three time periods tend to share the greatest affinity with Andean living groups (Fig. 1A).
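The two operations described above, discarding C/T and G/A sites and computing an outgroup ƒ3 statistic, can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the function names and toy frequencies are hypothetical, the inputs are assumed to be per-SNP allele frequencies, and the estimator omits the finite-sample bias correction and jackknife standard errors that dedicated software applies.

```python
import numpy as np

# C/T and G/A are the allele pairs most affected by postmortem deamination.
DAMAGE_PRONE = {frozenset("CT"), frozenset("GA")}

def is_damage_prone(ref, alt):
    """Return True if the SNP's two alleles form a C/T or G/A pair."""
    return frozenset((ref.upper(), alt.upper())) in DAMAGE_PRONE

def outgroup_f3(p_out, p_a, p_b):
    """Naive outgroup f3(O; A, B): mean of (o - a) * (o - b) over SNPs.

    Larger values indicate more genetic drift shared by A and B
    since their divergence from the outgroup O.
    """
    p_out, p_a, p_b = (np.asarray(x, dtype=float) for x in (p_out, p_a, p_b))
    return float(np.mean((p_out - p_a) * (p_out - p_b)))

# Toy data (invented for illustration): alleles and frequencies at four SNPs.
alleles = [("C", "T"), ("A", "C"), ("G", "A"), ("G", "T")]
keep = np.array([not is_damage_prone(r, a) for r, a in alleles])

outgroup = np.array([0.5, 0.5, 0.5, 0.5])   # e.g., an African outgroup
ancient = np.array([0.9, 0.1, 0.2, 0.9])    # ancient individual(s)
modern = np.array([0.8, 0.1, 0.7, 0.9])     # one contemporary population

# Only the two transversion sites contribute to the statistic.
f3 = outgroup_f3(outgroup[keep], ancient[keep], modern[keep])
```

Repeating the last step for each contemporary population and sorting the results would give a ranking of the kind shown in Fig. 1.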

Fig. 1 ƒ3 statistics.

(A to C) Left: Heat map represents the outgroup ƒ3 statistics estimating the amount of shared genetic drift between the ancient Andean individuals and each of 156 contemporary populations since their divergence with the African Yoruban population. Right: Ranked ƒ3 statistics showing the greatest affinity of the ancient Andeans with respect to 45 indigenous populations of the Americas.

Principal components analysis reveals a tight clustering of the Rio Uncallane and Kaillachuro (K1) individuals, which overlaps with modern Andean populations (Fig. 2). The oldest individual, SMP5, falls closer to the intersection of the modern Andean groups of the Quechua and the Aymara. To further elucidate the relationship among the ancient Andean individuals and modern populations from South America, we examined maximum likelihood trees inferred with TreeMix (18). We observe that individuals from all three time periods form a sister group to the modern Andean high-altitude population, the Aymara (Bolivia; Fig. 3, A to C). The connection between the modern and ancient Andeans is also apparent in the ADMIXTURE-based cluster analysis (Fig. 3D). Both the Rio Uncallane and K1 trace their genetic ancestry to a single component (shown in red), which is also shared by the modern Andean populations of the Quechua and the Aymara (Fig. 3D). SMP5 not only traces most of its ancestry to the same component but also exhibits a component (shown in brown) found in Siberian populations, specifically the Yakut. USR1, a 12,000-year-old individual from Alaska hypothesized to be part of the ancestral population related to all South American populations (19), also shares this component, as do previously published ancient individuals from North America (i.e., Anzick-1, Saqqaq, and Kennewick).
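Projecting ancient individuals onto axes defined by modern populations, as in Fig. 2, can be illustrated with a short sketch. The text does not specify the software or settings used, so everything here is an assumption: genotypes are coded as 0/1/2 alternate-allele counts with no missing data, the data are random toy values, and the helper names are hypothetical. Real ancient DNA projections also have to handle missing genotypes, which this sketch ignores.

```python
import numpy as np

def fit_pca(reference, n_components=2):
    """Compute principal axes from reference (modern) genotypes only."""
    ref = np.asarray(reference, dtype=float)
    mean = ref.mean(axis=0)
    # Rows of vt are the principal axes in SNP space, ordered by variance.
    _, _, vt = np.linalg.svd(ref - mean, full_matrices=False)
    return mean, vt[:n_components].T

def project(samples, mean, axes):
    """Place samples on the reference axes without refitting the PCA."""
    return (np.asarray(samples, dtype=float) - mean) @ axes

# Toy genotype matrices (individuals x SNPs), invented for illustration.
rng = np.random.default_rng(0)
reference = rng.integers(0, 3, size=(20, 200))
ancient = rng.integers(0, 3, size=(3, 200))

mean, axes = fit_pca(reference)
ref_coords = project(reference, mean, axes)
ancient_coords = project(ancient, mean, axes)
```

Fitting the axes on the reference panel alone keeps low-coverage ancient samples from distorting the components on which they are later plotted.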

Fig. 2 Principal components analysis.

Principal components analysis projecting the ancient Andeans (IL2, IL3, IL4, IL5, IL7, K1, and SMP5), USR1 (Alaska) (19), Anzick-1 (Montana) (61), Saqqaq (Greenland) (62), and Kennewick Man (Washington) (63) onto a set of non-African populations from Raghavan et al. (16), with Native American populations masked for non-native ancestry. PC, principal component.

Fig. 3 Genetic affinity of ancient Andeans to global and regional indigenous populations.

(A to C) Maximum likelihood trees generated by TreeMix (18) using whole-genome sequencing data from the Simons Genome Diversity Project (46). (D) Cluster analysis generated by ADMIXTURE (15) for a set of indigenous populations from Siberia, the Americas, Anzick-1, Kennewick, Saqqaq, USR1, and the ancient Andeans. The number of displayed clusters is K = 8, which was found to have the best predictive accuracy given the lowest cross-validation index value. Native American populations masked for non-native ancestry. SE, standard error.

Demographic model

Given that the above analyses provide an inference of genetic affinity in the Andes extending back roughly 4000 years, and perhaps up to 7000 years, we explored models relevant to the timing of the first permanent settlements in the Andes, as well as the extent of the population collapse occurring after European contact in the 1500s (Fig. 4). We used a composite likelihood method [Fastsimcoal2 (20)], which was informed by the site frequency spectra (SFS) of multiple populations, including the Rio Uncallane (n = 5; high altitude, pre-European contact), the Aymara (n = 25; high altitude, post-European contact), and the Huilliche-Pehuenche (n = 39; lowland population, post-European contact). We infer a split between low- and high-altitude populations in the Andes occurring at 8750 years [95% confidence interval (CI), 8200 to 9250]. We also infer the population collapse in the Andes after European contact to be a 27% reduction in effective population size, occurring 425 years ago (95% CI, 400 to 450). The model also inferred the collapses experienced by the Mixe (included in the model for the split between Northern and Southern Hemisphere populations) and the Huilliche-Pehuenche after European contact, which were much more severe. We infer a 94% (95% CI, 0.94 to 0.96) collapse for the Mixe and a comparable 96% (95% CI, 0.95 to 0.96) collapse for the Huilliche-Pehuenche. The collapse difference between high- and low-altitude populations from the Andes region is significant [Huilliche-Pehuenche versus Aymara, P < 0.0001 (95% CI, 0.66 to 0.71), χ2 test].

Fig. 4 Demographic model of the Andes.

ILA, Rio Uncallane.

However, we were concerned with the heterogeneous nature of the data used in the model, since the Huilliche-Pehuenche data were imputed from an SNP array. Therefore, we calculated heterozygosity for each population using an overlapping set of ascertained SNPs from the SAN panel of the Affymetrix Human Origins Array. Using heterozygosity as a proxy, we found a pattern of effective population size changes similar to that inferred with the demographic model (Supplementary Materials and Methods and fig. S3).
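As a rough illustration of the heterozygosity check, the sketch below computes the average expected heterozygosity of a population from alternate-allele counts and expresses the difference between two populations as a relative reduction. It is an assumption-laden toy, not the procedure in the Supplementary Materials: the function names are hypothetical, every SNP is assumed to have the same number of sampled chromosomes, and the estimator is the standard sample-size-corrected 2p(1 − p).

```python
import numpy as np

def mean_heterozygosity(alt_counts, n_chromosomes):
    """Average unbiased expected heterozygosity over a set of SNPs.

    alt_counts: alternate-allele count at each SNP.
    n_chromosomes: chromosomes sampled per SNP (2 x diploid individuals).
    """
    counts = np.asarray(alt_counts, dtype=float)
    n = float(n_chromosomes)
    p = counts / n
    # The n / (n - 1) factor corrects the downward bias of small samples.
    h = 2.0 * p * (1.0 - p) * n / (n - 1.0)
    return float(h.mean())

def relative_reduction(before, after):
    """Fractional drop from 'before' to 'after' (0.25 means a 25% drop)."""
    return 1.0 - after / before

# Toy counts (invented): the same three SNPs in two populations of 5 diploids.
h_pre = mean_heterozygosity([5, 4, 6], n_chromosomes=10)
h_post = mean_heterozygosity([2, 1, 9], n_chromosomes=10)
drop = relative_reduction(h_pre, h_post)
```

Restricting both populations to the same ascertained SNPs, as the text describes, is what makes the two averages comparable despite the different data types.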

Adaptive allele frequency divergence

Next, we turn to detecting the impact of natural selection. Permanent settlement of high-altitude locales in the Andes likely required adaptations to the stress of hypobaric hypoxia, such as those detected in Tibetan, Nepali, and Ethiopian highland populations (21–24). Several genetic studies have also been performed on modern populations of the Andes (25–28). The availability of genetic variation data from an ancient Andean population, i.e., the Rio Uncallane samples, offers new opportunities for investigating adaptations in this region of the Americas. More specifically, the ancient data allow us to test for adaptations in a population that was not exposed to the confounding effects associated with European contact, i.e., the demographic bottleneck and the exposure to additional strong selective pressures due to the introduction of new pathogens (29). Therefore, we test for high allele frequency divergence resulting from adaptations to the Andean high-altitude environment, using a whole-genome approach with the five Rio Uncallane individuals (~1800 BP), and we contrast them with a lowland population, the Huilliche-Pehuenche from Chile, who likely inhabited the region for several thousand years before the arrival of the Spanish (30). We used the Population Branch Statistic (PBS) (24) to scan for alleles that reached high frequency in the Rio Uncallane relative to the Huilliche-Pehuenche. The Han Chinese from the 1000 Genomes Project (14) were used as the third comparative population. The SNPs showing the most extreme PBS score and, therefore, the highest differentiation in the ancient Andeans correspond to a region near MGAM (Fig. 5A and table S4). MGAM encodes an intestinal enzyme associated with starch digestion (31) and could possibly be linked to the intensification of tuber use and ultimately the transition to agriculture (7, 32). The second strongest signal came from the gene DST, which encodes a cytoskeleton linker protein active in neural and muscle cells (33). DST has been associated with differential expression under hypoxic conditions (34) and cardiovascular health (35).
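The PBS scan described above can be sketched per SNP as follows. The PBS formula itself is the standard one: pairwise FST values are converted to branch lengths T = −log(1 − FST), and the focal branch is (T_focal,sister + T_focal,outgroup − T_sister,outgroup) / 2. The rest is assumed for illustration: the FST estimator is a simple ratio without sample-size correction, the function names are hypothetical, and the frequencies are invented. The text does not state which FST estimator or windowing scheme was used.

```python
import numpy as np

def fst(p1, p2):
    """Per-SNP FST from two allele frequencies (no sample-size correction)."""
    p1 = np.atleast_1d(np.asarray(p1, dtype=float))
    p2 = np.atleast_1d(np.asarray(p2, dtype=float))
    num = (p1 - p2) ** 2
    den = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    out = np.zeros_like(num)
    # Sites fixed for the same allele in both populations get FST = 0.
    np.divide(num, den, out=out, where=den > 0)
    return out

def branch_length(f, cap=0.999999):
    """T = -log(1 - FST); FST is capped so fixed differences stay finite."""
    return -np.log1p(-np.clip(f, 0.0, cap))

def pbs(p_focal, p_sister, p_outgroup):
    """Population Branch Statistic for the focal population at each SNP."""
    t_fs = branch_length(fst(p_focal, p_sister))
    t_fo = branch_length(fst(p_focal, p_outgroup))
    t_so = branch_length(fst(p_sister, p_outgroup))
    return (t_fs + t_fo - t_so) / 2.0

# Toy frequencies (invented): SNP 0 is differentiated only in the focal group.
focal = [0.9, 0.5]       # e.g., ancient highlanders
sister = [0.1, 0.5]      # e.g., lowlanders
outgroup = [0.1, 0.5]    # e.g., an East Asian reference
scores = pbs(focal, sister, outgroup)
```

In this toy example only the first SNP receives a positive score, because the change in frequency is confined to the focal branch; ranking such scores genome-wide is what identifies the most extreme candidates.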

Fig. 5 Selection scans.

(A) PBS scans testing for differentiation along the Ancient Andean branch, with respect to the lowland population of the Huilliche-Pehuenche. (B) PBS scan testing for differentiation along the modern Aymara branch (post-European contact), with respect to the precontact ancient population of the Rio Uncallane. Selected global SNP frequencies (Frq) (64) and histone modifications (38) are shown on the right, which are relevant to the strongest selection signals. “*” indicates closest gene. GI, gastrointestinal tract; NK, natural killer.

The second test for selection targeted hypotheses concerning the adaptation to European-introduced pathogens (Fig. 5B). The PBS was conducted with the Aymara (post-European contact), the Rio Uncallane (pre-European contact), and the Han Chinese (14). The strongest selection signal comes from SNPs in the vicinity of CD83, which encodes an immunoglobulin receptor and is involved in a variety of immune pathways, including T cell receptor signaling (36). The gene has also been shown to be up-regulated in response to the vaccinia virus infection (37), the virus used for smallpox vaccination. In addition, the differentiated SNPs associated with the CD83 signal show a variety of chromatin alterations in immune cells, including monocytes, T cells, and natural killer cells (38), suggesting that they may be functional. Last, a gene in the top 0.01% of the scan, IL-36R, codes for an interleukin receptor that has been associated with the skin inflammatory pathway and vaccinia infection (39).


South America is thought to have been populated relatively soon after the first human entry into the Americas, some 15,000 years ago (3). However, steep environmental gradients in western South America would have posed substantial challenges to population expansion. Among the harshest of these environments are the Andean highlands, which boast frigid temperatures, low partial pressure of oxygen, and intense ultraviolet radiation. Despite this, humans eventually spread throughout the Andes and occupied them permanently. Archeological evidence suggests that hunter-gatherers entered the highlands as early as 12,000 years BP (4), with permanent occupation beginning around 9000 years BP (5–7). The evidence presented here indicates genetic affinity between populations from different time periods in the high-elevation Lake Titicaca region from at least 3800 years BP, and possibly 7000 years BP. This affinity extends to the present high-altitude Andean communities of the Aymara and Quechua.

Although our samples do not extend beyond 7000 years BP, we were able to model the initial entry into the Andes after the split between North and South American groups. Our model, using a mutation rate of 1.25 × 10⁻⁸ per base per generation, places the split between North and South American groups at nearly 14,750 years ago (95% CI, 14,225 to 15,775), which is consistent with archeological evidence from Monte Verde in southern Chile, the oldest known site in South America (~14,000 years BP) (3). The split between low- and high-altitude populations was inferred to have occurred 8750 years ago (95% CI, 8200 to 9250), which is younger than previously reported by a study using modern genomes alone (40). This date provides a terminus ante quem time frame for the origins of adaptations known in modern highland populations.

We also present evidence for genes that may have been under selective pressure caused by environmental stressors in the Andes. None of our most extreme signals for positive selection were related to the hypoxia pathway. Instead, we find differentiated SNPs in the DST gene, which has been linked to the proper formation of cardiac muscle in mice (35, 41). Furthermore, the DST intronic SNP that was most differentiated (rs149112613) shows histone modifications associated with blood and the right ventricle of the heart (38). This is consistent with the observation that Andean highlanders tend to have enlarged right ventricles associated with moderate pulmonary hypertension (1). This finding also parallels the hypothesis proposed by Crawford et al. (42) that Andeans may have adapted to high-altitude hypoxia via cardiovascular modifications.

The most extreme signal may represent adaptations to an agricultural subsistence and diet. The top-ranked gene, MGAM, is associated with starch digestion (43). The associated high-frequency SNPs in the ancient Andean population (table S4) exhibit chromatin marks in cells from the gastrointestinal tract (Fig. 5A). The variant may be highly differentiated between the ancient Andeans and the lowlanders (the Huilliche-Pehuenche) because of differences in subsistence strategies. The Huilliche-Pehuenche individuals are traditionally hunter-gatherers, with archeological evidence suggesting that their ancestors practiced this mode of subsistence in the region for thousands of years before European contact in the 1500s (44). In contrast, the Andes is one of the oldest New World centers for agriculture, which included starch-rich plants such as maize (~4000 years BP) (45) and the potato (~3400 years BP) (7). Selection acting on the MGAM gene in the ancient Andeans may represent an adaptive response to greater reliance upon starchy domesticates. Recent archeological findings based on dental wear patterns and microbotanical remains similarly suggest that intensive tuber processing and thus selective pressures for enhanced starch digestion began at least 7000 years ago (7, 32). Furthermore, we see a similar signal (top 0.01%) when we contrast the hunter-gatherers from Brazil [Karitiana/Surui, sequence data (46)] with the ancient Andeans, as well as when we contrast the Aymara with the Huilliche-Pehuenche and the Karitiana/Surui. Finally, we did not detect a high amylase copy number in the ancient Andean population before European contact, suggesting a different evolutionary path for starch digestion in the Andes when contrasted with Europeans (47).

Selection with respect to the environment in the Andes is not limited to the ancient past. In 1532, the environment radically changed with the arrival of the Spanish (29). Not only were long-standing states and social organizations disrupted, but also the environment itself was altered with the arrival of European-introduced pathogens, which may have preceded the arrival of the Spanish via trade routes (29). Some of the most devastating epidemics were related to smallpox, occurring in the 1500s and 1600s (8). These combined factors are thought to have decimated the local populations (8). We inferred the population decline in the Andes, using the ancient and modern Andeans, and found the decline in effective population size to be 27% (95% CI, 0.23 to 0.34). We also simulated DNA sequence data immediately before the collapse between the Rio Uncallane and the Aymara using a truncated model and found a reduction in average heterozygosity of 23% (see the Supplementary Materials). This is a modest decline compared to archeological and historical estimates, which reached upward of 90% of the total population (48). In contrast, the model infers a much more severe collapse for lowland Andean populations of the Huilliche-Pehuenche, exceeding 90%. We also explored alternative scenarios to make sure that the model was not biasing the inferred collapse and found that the estimated population size reduction remained significantly less severe in the highland compared to the lowland populations (see the Supplementary Materials). Although we did not have precontact ancient samples for these populations in Chile to inform the model, the large difference suggests that high-altitude populations may have suffered a less intense decline compared to the more easily accessible populations near sea level in the Andes region. This is also supported by long-lasting warfare in the Chilean lowland region with the Spanish that lasted well into the 19th century (29).

Our data also show that the populations in the modern Andes have high genetic affinity with the ancient populations preceding European contact. Although a strict continuity test (49) that does not allow for recent gene flow was not significant (see the Supplementary Materials), modern Andeans are likely the descendants of the people that suffered the epidemics described in historical texts. In the Andes, missionary reports suggest that disease may have arrived before formal Spanish contact in 1532 (50) and that the first epidemics were likely caused by smallpox (29). We infer that selection acted within the past 500 years on the immune response, making it likely that modern Andeans descend from the survivors of these epidemics. The selection scan along the branch of the modern Andeans, contrasted with the ancient group, revealed the strongest signal to be associated with an immune gene connected to smallpox, CD83 (37). The second most highly differentiated SNPs were in the vicinity of RPS29, which codes for a ribosomal protein and is involved in viral mRNA translation and metabolism, including those of influenza (51). Another top gene, IL-36R (rank #36), is thought to have evolved alternative cytokine signaling to compensate for viruses, such as pox viruses, that can evade the immune system (52). Furthermore, the top SNP associated with IL-36R (rs1117797) exhibits a QTL (quantitative trait locus) signal associated with IL18RAP, a gene involved in mediating the immune response to the vaccinia virus (53). The relative strength of these signals and the role played by the associated genes may indicate that selection favored alleles that directly affected the pathogenicity of the diseases encountered by the ancestors of the epidemic survivors.

In conclusion, human adaptation to the Andean highlands involved a variety of factors and was complicated by the arrival of Europeans and the marked changes that followed. Despite harsh environmental factors, the Andes were populated relatively early after entry into the continent. The adaptive traits necessary for permanent occupation may have been selected for in a relatively short amount of time, on the order of a few thousand years. Given the multifaceted nature of the adaptation, we are not surprised to find genetic affinity in the populations of the Andes dating to at least 4000 years BP, and possibly extending to 7000 years BP.


Community engagement

Although the communities in this study are not structured around tribal leadership, as in North America, all samples were collected by indigenous representatives and/or local researchers. For the Aymara population, the samples were collected by E. Vargas Pacheco and M. Villena, two representatives of the Aymara community. The Huilliche-Pehuenche samples correspond to a historical collection of modern DNA samples at the Program of Human Genetics of the University of Chile. All of the participants were healthy individuals, over 18 years of age, and gave their informed consent. The study was approved by the Ethics Committee of the Faculty of Medicine, University of Chile. M.Ap. helped analyze the data while visiting at the University of Chicago.

DNA extraction

We extracted DNA in a dedicated ancient DNA laboratory at the Laboratories of Molecular Anthropology and Microbiome Research at the University of Oklahoma, using standard methods and stringent guidelines for work with ancient human remains. The protocol followed Dabney et al. (54) with the following modifications. After decontaminating each tooth with bleach and removing the outer layer of cementum, the whole tooth was crushed. The resulting powder was predigested in 1 ml of 0.5 M EDTA for 15 min at room temperature (RT) to remove loose contaminants. The supernatant was then removed, and the sample was digested in an additional 1 ml of 0.5 M EDTA at RT overnight. One hundred microliters of QIAGEN proteinase K was added to the samples the following day and incubated for 8 hours at 37°C. The supernatant was then removed, and an additional 0.5 m of EDTA was added along with 50 μl of proteinase K for 3 to 5 days, until the powder was no longer mineralized. A silica column–based extraction protocol was then followed as described in (54). DNA was extracted from molars belonging to individuals IL2, IL3, IL4, IL5, IL7, K1, and SMP5.

Library construction

Libraries were constructed from 30 μl of DNA extract as detailed in (55), with the following modifications. Blunt-end repair was performed with the NEBNext Quick DNA Library Prep Master Mix Set for 454 following the manufacturer’s protocol and deactivating the Bst enzyme at 80°C for 20 min. The libraries were amplified with the KAPA HiFi U+ kit, using 10 μl of DNA, 25 μl of KAPA ReadyMix (2×), 11 μl of H2O, 1 μl of bovine serum albumin (2.5 mg/ml), and 1.5 μl of the IS5 and IS7 primers (each). The thermocycler program was as follows: initial denaturation at 95°C for 5 min, (98°C for 20 s, 60°C for 15 s, and 72°C for 30 s) for X cycles, final extension at 72°C for 5 min, and hold at 4°C. The number of cycles was determined via quantitative polymerase chain reaction using 1 μl of the preamplified library to determine the amplification curve.

Whole-genome capture

Given the low endogenous DNA of K1 and SMP5, we performed a whole-genome capture using the myBaits whole-genome enrichment kit enhanced with protocol modifications recommended for the study of ancient DNA (56). The two enriched libraries were then pooled and sequenced (PE125) on one lane of an Illumina HiSeq 4000, located at the University of Chicago Genomics Facility.

Sample sequencing and SNP array

Twenty-four Aymara individuals underwent shotgun sequencing, multiplexing six individuals on eight lanes, on the Illumina HiSeq 4000, PE125, located at the University of Chicago Genomics Facility. The average coverage for the 24 individuals, after bioinformatics processing, was 4.36× (table S3). One additional Aymara individual was sequenced to high coverage, 30×, using the HiSeq X (PE150) at Macrogen Labs. Thirty-nine Huilliche-Pehuenche individuals were genotyped via the Axiom LAT1 Chip (Affymetrix, Santa Clara, CA), and 22 of these were previously published (13). The ancient individuals were sequenced with the HiSeq X (PE150), one lane each, at Macrogen Labs. For this round of sequencing, SMP5 and K1 were not enriched.

Test for strict genetic continuity

Given that the above analyses provide an inference of temporal genetic affinity in the Andes from roughly 4000 years BP, and perhaps extending to 7000 years, we tested for strict continuity. Here, continuity is defined as a genetic affinity between two temporal populations without external gene flow. We performed the test for continuity using the method described in Schraiber (49). We find that when the Rio Uncallane individuals are tested against the 25 modern Aymara, the test fails to show strict continuity (230,649 SNPs: t1 = 0.00443, t2 = 0.04189; log LRT P = −∞). However, this test might be inappropriate for the region, since it assumes no outside gene flow between populations. Several civilizations rose and fell in the region between the two time periods represented by the ancient and modern populations. These civilizations, such as the Wari and Incas, built elaborate trade networks throughout the Andes, and it is likely that the ancestral populations leading to the Aymara experienced regional admixture before European contact.

Demographic history model

Parameters for the demographic model (Fig. 4) were inferred with FastSimCoal2 v. (20). For the inferred values, 100 optimizations were run, taking the best-likelihood parameters from each. The inferences used a per-base per-generation mutation rate of 1.25 × 10⁻⁸ and joint derived SFS for the Mixe (46), Illave, Aymara, and Huilliche-Pehuenche, which were estimated using ANGSD (analysis of next-generation sequencing data) (57). This SFS contained monomorphic and polymorphic sites based on hg19 that were limited to regions thought to be neutrally evolving, which were identified using the NRE (neutral region explorer) tool (58). A parametric bootstrapping approach was used to construct the 95% CIs. Specific parameters are listed in the Supplementary Materials.
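To make the input to the demographic inference more concrete, the sketch below builds a joint derived SFS for two populations from hard-called derived-allele counts. This is only an illustration under stated assumptions: the function name and toy counts are hypothetical, and counting called genotypes is a simplification, since ANGSD estimates the spectrum from genotype likelihoods, which matters for low-coverage ancient data.

```python
import numpy as np

def joint_sfs(derived_a, n_a, derived_b, n_b):
    """Two-population joint derived site frequency spectrum.

    derived_a, derived_b: derived-allele counts at each site.
    n_a, n_b: chromosomes sampled in each population.
    Entry [i, j] is the number of sites with i derived copies in A
    and j derived copies in B; [0, 0] holds monomorphic ancestral sites.
    """
    a = np.asarray(derived_a, dtype=int)
    b = np.asarray(derived_b, dtype=int)
    sfs = np.zeros((n_a + 1, n_b + 1), dtype=int)
    # np.add.at accumulates correctly when the same cell occurs repeatedly.
    np.add.at(sfs, (a, b), 1)
    return sfs

# Toy counts (invented): four sites, 2 chromosomes in A and 3 in B.
sfs = joint_sfs([0, 1, 1, 2], 2, [0, 0, 0, 3], 3)
marginal_a = sfs.sum(axis=1)  # one-dimensional SFS of population A
```

Keeping the monomorphic bin, as the text notes, lets the mutation rate set the absolute time scale of the inferred events.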

Copy number variation analysis

We performed a copy number variation analysis using CNVcaller (59) on the Rio Uncallane ancient population (an agriculture-based culture, pre-European contact), using USR1 as an outgroup (an 11,500-year-old ancient individual from Alaska found to be ancestral to Native American populations). We found no evidence of AMY1 high copy number in the Rio Uncallane, which may suggest a different evolutionary path for starch digestion in the Andes when compared to Europeans. Furthermore, we only found AMY1C to have two copies, while the other AMY1 genes showed zero copies. Previous studies, focusing on European populations, found a range of 2 to 18 copies, with an average of six copies per person [see (60) and references therein].
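The intuition behind read-depth copy number estimates can be shown with a small sketch: a region present in more copies attracts proportionally more reads than the genome-wide average. This is not CNVcaller's algorithm, which applies its own normalization steps that are not reproduced here. The function name and the region depth are hypothetical; the genome-wide depth of 4.74× is the Rio Uncallane average reported in the text.

```python
def estimate_copy_number(region_depth, genome_depth, ploidy=2):
    """Approximate copies of a region from relative read depth.

    region_depth: mean read depth across the region of interest.
    genome_depth: mean read depth across the genome.
    ploidy: copies expected for an unduplicated diploid region.
    """
    if genome_depth <= 0:
        raise ValueError("genome_depth must be positive")
    return ploidy * region_depth / genome_depth

# A region at three times the average depth implies about six copies.
copies = estimate_copy_number(region_depth=14.22, genome_depth=4.74)
```

At read depths this low, single-sample estimates are noisy, which is one reason a population-level comparison against an outgroup is useful.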


Supplementary material for this article is available at

Supplementary Materials and Methods

Fig. S1. Location of ancient and modern samples.

Fig. S2. DNA damage patterns for the ancient individuals.

Fig. S3. Heterozygosity rate calculated with overlapping ascertained SNPs from the SAN panel of the Affymetrix Human Origins Array.

Fig. S4. Maximum likelihood trees with m = 1 migration events.

Fig. S5. Cluster analysis generated by ADMIXTURE.

Table S1. Radiocarbon determinations on five individuals from the Ilave region.

Table S2. Contamination estimates.

Table S3. Modern Aymara sequencing results.

Table S4. Positive selection candidates associated with the Andean highlands and pathogen response.

Table S5. Selection scan top signals.

Table S6. Demographical model parameters.

References (69–88)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We would like to thank J. Blangero and S. Blangero for contribution with the Aymara samples. We would like to thank C. Jeong and S. Nakagome for helpful discussions throughout the project. We would like to thank M. DeGiorgio for helping with the ƒ3 statistic analysis. Field support for R.H. was provided by the Collasuyo Archaeological Research Institute, C. Justo Chavez, V. Incacoña Huaraya, M. Incacoña Huaraya, N. Condori Flores, A. Pilco Quispe, D. Pilco Incacoña, D. Pilco Incacoña, K. Pilco Incacoña, L. M. Pilco Incacoña, L. Hayes, and the community of Mulla Fasiri, Peru. Archeological data recovery at SMP and international export of artifacts, including those from Jiskairumoko, Kaillachuro, and the Rio Uncallane sites, were carried out under Peruvian Ministry of Culture permit nos. 064-2013-DGPA-VMPCIC/MC and 138-2015-VMPCIC/MC. Multiple permits were issued by the Peruvian Instituto Nacional de Cultura for research at Jiskairumoko, Kaillachuro, and the Rio Uncallane sites from 1994 to 2002. See Moreno-Mayar et al. (67) and Posth et al. (68) for related analyses of ancient DNA samples from the Americas. Funding: This work was supported in part by NIH grant R01HL119577 and the National Science Foundation grants BCS-1528698 (awarded to M.Al., C.W., and A.D.R.) and BSC-9221724 (awarded to C.B.). J.L. was funded by a University of Chicago Provost’s Postdoctoral Scholarship. Support for archeological excavation and artifact analysis was provided to R.H. by the National Science Foundation (BCS-1311626), the American Philosophical Society, and the University of Arizona. Survey and data recovery at the Rio Uncallane sites was supported by grants to M.Al. from the National Geographic Society (5245-94) and the H. John Heinz III Charitable Trust. Excavations at Jiskairumoko and Kaillachuro were supported by grants to M.Al. from the National Science Foundation (SBR-9816313 and SBR-9978006).
The Huilliche-Pehuenche datasets were funded by the Chilean National Council of Science and Technology (CONICYT) grants USA2013-0015 and FONDEF D10I1007. Author contributions: M.Al., R.H., C.B., J.T.W., and C.V.L. provided samples for the study. J.L., A.D.R., C.W., and C.H. contributed to the experimental design. M.Ap., J.L., A.D.R., J.N., and D.W. analyzed data. M.M. and R.A.V. generated the Huilliche-Pehuenche datasets. J.L., A.D.R., and M.Al. wrote the initial draft of the manuscript. C.W. and R.H. contributed to the writing of the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors. Ancient and modern DNA sequences are available from NCBI Sequence Read Archive, accession no. PRJNA470966. The Huilliche-Pehuenche SNP data will be available via a data access agreement with R.A.V. at the Universidad de Chile.
