Research ArticleECOLOGY

Late Quaternary horses in Eurasia in the face of climate and vegetation change

See allHide authors and affiliations

Science Advances  25 Jul 2018:
Vol. 4, no. 7, eaar5589
DOI: 10.1126/sciadv.aar5589


Wild horses thrived across Eurasia until the Last Glacial Maximum to collapse after the beginning of the Holocene. The interplay of climate change, species adaptability to different environments, and human domestication in horse history is still lacking coherent continental-scale analysis integrating different lines of evidence. We assembled temporal and geographical information on 3070 horse occurrences across Eurasia, frequency data for 1120 archeological layers in Europe, and matched them to paleoclimatic and paleoenvironmental simulations for the Late Quaternary. Climate controlled the distribution of horses, and they inhabited regions in Europe and Asia with different climates and ecosystem productivity, suggesting plasticity to populate different environments. Their decline in Europe during the Holocene appears associated with an increasing loss and fragmentation of open habitats. Europe was the most likely source for the spread of horses toward more temperate regions, and we propose both Iberia and central Asia as potential centers of domestication.


The distribution of the wild horse, Equus ferus, drastically changed during the past 50 thousand years (ka) (1 ka = 1000 years). The species was extirpated from its native American range around 12 to 13 ka before the present (B.P.) (1). In Eurasia, it survived as fragmented populations in small environmental pockets after the climatic instability and subsequent warming that characterizes the Pleistocene/Holocene transition [11.7 ka B.P. (2)]. The paleontological record shows that horse populations declined in large numbers after the Last Glacial Maximum (LGM; 27 to 18 ka B.P.), and remarkably limited osseous material has been found during the early Holocene (3, 4). It was only several millennia later, during the third millennium BCE (Before the Common era), that the species expanded again, after their early domestication in the Pontic-Caspian region and central Asian steppes during the Eneolithic (57).

Together with a subsistence economy almost exclusively oriented toward horse meat, evidence for milking and harnessing, and the possible presence of corral enclosures support the Botai Eneolithic culture as the first to have domesticated horses some ~5.5 ka ago (6, 7). Earlier dates, especially based on the presence of horse figurines in the nine burials of S’yezzhe, Russia (5), have been proposed together with alternative domestication centers, for example, in eastern Anatolia (8, 9) and Iberia (10, 11).

An interdisciplinary study (12) previously analyzed the parallel changes observed in the horse demography [as reconstructed from Bayesian skyline plots on mitochondrial DNA (mtDNA) data (13)] and their climatic envelope [as modeled from species distribution modeling (SDM) (14)]. It suggested climate change as a possible driver of the horse population dynamics. More specifically, while these Bayesian skyline plots exploited patterns of genetic variation within short ancient DNA sequences recovered from 128 fossil specimens spread across Eurasia, SDM leveraged its analysis on the availability of multiple paleoclimatic data throughout Eurasia during four different time periods (42, 30, 21, and 6 ka B.P.) to capture the climatic envelope of the species. The potential range of the species was then predicted to map all geographic regions and time periods where similar climatic conditions prevailed. Further Bayesian skyline reconstructions based on additional ancient mtDNA data (15, 16) and pairwise sequential Markov coalescent (17) modeling of complete genomes confirmed a demographic trajectory marked by a first expansion peaking before the LGM, followed by a population collapse during the LGM, in accordance with an increase and subsequent decrease in the climatically suitable area modeled based on SDMs.

The time windows considered in the abovementioned study, in particular, the large temporal gaps consisting of 8 to 15 ka, represent one of its major limitations. In addition, no detailed analysis of horse habitats was carried out, especially the different biomes occupied and their dynamics through time. Finally, the number of paleontological finds associated with unambiguous radiocarbon dates was relatively limited, which precluded independent modeling of climatic niches in Europe and Asia. Horses from both continents, however, may have shown different, specific biological adaptations.

More recently, SDM analyses have been used to characterize the Holocene distribution of horses in Europe concerning vegetational changes (18), on the basis of 89 radiocarbon dates from 11.5 to 5.5 ka ago, mostly located in northern and central Europe. Paleovegetation reconstructions were carried out on a coarse scale (100 × 100 km2) using pollen records. These analyses suggest an intensification of horse occurrences in closed (forested) environments during the period, which is strikingly paralleled by an increase in frequency of genetic variants coding for darker coat color, potentially representing an adaptive trait to forest environments. This model, however, has been recently challenged on the basis of archeological information (19).

Here, we compiled an extensive continental-scale database, consisting of 3070 radiocarbon dates associated to horse paleontological and archeological finds across the whole of Eurasia, that has been analyzed in association with coarse-scale paleoclimatic reconstructions. We further collected the number of identified specimens (NISP) frequency data for horses versus other ungulates in 1120 archeological layers in Europe (tables S1 and S2, respectively). Those two lines of evidence are complementary because the spatial resolution of faunistic assemblages is at a much higher resolution than the paleoclimatic and vegetational reconstructions and make it possible to disentangle micro- and macroenvironmental changes.

This massive amount of data allowed us to track, with unprecedented details, how the geographic distribution of the species changed through time. We characterized the past climatic envelope of the species, and of those horse fossil remains in Europe and central-north Asia, by aligning their spatiotemporal distributions to four paleoclimatic variables simulated by an atmospheric coupled ocean general circulation model (AOGCM) HadCM3 (Hadley Centre coupled model, version 3) (20). This allowed us to run species distribution models to understand how the potential range of the horse responded in the face of major climate change. We then used the reconstructions of climatic variables in the past to infer the vegetational coverage at a macroclimatic scale.

Our results show that climate determined geographical range dynamics, including the reconfiguration and fragmentation of horses’ distribution during the Holocene in Europe as a consequence of climate-driven vegetation changes. Moreover, Asiatic and European populations were adapted to different environmental conditions, as predicted by theories for species with wide ranges. As the European horse appears to have developed a wider climatic niche than the Asian horse, we speculate that the former, and not the latter, represents the most likely source for the spread of horses across all temperate areas, as previously suggested (21). Finally, Iberia arises as a highly likely center for domestication of horses in Europe.


Horses experienced significant changes in their geographic range through time

For most analyses, the data have been divided into climatic periods: pre-LGM (older than 27 ka B.P.), LGM (27 to 18 ka B.P.), Late Glacial (18 to 11.7 ka B.P.), Preboreal (11.7 to 10.6 ka B.P.), Boreal (10.6 to 9.1 ka B.P.), Early Atlantic (9.1 to 7.5 ka B.P.), Late Atlantic (7.5 to 5.5 ka B.P.), and Recent (younger than 5.5 ka B.P.) (Fig. 1, A and B). The spatial and temporal distribution of horse remains compiled in our database reveals a strong imbalance in Eurasia (Fig. 1, A and B). This may be related to uneven research efforts and fossil preservation differences across the whole landscape, or the more productive conditions in Europe may have favored higher densities of horse populations. To minimize sampling biases, we restricted our database to a single observation per geographic grid cell and time bin considered, resulting in 321 entries for Asia and 1019 entries for Europe, and tested the effect of data availability in model outputs (see Materials and Methods and Supplementary Materials).

Fig. 1 Horse occurrences through time.

(A) Horse occurrences through time. Histograms showing the number of horse observations in Europe (left panel) and Asia (right panel) for each time bin (top) and for climatic period (bottom). Only time bins with more than 10 observations (black horizontal line) have been considered for the SDM analyses. From 22 ka B.P. backward (gray vertical line), time bins cover 2 ka following the available paleoclimatic reconstructions. The central map shows the boundaries considered while defining European and Asian regions, with the black line representing the Urals. The zoomed area shows the geographical resolution of the climatic reconstructions, with each pixel representing a grid cell. (B) Geographic distribution of horse occurrences. Maps showing horse occurrences for each climatic period in Europe (left) and Asia (right).

We found a common trend in both regions for a high number of occurrences at the end of the Pleistocene (with a decrease during the LGM, only visible in Europe), followed by a drastic reduction in the Early and Middle Holocene, and a relative increase toward more recent times. These included both the Early Atlantic in Europe, which started ~9.1 ka B.P., and the time range after 5.5 ka B.P. for Asia.

The horse fossil record appears ubiquitous throughout Europe in the Late Pleistocene, while in the Early and Middle Holocene the finds are concentrated in central western Europe and Iberia. From 7.5 ka B.P., the number of finds increases markedly, and the geographical distribution extends toward the east and southeast (Fig. 1). The majority of Asian finds are located in northern Siberia and between southern Russia and Mongolia, and the geographic distribution of fossils collected appears steady in all periods examined.

Asian and European horses had different climatic niches

To explore the climatic niche of Eurasian horses, we first performed a principal components analysis (PCA) (Fig. 2A) on the climatic conditions sampled in randomly selected European and Asian locations (lighter points) with those found in areas occupied by horses (darker points). This analysis revealed that, in both continents, horses occupied only a portion of the climatic space available. The range covered by random locations shows that the paleoecological conditions present in Europe were only a subset of those found in Asia. However, European horses occupied a much wider climatic space than in Asia, with only limited overlap between the two ranges. This pattern holds true for the first two components, which reflect the combination of net primary productivity (NPP) and minimum/maximum temperatures, on the one hand, and total annual precipitation, on the other hand. In addition, and using a quantitative approach to estimate niche overlap among populations (22), we found that overlap of the environmental niches realized by both subpopulations is low [0.106; Fig. 2B; D ranging from 0 (indicating no overlap) to 1 (indicating full niche overlap)]. Overlap remains limited after a random sampling of occurrence data in the Europe data set for each time interval in numbers equals those available in Asia (Fig. 2C). This indicates that the observed pattern is robust to geographical sampling bias and is thus not driven by the unbalanced sampling present in the database between both continents. Analysis of NPP values in the areas occupied by horses in the two continents (Fig. 2D) also confirms this separation, as in Europe horses appear to have inhabited areas with higher productivity than in Asia [P < 2 × 10−16, analysis of variance (ANOVA)].

Fig. 2 Climatic niche of European and Asian horses.

(A) PCA of the estimated values of four environmental variables: Annual minimum temperature (tmin), mean temperature in the warmest month (tmax), total yearly precipitation (totprec), and mean annual NPP. The lighter points represent locations randomly sampled from the two continents (100 per time bin), while the darker points show locations where horses have been found. (B) Niche representation of the horse populations in Europe and Asia, in a PCA environmental space. The red and dark blue surfaces represent the European and Asian realized environmental niche of the horse, respectively. The overlap among the two, in purple, represents the common conditions occupied by the two populations. The arrow shows the centroids of the two realized niches. The continuous lines show the total environmental space of Europe (in red) and Asia (in dark blue) across all time intervals. Axes represent the two first principal components and the respective explained variance of the total environmental space. D shows the overlap of the niche of each horse subpopulation. (C) Histogram of the niche overlap values of the resampling analysis. Bars represent counts of each niche overlap value in 100 repetitions. The x axis shows the realized environmental niche overlap of the European-Asian horse subpopulations, while the dotted line represents the original estimate obtained using all European occurrences. (D) NPP in locations where horses have been found in Europe (highest series) and Asia (lowest series).

Horses likely conquered temperate environments from a European source

We next used our characterization of the horse climatic envelope based on three subsets of the data (European, Asian, or pan-Eurasian occurrences) to reconstruct how the climatic suitability of horses changed over the past 44 ka. For each grid cell of the region analyzed, we recovered a value p-Hor, summarizing the climatic suitability for horses (ranging from 0 to 1). The ability of our models to predict the occurrences of horses given climate is high across different data sets and algorithms (figs. S6 to S8), proving a key role of climate on horse biogeography.

Predictions based on the whole data set (“Eurasia” in Fig. 3A) show a predominance of horses in Europe and southern Siberia. Spatial projections based on presence data in Europe alone and Asia alone show that almost all Eurasia offered a suitable climate for horses, except most of Scandinavia, the Alps, the Pyrenees, the Gobi area, and, to a lesser extent, Iran and Afghanistan (Fig. 3A). It is noteworthy that projections based on Asian data indicate that suitable climatic conditions were distributed throughout the whole Holarctic region and up to Central Europe, while those based on European data alone fill all temperate and warmer areas, including southern Asia and Africa (Fig. 3, A and B). There is no evidence of climatic barriers between those two populations through time because the forecasts from Europe and Asia always overlap in central Eurasia, except 5 ka B.P. (figs. S3 and S4). An alternative explanation is the role of the Urals as a potential constraint for the dispersal of horses between Europe and north central Asia.

Fig. 3 Climatic suitability.

(A) Cumulative climatic suitability for the past 44 ka based on simulation on the European (left), Eurasian (middle), and Asian (right) data sets. To correct for sampling bias in the Eurasian data set, for each time slice, all estimates and projections for Eurasia are performed considering 100 random resampling of European occurrences in the same number as Asian occurrences. The darker the colors, the more stable the climatic suitability for horses (climatic niche = p-Hor) through time. (B) Projection of climatic suitability across Eurasia in different climatic periods based on occurrences in Europe (left), Eurasia (middle), and Asia (right). Because of the scarcity of data available for Asia, no models for the Holocene have been possible for both Asia and Eurasia, with the exception of 5 and 3 ka B.P. (both included in the “Recent” period).

Climatic and habitat association patterns for horses in Europe support increasing habitat fragmentation

At the beginning of the Holocene, the number of occurrences and the frequency of horse remains in Europe dropped sharply (Figs. 1A and 4, A and B) (3, 4). This is generally interpreted as the consequence of the demise of steppe-tundra landscape and return of forests, owing to climate warming, which would offer less suitable habitats for horses (3). Adaptation to closed habitats has also been suggested (18), on the strict basis of horse occurrences located mostly in northern Europe and not using frequency data, and has been recently challenged (19).

Fig. 4 Analyses of the European data set and biome frequency.

(A) Distribution through time of the frequency of horse remains in Europe calculated as NISP of horses versus other ungulates. (B) Density of horse remains through time in Europe, calculated as NISP of horses versus other ungulates. The numbers at the bottom of each bar represent the number of observations falling in each class, from 0 to >5%. (C) Climatic suitability for horses in Europe between 10 and 3 ka B.P. (D) Climatic suitability per time period. Percentage of land cells in Europe with a value of suitability for horses (p-Hor) > 0.5 and p-Hor > 0.8. (E) Holocene climatic amelioration. Difference in p-Hor in Europe comparing five successive time bins during the Holocene: 9, 8, 7, 6, and 5 ka B.P. Each map shows the difference in the more recent distribution compared to the previous one. (F) Environmental reconstructions in the macro area surrounding horse finds in Europe (left) and Asia (right) per climatic period. The lighter the color, the less forested is the region. The numbers at the bottom of the bars show the number of occurrences in closed environments over all the observations. The dotted line represents a frequency of 0.5.

The decrease of horse remains in Europe is not characterized by a geographic reduction in the overall extent of the area occupied by the species but in a drop of frequencies in a geographic extent that does not vary much between the Late Glacial and the Early Atlantic (Figs. 1B and 4B). This pattern is more likely to result from habitat fragmentation than from a geographic shift in the climatic range suitable for the species, as observed for many animals during the LGM (23).

We next mapped the spatial distribution of suitable climatic range for every millennium between 10 and 3 ka B.P. (Fig. 4C) and calculated the percentage of land space climatically most suitable to horses using the grid cells with p-Hor values greater than 0.8 (Fig. 4D). In addition, we calculate the percentage of land space climatically likely suitable to horses, considering grid cells showing p-Hor values greater than 0.5.

In the whole period ranging from the Preboreal (11.7 to 10.6 ka B.P.) to the Late Atlantic (7.5 to 5.5 ka B.P.), the total amount of land space most and likely suitable to horses is wider than in the Late Glacial, and only between 8 to 7 ka ago the European range appears patchy and fragmented (Fig. 4C). When comparing each of four successive time bins during the Holocene (8, 7, 6, and 5 ka B.P., respectively) (Fig. 4E), the difference in successive p-Hor values in Europe shows that the suitability for the species in Iberia, northeastern France, Italy, the Balkans, and eastern Europe steadily increased, while in Central Europe strong differences can be observed between neighboring regions.

We further analyzed the vegetational coverage through time (Fig. 4F), inferred on a coarse scale (1° × 1°, equivalent to ~100 km) from the paleoclimatic reconstructions. The analysis of these estimates for the locations where horses have been found shows that, in Asia, 96% of the occurrences across time intervals (309 of 321) are in macro areas predominantly corresponding to open landscapes, although we caution that the limited amount of data collected for most of the Holocene precludes any robust inference for this period. In Europe, as suggested previously (18), a minimum of 16.2% (21 of 130) and up to 58.8% (124 of 211) of the occurrences appear in macroregions with predominantly forested landscapes. After a minimum during the LGM, from the Late Glacial, these environments account for at least 50% (11 of 22 during the Boreal period) of horse occurrences.

Taken at face value, this pattern would suggest that horses were not restricted to open environments but could equally well inhabit closed, forested environments, as previously suggested (18). However, as others recently emphasized (19), the faunal associations in Holocene sites from Europe suggest a different pattern. The PCAs based on faunal assemblages (figs. S1 and S2) separate on the second principal component sites characterized by ungulates associated to forested areas (red deer, wild boar, and roe deer) and all other animals, associated to semi-open and open environments, including horses for most records.

Together, the contrast between the reconstructed microscale and macroscale vegetable coverage in Europe, the increase of horses in mainly forested macroregions, and the spatial pattern of extinction suggest that, from the beginning of the Holocene, the suitable environment became more and more patchy, with open areas increasingly fragmented by forests, where wild populations of horses could have survived in isolation until one or several waves of arrivals of domestic horses, leading to either local admixture or a full replacement of the preexisting local populations.

Iberia and the central Asian steppes harbored suitable environments for early horse domestication

We next investigated predictions of suitable climatic conditions around the time when horses were first domesticated. Between the proposed domestication centers (central Asia, eastern Anatolia, and Iberian Peninsula), the latter region remains very suitable for horses throughout the entire Holocene, while the Pontic-Caspian steppes show high values of p-Hor only up to 7 ka B.P. based on the European niche. The whole area surrounding Botai appears extremely suitable for Asian horses 5 ka B.P., in agreement with the archeological record supporting their domestication during the Eneolithic, but not for the European animals.


This study assembles and integrates independent and complementary lines of evidence across Europe and central-north Asia, as represented by the geographic and temporal distribution of dated fossil remains and their corresponding paleoecological conditions. Climate change and associated vegetation redistribution largely controlled the past distribution of horses.

Genomics evidence (15, 2427) has identified a minimum of three genetically distinct populations in Eurasia: an archaic group living in Taymyr/Yakutia between 43 and 5 ka B.P., the ancestors of Przewalski horses found in Kazakhstan 5 ka B.P., and the ancestor of modern domestic horses for which precise information on the geographic and chronological range is still missing.

On the basis of different data and methods to those in the mentioned genomic studies, our complementary approach also provides evidence of horses living under different climatic conditions throughout the past 44 ka, suggesting the possibility of locally adapted populations. Our projections of the potential range most suitable for the species based on Europe alone show poor overlap with areas where horse fossils have been found in Asia, and reciprocally. Moreover, considering pan-Eurasian data, we found less suitability for horses in southern Europe compared to forecasts based on European data alone. This suggests that assuming a single horse population adapted to a unique climatic niche for the whole continent, including the more temperate regions, leads to underestimating the horse adaptive range. This is also supported by the consistent difference in NPP between the areas inhabited by the horses in the two regions.

In line with the potential of horses to adapt to various environmental conditions, SDM analyses predict suitable climatic conditions for horses throughout most of the geographic area investigated, from the colder Holarctic zone to more temperate areas in southern Europe, up to tropical regions in Africa and southern Asia. SDM analyses based on Asian data alone fail to predict this large, cosmopolitan distribution, in particular in warmer areas.

These differences can potentially be explained by locally adapted individuals in different environments, for example, the horse lineage(s) present in Europe, and not Asia, could have tolerated a wider, and warmer, range of environmental conditions and provided the source population for the spread of horses in more temperate and/or tropical regions all across the Old World. Alternatively, Asian horses may have shown the same biological adaptations as European horses, but climatic and/or geographic boundaries precluded the further expansion of the species to temperate and tropical regions.

It is also possible that the differences in the occupied environments could at least partly result from dynamic interactions with competitors or predators, which prevented the Asian horses from occupying specific habitats. Evaluating paleoecological species interactions and implementing them in ecological modeling will further enhance our knowledge of the mechanisms shaping species niches and distributions.

Sampling bias against southern Asia, a region where fossil information is sparser, could potentially yield similar patterns, as it limited our capacity to capture more temperate conditions in the climatic envelope of the species. A better understanding of the ecology of the horse, and of the effect of sampling bias in the fossil record, will require extensive paleontological and archeological surveys within Asia, especially in regions encompassing temperate to tropical climate conditions within the past 44 ka.

Progress will also require genomic analyses of both European and Asian material to better define the population structure of Upper Paleolithic horses, and identify the genetic differences underlying potential adaptations in relation to various climate conditions. The recent sequencing of genomes of 38 ancient horses from Taymyr (27), Yakutia (25), and central Asian steppes (15, 24) represents the first steps toward this objective.

In Europe, we observe a decrease in the number of sites with horse finds during the LGM. At the same time, this period is associated with the lowest percentage of sites without horses (22%, 51 of 225). Because most data come from archeological sites, we argue that the observed decrease in the number of sites with horse finds is related to the decrease of human populations, whose distribution at the time was mainly limited to so-called refugia (28). Our analyses also help improve our understanding of the horse population dynamics after the climatic improvement of the Holocene. It has been shown that, in many cases, climatic change appears to modify the distribution of ungulate species, reducing their geographic range to specific climatic refugia (29). This is not the case for horses: After the climatic amelioration that started at the beginning of the Holocene, the drop in horse frequency observed in Europe takes place within a relatively steady geographical range.

A possible explanation for these findings is increasing adaptation of the species to closed habitats, as suggested by SDM analyses based on coarse vegetational reconstructions and temporal variation of genetic variants associated with coat color (18). Our macroenvironmental reconstruction appears consistent with this latter hypothesis because horse occurrences are increasingly found in macro areas considered semi-open or forested.

On the other hand, the decrease in horse frequency observed throughout the whole Holocene does not support this hypothesis, as also discussed in (19). Our data show that, up to 5.5 ka ago, horse finds do not show association with species characteristic of forested areas such as wild boar and roe deer. We infer that the open and semi-open habitats occupied by horses on a narrow geographic scale appear less and less frequent at a macroenvironmental scale, supporting the possibility of increasing fragmentation of open habitats. This event is also likely to have led to an intensification of genetic isolation for the remaining horse populations, a pattern that still needs to be tested on genomic data.

The suitability of both Iberia and eastern Europe appears constant throughout the entire post-LGM period, in line with these regions being hotspots of genetic diversity and, possibly, the refugia sources for the recolonization of the continent (11). While the Pontic-Caspian region appears not suitable for European horses around the time when horses where first domesticated some 5.5 ka ago (6), part of this region appears suitable for the Asian horses (with the Caspian Sea as the westernmost boundary). This may suggest that horse domestication started from a population background related to an Asian ancestry and that the further spread of the domesticated horses in Europe involved either adaptation to novel niches (possibly through selective breeding) or the application of domestication techniques to local horse populations pre-adapted to these environmental conditions. Testing this scenario will require mapping the genetic structure of the Eurasian horse population within the fifth to third millennium BCE. Further integration of a variety of lines of evidence, from genomic, paleontological, paleoenvironmental, and macroecological data and approaches, will provide deeper insights into the biogeography of animals, including how species react to climate change and domestication.



We created a database of horse presence through Eurasia collecting 3070 published radiocarbon dates, representing instances where E. ferus/Equus caballus remains (n = 816) or their archeological layers (n = 2254) have been directly dated. The whole database, with full reference to the original studies, is available in the Supplementary Materials as an Excel spreadsheet (table S1).

All dates were calibrated to calendar years B.P. with the online version of the software OxCal v.4.3 (30) using the IntCal13 curve (31). We used the average of the 94.5 distribution as a calibrated age determination of each fossil material. Each date was associated with the geographical coordinates of the archeological site, where the nearest location of the corresponding material was found, as provided in the original article.

To standardize the data set, we excluded duplicate 14C estimates, records without reported errors for radiocarbon dating, and dates for which the OxCal calibration resulted in an error. To reduce the sampling bias associated to the uneven density of investigations through time across Eurasia, we aggregated all observations within the same time bin and geographic location (defined as the grid cell of the paleoclimatic reconstructions, that is, 1° × 1°) to a single occurrence. This resulted in a data set counting 1019 observations for Europe and 321 for Asia.

The analyses were performed on three different geographic partitions of the data set: the whole Eurasia (that is, the complete database), Europe (defined as west of 60°E and north of 37°N), or Asia (east of 60°E) (see Fig. 1A). Moreover, we collected information on the NISP for horses and their density compared to other ungulates for 1120 archeological layers in Europe (also available in table S2).

Climate and vegetation data for the past

To gain a better knowledge of the horse paleoecology, we needed high-resolution climate data from the past. Currently, no such comprehensive data set is available and existing paleoclimate reconstructions are either rather sparse in time, sparse in space, or both, and climate models are the best candidates to fill those gaps where information is missing. One such model that provides several “snapshot” simulations of the past 120 ka is HadCM3 (20). HadCM3 is a comprehensive global climate model that consists of an atmospheric circulation model, an ocean circulation model, and a sea ice model. HadCM3 has a rather coarse resolution, giving that the atmosphere model, for example, has a horizontal resolution of 2.5° in latitude and 3.75° in longitude. There is, of course, a trade-off between model resolution and available computational resources to run climate model simulations effectively throughout the geological past.

For our period under consideration, 44 to 3 ka B.P., the time increment of these HadCM3 “snapshots” simulations is every 1 ka from 3 to 22 ka B.P. and every 2 ka from 22 to 44 ka B.P. The climate drivers for our model are minimum temperature (tmin), maximum temperature (tmax), annual mean net primary productivity (NPP), and total annual precipitation (totprec). NPP is not directly available from the snapshot simulations and has been calculated using the vegetation model BIOME4 (32).

BIOME4 is a coupled biogeography and biogeochemistry model that simulates the equilibrium vegetation types and NPP. BIOME4 simulates the competition between different plant functional types (PFTs), such as grass, temperate or evergreen forests, or shrubs. Using empirical rules, BIOME4 then assigned biomes based on the most successful and second most successful PFT and their leaf area index. As input, BIOME4 needs latitude to calculate incoming short-wave radiation, atmospheric CO2 concentration, monthly mean precipitation, near-surface air temperature, and cloud cover. These data can be used from the HadCM3 model output for the different periods from 44 to 2 ka B.P. Furthermore, BIOME4 requires a description of several soil physical properties such as water holding capacity and percolation rate. Because the latter two are not available for the past, present-day data were used.

To create a high-resolution data set that is in line with present-day observations, the coarse-resolution climate model output was downscaled and bias-corrected using the so-called delta method, which works as follows: the mean climate change signal, that is, past snapshot minus present-day snapshot, is added to the observations (hence its name delta method). The “delta” is given by the difference of the past and the present climate in the coarse resolution. It is then interpolated to the high resolution and then added to the high-resolution present data. The present-day observational data set that has been used for the bias correction is the Ten Minute Climatology (33). It is available on a 10–arc min grid as the name suggests. The climatology includes monthly mean values of precipitation, temperature, relative humidity, sunshine duration, and some more. The spatial downscaling and bias correction function, that is, a bilinear interpolation from low resolution to high resolution, can be summarized, for example, for temperature T, as f followingTpasthires=fhires(ΔTpastpresentlores)+Tpresenthires

The spatial downscaling and bias correction was applied to the HadCM3 climate variables and yielded minimum temperature, maximum temperature, and annual precipitation on a 10–arc min horizontal resolution, that is, 2160 × 1080 data points. Similarly, NPP was calculated by BIOME4, which was driven by the previously mentioned climatic variables at their higher resolution.

Whether a coordinate is a land point or covered by an ice sheet during a specific period in the past is also derived from the HadCM3 model. Here, only the spatial downscaling to the high resolution was applied using bilinear interpolation (threshold = 0.25).

The extension of the Black Sea through time was assessed by subtracting the estimate of global sea-level changes from the present-day topography. This method, however, could not be applied to the Caspian Sea, which, as an inland lake, does not respond to rising and sinking sea levels. We thus opted for a conservative approach using the present-day extent for all past reconstructions, including during colder ice age–related time periods when the amount of liquid water available in the region was more limited.

Paleoecology of the horse

Each radiocarbon date available was associated to its nearest paleoclimatic reconstruction, and the corresponding environmental variables and biome/vegetation coverage estimation for the location. For most analyses, the data were divided into eight climatic periods: pre-LGM (older than 27 ka B.P.), LGM (27 to 18 ka B.P.), Late Glacial (18 to 11.7 ka B.P.), Preboreal (11.7 to 10.6 ka B.P.), Boreal (10.6 to 9.1 ka B.P.), Early Atlantic (9.1 to 7.5 ka B.P.), Late Atlantic (7.5 to 5.5 ka B.P.), and Recent (younger than 5.5 ka B.P.).

We visualized the overlap between the observed climatic niches in Asia and Europe using a PCA performed in R (function princomp). The climatic conditions in areas occupied by horses were plotted against a reference set of 100 points randomly selected in each time slice in Asia and Europe to monitor habitat choice through space and time. We compared through time NPP estimates for localities where horses have been found in Europe and Asia.

To complement the exploratory analysis of the ecological niche, we quantified the niche overlap of the two subpopulations (European and Asian) in environmental space using Schoener’s D (34), as modified by Broennimann and colleagues (22). This approach used a kernel density function to reduce the effects of the uneven number of occurrences between ranges and the uneven distribution of environmental conditions between ranges or time periods. This framework was applied to invasive species (35) or to assess niche stability or similarity between time periods (36, 37). The overlap (D) ranges between 0 (indicating no overlap) and 1 (indicating that the two niches are identical). Niche overlap (D) was calculated using the package “ecospat” in R (38).

To estimate D, we first constructed a total environmental space for each continent by combining the environmental variables (tmax, tmin, NPP, and totprec) across all time intervals (between 44 and 3 ka B.P.). Subsequently, we summarized it in two dimensions using a PCA. In this PCA space, we projected the Equus occurrences from all time intervals for each area and then calculated the overlap. As no occurrences were available for Asian horses in the 9 ka B.P. time interval, we excluded the conditions from the total environmental space.

In addition, to further assess the extent to which the bias present in the fossil record could affect our niche estimates, in particular the uneven sampling between Europe and Asia, we randomly subsampled the horse occurrences of Europe in each time interval to match the number of available occurrences in the Asian database. This resulted in having an equal amount of records data between the two regions for each and across all time intervals. We then calculated the D overlap between the niche of European and Asiatic horses, and the whole process was repeated 100 times. The biomes estimated for the macro areas occupied by horses were subdivided in open (grassland, tundra), semi-open (taiga, shrubland), and closed (forest, woodland) environments.

Species distribution modeling

We used SDM to predict the geographical distribution of the horse climatic niche based on horse occurrences through time and the four environmental variables considered. Three different data sets were used: the whole Eurasia, Europe alone, and Asia alone. We ran the analyses as implemented in the R package biomod2 ( for each time slice, where the number of occurrences was at least 10 (Fig. 1), following previous work showing that this minimal threshold resulted in high accuracy (90%) (39).

For analyses carried out at the scale of Europe and Asia, we simply followed the procedure described below. The analyses were carried out at the scale of Eurasia required to account for the potential bias resulting from the significant difference in the number of occurrences in Europe and Asia. To achieve this, we repeated the same procedure 100 times, considering 100 random resampling of European occurrences to match the exact number of records available for the Asian part of our study area. We limited the analyses to the time slices where at least five data points were available for Asia, to work with a minimum of 10 occurrences for the whole Eurasia, as discussed above.

For each time interval, occurrence data were integrated with an equal number of pseudo-absences sampled from the whole area randomly (40). Locations for pseudo-absences were randomly and independently selected five times, replicates, to avoid biases associated to specific geographic configurations.

Modeling was then repeated five times for each of the five replicates and for all 10 algorithms available in biomod2. The algorithms belong to four main classes: regression, machine-learning, classification, and entropy. We used 80% of all localities (occurrences and pseudo-absences) to calibrate the models and 20% to evaluate their predictive accuracy based on the true skill statistic (TSS; threshold = 0.8) (41). All simulations implemented with the generalized boosting method performed very poorly (TSS < 0.5) and were consequently disregarded.

To achieve more robust predictions (42), for Europe and Asia, we averaged all the results obtained for each data set and time bin using the following procedure within the ensemble forecasting function from biomod2. First, we created 10 different ensembles: 9 merging all the runs from each algorithm (= “algorithm ensembles”) with TSS > 0.8, and 1 merging all available runs from all algorithms (= “full ensemble”) with TSS > 0.8. We compared the results considering three statistics associated with each ensemble (mean, median, and weighted mean) (figs. S6 to S8). In both regions, no single ensemble was found to outperform the others, but the full ensemble was the only one being consistently among the best. Among the statistics, the weighted mean tended to show higher TSS values and was then used to draw the maps. This evaluation would have been too computationally intense for Eurasia, given the number of repetitions involved, so, in this case, we directly considered the full ensemble.

We projected the climatic niche of the horse obtained from the three regions considered in Eurasia and northern Africa using only runs with TSS > 0.8. This provided a map of the whole region for each time bin, depicting for each grid cell the level of climatic suitability for horses in a scale from 0 to 1 (hereafter referred to as p-Hor) (figs. S3 to S5).

The procedure described was followed independently for each time bin. For Eurasia, we merged the results of the 100 resampling by adding the individual results and normalizing the sum to get a value comprised between 0 and 1. To combine different time bins (for example, for each climatic period; Fig. 4, A and B), we calculated the sum of the relevant time bins and normalized it to get a value comprised between 0 and 1.


Supplementary material for this article is available at

Fig. S1. PCA based on faunal assemblages from Holocene sites, PC1 (38.5%) versus PC2 (19.2%).

Fig. S2. PCA based on faunal assemblages from Holocene sites, PC2 (19.2%) versus PC4 (11.8%).

Fig. S3. Projections of the horse climatic niche for each time bin based on occurrences in Europe.

Fig. S4. Projections of the horse climatic niche for each time bin based on occurrences in Asia.

Fig. S5. Projections of the horse climatic niche for each time bin based on occurrences in Eurasia.

Fig. S6. Evaluation of the ensemble models for Europe (44 to 14 ka B.P.).

Fig. S7. Evaluation of the ensemble models for Europe (14 to 2 ka B.P.).

Fig. S8. Evaluation of the ensemble models for Asia.

Table S1. Database of radiocarbon dates associated with horse remains and with archaeological layers where horses have been excavated in Europe and Asia.

Table S2. Database of horse frequency with respect to other ungulates in archaeological layers from European sites.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We would like to thank T. Stuart and A. Lister for giving us access to their unpublished database of radiocarbon dates and A. Outram, C. Luis, and E. Lorenzen for their help in gathering radiocarbon dates and completing the original list of paleontological sites, where horses were identified. We are grateful to R. Garcia for technical support with biomod2. We thank A. Pacifici and V. Aschonitis for their useful comments. We are grateful to the editor and two anonymous reviewers for their comments that improved the quality of this manuscript. Funding: M.L. was supported by a Marie-Curie Individual Fellowship (MSCA-IF-657852). This work was supported by the Danish National Research Foundation (grant no. DNRF94); Initiative d’Excellence Chaires d’attractivité, Université de Toulouse (OURASI); and the Villum Fonden miGENEPI research project. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 681605). K.G. and D.N.-B. thank Sapere Aude EliteForsk, DFF (Det Frie Forskningsråd), and the Danish National Research Foundation (DNRF96). A.M., M.K., and R.M.B. were supported by the European Research Council Consolidator grant 647787-LocalAdaptation. Author contributions: M.L., F.B., K.G., R.S., P.B., R.B., and L.O. collected radiocarbon dates. F.B. and P.B. collected and analyzed frequency data and faunal association data. A.M., R.M.B., and M.K. provided paleoclimatic reconstructions. K.G. calculated the niche overlap. M.L. performed all other analyses under the supervision of L.O., D.N.-B., and A.M. and with input from K.G., F.B., and P.B. M.L., K.G., D.N.-B., A.M., and L.O. wrote the manuscript. All authors revised the manuscript. M.L. prepared all figures in the main text. M.L. and F.B. prepared the figures in the Supplementary Materials. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The whole database is available as Excel spreadsheet in the Supplementary Materials. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

Stay Connected to Science Advances

Navigate This Article