Research ArticlePALEONTOLOGY

Deep drilling reveals massive shifts in evolutionary dynamics after formation of ancient ecosystem

See allHide authors and affiliations

Science Advances  30 Sep 2020:
Vol. 6, no. 40, eabb2943
DOI: 10.1126/sciadv.abb2943


The scarcity of high-resolution empirical data directly tracking diversity over time limits our understanding of speciation and extinction dynamics and the drivers of rate changes. Here, we analyze a continuous species-level fossil record of endemic diatoms from ancient Lake Ohrid, along with environmental and climate indicator time series since lake formation 1.36 million years (Ma) ago. We show that speciation and extinction rates nearly simultaneously decreased in the environmentally dynamic phase after ecosystem formation and stabilized after deep-water conditions established in Lake Ohrid. As the lake deepens, we also see a switch in the macroevolutionary trade-off, resulting in a transition from a volatile assemblage of short-lived endemic species to a stable community of long-lived species. Our results emphasize the importance of the interplay between environmental/climate change, ecosystem stability, and environmental limits to diversity for diversification processes. The study also provides a new understanding of evolutionary dynamics in long-lived ecosystems.


One of the fundamental questions in biology is what determines species diversity (1). Macroevolutionary theory defines diversification as the interplay between speciation and extinction (2, 3). For isolated ecosystems, conceptual models assume that this interplay and the accumulation of in situ species diversity over time typically lead to a flattening of the species accumulation curve and a (dynamic) equilibrium diversity [e.g., the general dynamic model of island biogeography (4), the unified neutral theory of biodiversity and biogeography (5), the concept of ecological opportunity (6), and the concept of equilibrium speciation dynamics (7)].

These conceptual diversification models typically presume increasing competition, decreasing range and population sizes, and/or decreasing niche availability with increasing species richness (4, 8). However, because of the scarcity of empirical data directly tracking biodiversity and environmental change over time, there is a limited understanding of actual speciation and extinction trajectories and the biotic and abiotic drivers of diversification. There are three empirical challenges. First, only few study systems exist that have persisted over evolutionarily relevant time scales and that harbor a sufficient number of endemic species to enable analysis of in situ diversification processes. Second, uneven preservation within the fossil record (9) and inherent difficulties in quantifying extinction from molecular phylogenies (10, 11) hamper precise estimates of speciation and extinction rates. Third, reliably dated local environmental time series covering the entire history of an isolated ecosystem are scarce, making it difficult to infer the drivers of evolutionary dynamics.

Lake Ohrid, Europe’s oldest and most species-rich freshwater lake (12), meets these challenges. The lake (Fig. 1, A and B) has existed continuously for 1.36 million years (Ma) (Fig. 2, A and B) (13). With 201 extant and extinct endemic species, diatoms (Bacillariophyta) are the most biodiverse group in the lake. They belong to the two subphyla Coscinodiscophytina (17 species) and Bacillariophytina (184 species), which mainly represent planktonic and benthic lifestyles in Lake Ohrid, respectively. Both groups are paraphyletic in the lake, and there is preliminary evidence of both anagenetic and cladogenetic speciation (14). The vast majority of the endemic benthic species have small population sizes and are restricted to the southern and eastern parts of the lake (15). As diatoms are autotrophic organisms, they only inhabit the photic zone, which reaches to a depth of about 80 m in Lake Ohrid (12). The robust silica shells (“frustules”) of these primary producers may form microfossil successions, making diatoms a key taxon for paleoenvironmental reconstructions from lake sediments and for unraveling evolutionary dynamics through time (16).

Fig. 1 Location and bathymetry of Lake Ohrid.

(A) Location of Lake Ohrid on the Balkan Peninsula. (B) Digital elevation map of the lake and surrounding area, with the location of the drilling site (DEEP). (C) Photograph of the deep lake drilling system (DLDS). Photo credit: Thomas Wilke, Justus Liebig University Giessen.

Fig. 2 Climate and environmental indicators and endemic diversity trajectories in Lake Ohrid spanning the past 1.36 Ma.

Black dashed line, time of lake formation; blue bar, lake phases (shallow and deep) (17); brown bars, glacial periods. For a full list of indicators used, see fig. S3. (A) DEEP site grain sizes (summarized by ordination) indicating widening of the lake 1.36 to 1.15 Ma ago. Brown curve, time-averaged data; gray curve, raw data. (B) Lake Ohrid δ18Olakewater values indicating deepening of the lake 1.36 to 1.15 Ma ago. Blue curve, time-averaged data; gray curve, raw data. (C) Lake Ohrid species accumulation curves for endemic diatom species (see also fig. S2). Green species accumulation curve, data corrected for preservation bias; green shading, 95% credible interval; gray curve, raw data (sampled standing diversity). SEM image shows the common endemic species Aneumastus macedonicus. Scale bar, 10 μm. (D) Standardized effect size (SES) of mean pairwise taxonomic distance of all endemic diatom species; values below orange dashed line indicate a significant taxonomic similarity of the community. (E) Estimated mean longevity of endemic diatoms and 95% confidence interval (purple shading), and literature values for the average longevities of the two worldwide freshwater diatom groups with distinct longevities (22) (gray area within the purple dashed lines). (F) Per-lineage speciation rate and 95% confidence interval (blue shading). Blue dashed lines, shifts in speciation rate 1.24 [95% highest probability density (HPD), 1.26 to 1.16] and 1.05 (1.16 to 0.46) Ma ago. SP, speciation rate phases. SEM image shows the extant Scoliodiscus glaber (scale bar, 10 μm), which originated during the early lake phase. (G) Per-lineage extinction rate and 95% confidence interval (red shading). Red dashed line, shift in extinction rate 1.19 (1.26 to 1.16) Ma ago. EP, extinction rate phases. SEM image shows the extinct Cyclotella cavitata. Scale bar, 10 μm.

A continuous sediment succession was obtained from Lake Ohrid in 2013 as part of the International Continental Scientific Drilling Program (ICDP; Fig. 1C). The upper 447 m of the composite record comprise lacustrine sediments that were continuously deposited since lake formation 1.36 Ma ago (13). They cover both the environmentally dynamic shallow-water (1.36 to 1.15 Ma ago) and environmentally more stable deep-water phases (1.15 to 0 Ma ago) (13, 17). Core processing resulted in uninterrupted environmental and climate indicator time series and a unique diatom species record at a sample resolution of around 2000 to 4000 years (fig. S1 and data S1). With a total of 152 of 201 endemic diatom species recovered, the sequence comprises 75.6% of all endemic diatom species ever found in the lake.

Here, we assess the speciation and extinction dynamics of Lake Ohrid’s endemic diatom assemblage since the formation of this ancient ecosystem 1.36 Ma ago, and offer an explanation for the temporal changes in diversification rates and the underlying drivers of evolutionary dynamics.


Speciation and extinction dynamics

Our diatom fossil time series shows that species richness increased asymptotically after lake formation, reaching the highest standing diversity of 123 endemic species after 0.73 Ma (Fig. 2C and fig. S2). However, albeit very close, the diatom community appears not to have yet reached an environmentally defined ecological limit to species diversity, a dynamic equilibrium (fig. S3O). We found a significant phylogenetic signal (18) (D = 0.37; fig. S4) for selective extinction of endemic species within specific taxonomic groups, mainly in the planktonic subphylum Coscinodiscophytina. The mean pairwise taxonomic distance of the endemic diatom assemblage decreased over time below what is expected by chance (Fig. 2D).

High per-lineage speciation and extinction rates after lake formation declined rapidly within a few hundred thousand years (Fig. 2, F and G). This change was associated with an increase in the taxon longevity—the time from origination to extinction of a taxon—calculated as the reciprocal of the extinction rate (19). The longevity reached its highest values approximately 1.00 to 0.65 Ma ago (Fig. 2E). Species birth-death modeling indicates that two significant decreases in speciation rate (1.24 and 1.05 Ma ago) and one decrease in extinction rate (1.19 Ma ago) best describe the diversification trajectories (Fig. 2, F and G). Therefore, we recognize three phases in the speciation rate (SP1 to SP3) and two in the extinction rate (EP1 and EP2; Fig. 2, F and G). The age and number of shifts identified were robust against incomplete taxon sampling (fig. S5) and potential violations of the birth-death model because of multiple colonization of the lake.

Drivers of rate change

Our models, which were used to link speciation and extinction dynamics to environmental, climate, and species richness covariates (Fig. 3 and fig. S6), revealed that the overall temporal changes in the speciation rate are strongly affected by diversity dependence, climate, and local environmental parameters (fig. S6A). The strongest covariate was an environmentally defined ecological limit to species diversity, which was dynamically shaped by parameters related to lake size (differences in Bayes factor >5 to the second best model; fig. S3O). Analyzing the drivers of speciation for the three speciation periods separately, we found a shift over time from environmental to diversity dependence (Fig. 3A and fig. S6, B to D). In contrast, only parameters related to lake size exerted a strong influence on the extinction rate (Fig. 3B and fig. S6E).

Fig. 3 Summary statistics of the correlation strength of speciation and extinction rates with climate, environmental, and diversity indicators.

Plots summarize the results of the diversification analyses of covariate influence over the entire period of 1.36 Ma and within the three speciation (A) and two extinction rate periods (B). Individual indicators were color coded according to the indicator group. The letter sizes of the indicators correspond to the difference in the Bayes factor K to the best-fit model. For the environmental and climate indicators, both the influence of the change in the parameter and that of the total value were tested; however, only the best is displayed in each case. This also applies to the indicators for the environmentally defined ecological limit to species diversity, where three parameters related to lake size have been tested, but only the best one is shown (for detailed results, see fig. S6). The results show a shift over time from environmental to diversity dependence of the speciation rate (A) and a continued influence of lake size–related parameters on the extinction rate (B).


Evolutionary rates simultaneously decrease with lake deepening

Our data show that the pattern of endemic species accumulation in Lake Ohrid is consistent with conceptual diversification models applicable to insular ecosystems, displaying a deceleration of diversity build-up over time and with increasing diversity (4, 5, 8). However, we found that the per-lineage extinction rate decreased during the early phase of Lake Ohrid (Fig. 2G), as opposed to an increasing extinction rate suggested by these models.

We propose that the ample ecological opportunity (3, 6) encountered by the diatom species that colonized the emerging lake 1.36 Ma ago facilitated diversification and led to an initially high speciation rate (Figs. 2F and 3). Simultaneously, the relatively small habitable space together with high environmental dynamics (17) (Fig. 2, A and B) likely caused the frequent extinction of endemic species (Fig. 2G). While the lake expanded and deepened (Fig. 2, A and B), the high speciation rate declined, probably because of an increasingly negative effect of diversity dependence when species richness approached the lake size–defined ecological limit to diversity (fig. S6B).

The observed decline in the speciation rate was accompanied by a time-lagged decrease in the extinction rate (Fig. 2, F and G). The latter was likely due to increasing environmental and microclimate buffering sensu (20, 21) in the deepening lake. Moreover, the rapid expansion of habitable space (Fig. 2, A and B) likely supported larger population sizes, as suggested by the negative effect of lake size indicators on the extinction rate (fig. S6E). The expected longevity of endemic diatom species during the shallow-water phase of Lake Ohrid was comparatively low (22) (Fig. 2E) but increased rapidly with lake expansion. Although the longevity varied after the deepening of the lake, these changes were not significant, as indicated by the lack of extinction rate shifts during the past 1.2 Ma (Fig. 2G), and values largely remained within the range reported for freshwater diatoms worldwide (22) (Fig. 2E). This is probably due to Lake Ohrid’s persistent environmental and microclimate buffering during deep-water conditions, which mitigated changes in the extinction rate and thus in the species longevity (Fig. 3).

From volatility to stability

The trajectories of evolutionary rates in Lake Ohrid diatoms point to a macroevolutionary trade-off, that is, speciation and extinction rates are not independent and therefore covary loosely (23). Moreover, as the lake deepens, we see a switch in the macroevolutionary trade-off from “increaser” taxa—constituent species with high speciation rates and short longevities—to “survivor” taxa—species with low speciation rates and high longevities sensu (23). While the two types may represent equivalent strategies with theoretically similar extinction risks (23), increaser taxa are likely more susceptible to stochastic effects and external perturbations, making these more volatile species more likely to disappear (23). This assumption of a switch in the macroevolutionary trade-off in Lake Ohrid diatoms is supported by the high initial speciation and extinction rates (Fig. 2, F and G), the nonindependence of these rates over time [see the parallel rate trajectories in Fig. 2 (F and G)], the selective extinction within some clades (fig. S4), and the decreasing environmental dependence of the speciation rate (Fig. 3). Moreover, the mean pairwise taxonomic distance of the endemic diatom assemblage decreased over time (Fig. 2D).

We also observed a decreasing invasibility of Lake Ohrid by nonendemic species, which is reflected in the decreasing immigration rate (fig. S7). Together with a decreasing (local) extinction rate, this resulted in a longer persistence of nonendemic diatoms in the lake. These findings indicate an increasing coexistence of constituent taxa, leading to a rich and stable community sensu (24).

Combining the results from our diversification rate and environmental correlate analyses (Fig. 3), we hypothesize that a volatile assemblage of short-lived endemic diatom species, which encountered ecological opportunity in the developing ecosystem, evolved into a stable community of long-lived endemic species after the lake had achieved long-term stability. These findings also emphasize the important role of (i) ecosystem buffering for mitigating the effects of environmental and climate dynamics on diversification processes and (ii) environmentally defined ecological limits to diversity in isolated ecosystems.

Not all taxa responded equally

Although our analyses showed an overall decreasing extinction rate in the diatom community of Lake Ohrid over time, the observed selective extinction within the small planktonic subphylum Coscinodiscophytina (fig. S4), for example, indicates the high evolutionary volatility of this clade. Most members prefer nutrient-rich conditions, which we assume for the shallow-water phase of the lake (17) (see also fig. S3H). As the lake deepened, and conditions became more oligotrophic (17), many species of the Coscinodiscophytina became extinct, probably because they were unable to withstand reduced nutrient levels. In contrast, the subphylum Bacillariophytina continued to proliferate, resulting in an outstanding diversity of 184 extant endemic species.

We do not know the reasons for the relative volatility of the Coscinodiscophytina in Lake Ohrid, and, with only 17 species, the species number of Coscinodiscophytina is too small to allow meaningful analyses of diversification rates and environmental correlates. Nevertheless, we put forward the hypothesis of trait-based selective extinction in this high-nutrient-dependent “increaser” clade.

The value of high-resolution empirical data

We provide here an unprecedented view of in situ diversification in an ancient ecosystem and show that short-term environmental changes during the early phase of ecosystem formation have strong effects on macroevolutionary dynamics and the long-term buildup of diversity.

Our results therefore emphasize the importance of high-resolution fossil and environmental time series for a comprehensive assessment of speciation and extinction dynamics (11). Because the temporal resolution of our study, particularly with respect to extinction rates, is not yet matched by molecular phylogenies, our findings provide a unique contribution to understanding of temporal evolutionary dynamics in long-lived ecosystems. They may also serve as a baseline for future explorations of adaptive radiations, which often arise from ecological opportunity (3, 6).


Lake properties

Lake Ohrid (Republic of Albania/Republic of North Macedonia; 40°54′ to 41°10′ N, 20°37′ to 20°48′ E; 693 m above sea level) is a tectonically formed oligotrophic lake with a tub-shaped morphology. It is approximately 30 km long, 15 km wide, and has a maximum water depth of 293 m (25) (Fig. 1). Water input comes from sublacustrine karst springs (64%) and from direct precipitation and river inflow (36%) (26).

The Ohrid Basin was formed by transtension during the Miocene and opened during the Pliocene and Pleistocene (27). Peat, slack water, shallow lacustrine, and fluvial sediments were deposited until 1.36 Ma ago (27), when continuous hemipelagic sedimentation started (i.e., the formation of modern Lake Ohrid) (13). A shift to higher lakewater oxygen isotope values (δ18Olakewater) and a decrease in grain size 1.36 to 1.15 Ma ago (Fig. 2, A and B) indicate deepening and widening during an early extensional phase of lake basin development (17, 25). The absence of truncations in seismic reflection data (25) together with the absence of shallow-water deposits in the drill core records (27) suggest that no desiccation has occurred since lake formation 1.36 Ma ago.

Owing to the narrow shape of the Ohrid Basin, the lake fills almost the entire basin today, and watershed size is comparably small (Fig. 1). The relative isolation of the lake in combination with its high age led to a considerable degree of in situ speciation in diatoms (14). A similar level of endemic diatom diversity is only found in Lake Baikal (28), which, however, occupies an area two orders of magnitude larger than Lake Ohrid.

Sediment cores

Sediment cores from the Lake Ohrid DEEP site (41°02′57′′ N, 20°42′54′′ E, 243-m water depth; Fig. 1B) were recovered in spring 2013, using ICDP’s deep lake drilling system (DLDS) operated by Drilling, Observation, and Sampling of the Earth’s Continental Crust (DOSECC). The composite sediment record is based on six parallel boreholes with a terminal depth of 568 m. Sediment recovery from the succession spanning the continuous lacustrine phase of the lake [1.36 to 0 Ma ago, corresponding to 0 to 446.65 m composite depth (m c.d.)] is 99.8% (13).


The chronology for the current analyses (1.36 to 0 Ma ago) was published previously (13) and uses tephrochronological data as first-order tie points and tuning of climate-sensitive proxy data against orbital parameters as second-order tie points. Tie points were cross-evaluated by two paleomagnetic age constraints (base of the Jaramillo sub-Chron and Matuyama/Brunhes boundary).

Indicator time series

Grain size analysis was performed on 64-cm-spaced samples following the procedure described previously (29). Data processing was carried out by using the GRADISTATv8 program (30). Grain size variations in undisturbed hemipelagic sediment successions can be used to directly infer changes in transport energy, with coarser grain sizes representing higher transport energy levels. Grain size data are available as data S2.

Oxygen isotopes (δ18O) were measured on calcite every 16 cm through zones of higher (>0.5%) total inorganic carbon (TIC) and on siderite within zones of overall low TIC. The dataset was supplemented with results from the Lini site in Lake Ohrid (core Co1262) (31). Carbonate species identification, mineralogical assessment, and analyses were carried out as described previously (32). Results are reported in delta (δ) notation in per mil (‰) versus the Vienna Pee Dee Belemnite scale using a within-run laboratory standard calibrated against NIST Standard Reference Materials (NBS 18 and NBS 19). Analytical reproducibility for the within-run standard was <0.1‰ (±1σ) for δ18O. To compare between the different carbonates, calcite and siderite δ18O were converted to δ18Olakewater as described previously (32). In isotopically evaporated lake systems, such as Lake Ohrid, δ18Olakewater primarily reflects a balance between precipitation (lower δ18O) and evaporation (higher δ18O) that is modulated at the orbital scale by basin development. Deeper lake water conditions drive higher δ18Olakewater due to extended water residence times and increasing surface area enhancing the influence of evaporative forcing. δ18Olakewater data are available as data S3.

All other indicator time series from Lake Ohrid and surrounding areas [percentages of arboreal pollen excluding Pinus pollen, percentage of pollen from deciduous oaks, TIC concentrations, total organic carbon concentrations, potassium (K) and calcium (Ca) counts from x-ray fluorescence scanning, and relative sedimentary quartz content] were taken from a previously published study of the Lake Ohrid drill core (fig. S3) (13).

Global indicator time series, that is, Medstack δ18O planktonic Foraminifera isotope ratios (33) and LR04 benthic Foraminifera δ18O stack isotope ratios (34), were obtained from the literature. Northern Hemisphere summer insolation at the latitude of Lake Ohrid (41° N) was calculated as described previously (35).

Trends in all univariate local and regional indicator time series (that is, the signal separated from stochastic noise) were recovered by following the modeling protocol for paleoecological time series (36). Because of the size of the dataset, the “bam” implementation of the recommended generalized additive models (GAM) was applied using the package “mgcv” 1.8-22 (37) for the R 3.6-1 statistical programming language (38). The autoregressive option accounted for the temporal autocorrelation in the indicators’ time series. Because of the multivariate nature of the grain size composition, a redundancy analysis (RDA) was performed with polynomials of time as predictors of isometric logarithmic ratio transformed values by using the R package “composition” 1.40.1 (39) and “vegan” 2.4.4 (40).

Changes of climate and environmental parameters per 1000 years (Δ) were quantified on the basis of the first derivative of the smoothing splines (GAM), the polynomials (RDA), and the linearly interpolated global indicators.

Diatom processing

Diatoms (kingdom Protista, phylum Bacillariophyta) were selected as a model taxon because they play a key role in the Lake Ohrid’s trophic web and are the most species-rich eukaryotic taxon in the lake (28). Their robust silica frustules are commonly well preserved in Lake Ohrid, resulting in an outstanding fossil record over evolutionary time. Moreover, the characteristic ornamentation of the frustules typically allows determination of Lake Ohrid taxa down to species level. Last, their average species longevity in the fossil record is, in general, comparable to that of Metazoa (41). In lacustrine systems, diatoms are therefore an unparalleled resource for evolutionary research (16).

Three hundred eighty diatom samples were analyzed from the DEEP sediment succession, taken every 128 cm (successive samples are around 2000 to 4000 years apart) between 0 and 406.96 m c.d., and every 64 cm (around 2000 years) between 406.96 and 447.28 m c.d. The higher sampling frequency in the early phase of Lake Ohrid was chosen because this period was environmentally very dynamic (13, 17). For comparison, we also analyzed 124 samples from the phase before 1.36 Ma ago (447.28 to 83.92 m c.d.). Sediment samples were acid cleaned as described previously (42). Residues were mounted with Naphrax (Brunel Microscopes Ltd.) on glass slides and analyzed by transmitted light microscopy. With a random subset of samples, a rarefaction analysis was performed with the R package “vegan” to determine the number of individuals per slide to be counted for a robust species richness estimation of endemic and nonendemic species (i.e., 300 individuals). To account for very rare endemic species in the diversification analyses, the remaining valves on the slides were screened for endemic taxa only.

Counting was carried out by two scientists (E.J. and A.C.) trained in the same laboratory (Z.L.) and following the same diatom identification protocol (developed by Z.L., E.J., and A.C.). A “taxonomy working group” solved taxonomic questions, and “blind sample analyses” were implemented as quality control procedure (43). As a conservative species delimitation approach, 28 morphotypes (i.e., entities showing a highly variable morphology and occurring for <0.5 Ma) were combined to 12 nominal taxa, owing to ecophenotypic plasticity in some planktonic diatom species (44, 45) [see fig. S1 (B and D to G)].

Visual inspection of diatom specimens showed that even specimens from the early lake phase were generally well preserved across taxonomic groups. Although there was some degree of dissolution, especially during the glacial periods, specimens were typically identifiable down to the species level (see the scanning electron micrographs of planktonic and benthic diatoms in fig. S1).

To distinguish between endemic and nonendemic species, the Ohrid diatom record was compared with (i) the extant diatom flora in the surroundings of Lake Ohrid (46); (ii) Late Miocene-Pleistocene Balkan freshwater diatomites from the Prespa, Piskupstina, Mariovo, Kichevo, Delchevo-Pehchevo, Berovo, Pelagonia, Metohia, and Sofia basins (47, 48); (iii) fossils from the Lake Prespa cores Co1204 (49) and Co1215 (50), the Lake Dojran core Co1260 (51), and the Lake Ohrid cores Co1202 and Lz1120 (17, 44) (fig. S8); and (iv) Pantocsek’s diatom collection at the Hungarian Natural History Museum in Budapest on Neogene fossil deposits in Romania and the Hustedt diatom collection at the Alfred Wegener Institute in Bremerhaven, Germany.

The hierarchical taxonomic classification of diatoms into subphylum, class, subclass, order, family, and genus largely follows AlgaeBase (52). This information was converted into a systematic tree with the R package “ape” 5.0 (53), where the difference between taxonomic ranks equals a branch length of one unit (fig. S4).

Statistical analysis

A total of 201 endemic diatom species have been documented from Lake Ohrid, 169 extant and 32 extinct species (see data S1). Of these, 152 (120 extant and 32 extinct) species were obtained from the DEEP sediment succession. With a retrieval rate of 75.6%, the fossil record used for the current analyses is representative. It covers all taxonomic groups and lifestyles (benthic and planktonic species; data S1). In addition to the 152 endemic species, 218 nonendemic species were found in the sediment succession, 16 before and 202 after lake formation. The latter information was used for the modeling of diversity dependence and the equilibrium diversity analysis.

To take into account the temporal uncertainty of speciation and extinction events due to preservation biases, we used the software package PyRate (54) to obtain a set of potential times for the speciation and extinction of each endemic species. Differences in the frequency of species occurrences due to, for example, differences in population sizes were accounted for by a gamma-distributed sampling rate discretized in four categories (54). Moreover, the frequency of species’ occurrence in the current record may change over time due to ecosystem change altering the preservation potential or due to a more intensive screening during the early lake phase. Therefore, we defined a total of three boundaries between four periods in which the mean sampling rate may differ according to major changes in δ18Olakewater and sampling intensity: 1.150, 0.866, and 0.455 Ma ago. Comparing preservation models (54) using the corrected Akaike information criterion (AICc) suggested a significant better fit to the fossil data of the time-variable Poisson process for fossil sampling than a sampling rate that is either constant (ΔAICc = 72.1) or dependent on the life span of a species (ΔAICc = 425.2). We incorporated age uncertainties for the sediment succession (13) in the inference of speciation and extinction times by generating 100 randomized inputs for the birth-death sampling (BDS) analyses with ages for the 380 diatom samples drawn from a uniform distribution of the mean age of the sample ± the age uncertainty of approximately 2000 years. As described previously (55), the BDS analyses were informed on the endemic species that originated in the basin before the formation of Lake Ohrid 1.36 Ma ago and lacked a robust age-control. They, therefore, did not factor in as speciation events, but the respective 11 extinction events were considered, as they occurred 1.36 to 0 Ma ago. We obtained posterior estimates of speciation and extinction times for each species using Markov Chain Monte Carlo (MCMC) sampling, which we ran for 5,000,000 iterations, sampling every 1000 iterations and omitting the first 100,000 iterations as burn-in.

The accumulation of endemic species richness through time is the cumulative sum of all origination minus the cumulative sum of all extinction times in a single MCMC sample. Endemic richness was averaged over all 100 BDS analyses, and the uncertainty was summarized by the highest posterior density intervals with probability masses of 0.05 to 0.95 incremented by 0.05.

For the 202 nonendemic species that were found 1.36 to 0 Ma ago, we did not use a birth-death model because the taxa are likely to have originated elsewhere, thus making their first appearance in Lake Ohrid through dispersal, not speciation. Using taxa occurrences binned into intervals of 0.3 Ma, their richness through time was reconstructed by estimating per-lineage dispersal (d), local extinction (μ), and sampling (q) rates by a variant of the PyRateDES model for fossil biogeography (56), assuming dispersal into Lake Ohrid from a source pool of constant size. Moreover, we extended the PyRateDES implementation by adding a Gamma-distributed heterogeneity in sampling (mirroring the model we used in the BDS analyses), continuous time-variable dispersal and local extinction rates, and parameter estimation via maximum likelihood (available on The same shifts in sampling rate as for the endemic species and an exponential relationship of dispersal and extinction rates with time were specified. Using the maximum likelihood estimates as a starting parameter for a Bayesian inference, we were able to determine the uncertainty in dispersal, local extinction, and sampling rates after sampling every 100 of 50,000 MCMC iterations and omitting the first 1000 iterations as burn-in. We then inferred the average nonendemic richness through time by running 1000 stochastic diversity simulations based on the estimated dispersal, local extinction, and sampling parameters, using the R package “simDES” 0.1 (57).

To test whether some taxonomic groups of diatoms are more likely to go extinct or to speciate, the strength and significance of a phylogenetic signal in the binary trait extinct versus extant were calculated. The D-statistic (18) indicates a phylogenetic signal if D is significantly smaller than 1 but significantly greater than 0. The hierarchical taxonomic classification of the endemic diatoms and the D implementation of the R package “caper” 1.0.1 (58) with 10,000 permutations were used to examine the existence, strength, and significance of the phylogenetic signal in extinction. Moreover, to test whether extinction and speciation through time altered the hierarchical taxonomic structure of the endemic diatom assemblages, the taxonomic diversity of the endemic species assemblage (59) was calculated as a richness measure, and mean pairwise distances (60) provide a measure of taxonomic clustering. Standardized effect sizes of both indices were obtained by comparing the observed values with their expected distribution according to a null model, which was constructed by shuffling the tip labels in the taxonomic tree to realize random assemblages with the same species richness as our empirical dataset. Ten random sets out of the speciation and extinction times inferred by each of the 100 BDS analyses were extracted, and the R package “picante” 1.7 (61) was used for the index calculation and to obtain 999 realizations of the null model for each of these 10 × 100 sets.

Per-lineage speciation (λ) and per-lineage extinction (μ) rates were constructed from the generations of the BDS analyses using equal-area binning (62) with three events per bin. Expected species longevity was calculated as the reciprocal of the per-lineage extinction rate (19).

The influence of species richness and climate and environmental conditions on diversification rates may change over time, for instance, due to inherent uncaptured changes of the system affecting the dynamic of the species assemblage (63). These confounding effects were incorporated by time-stratifying the analysis of covariate influence on speciation and extinction rates.

We estimated the temporal heterogeneity of speciation and extinction rates in endemic species from three random samples of speciation and extinction times drawn from each of the 100 BDS analyses. We analyzed the times of origination and extinction using a reversible jump (rj) MCMC (54) analysis. Marginal speciation and extinction rates were obtained from 2,000,000 rj-MCMC iterations with sampling every 2500 iterations after discarding the first 50,000 iterations as burn-in.

We explored the robustness of our inferred rate shifts against the incomplete taxon sampling using a simulation approach. We designed scenarios in which the missing species appear for the first time either very early or very late in the history of the lake and added these simulated first appearances to the observed speciation and extinction times. Because we do not know whether these missing species originated after independent colonization events or are the result of intralacustrine cladogenesis, we simulated the first occurrences through (i) a probabilistic sampling of first appearance times where the probabilities decreased or increased through time but sum to 1 (fig. S5A) and (ii) a Yule phylogenetic model (fig. S5B) with decreasing or increasing speciation rates and zero extinction (as all missing taxa are extant), implemented with the Gillespie algorithm in R. We used a relative change in probabilities of first appearance or speciation rate through time from −400 to +400% in steps of 200% to simulate 100 replicates of the missing taxa for each step and added each of these replicates to one random pick of the 300 datasets of speciation and extinction times. We then inferred shifts in speciation and extinction rates with the birth-death model and the colonization-speciation-death model in PyRate and calculated the difference between the age of the shifts inferred from the observed data and the ages obtained from the same data augmented by the simulated appearance of the missing taxa (fig. S5).

Using the same 3 × 100 random sets of origination and extinction times, we inferred whether species richness and climate and environmental conditions influenced speciation and extinction rates and whether the effects differ among time strata. For this, we used the univariate birth-death shift model (63) with linear effects implemented in PyRate. In addition, we implemented in PyRate models of (i) a fixed ecological limit to species diversity and (ii) dynamic, environmentally defined ecological limits to diversity (64) based on lake size indicators (i.e., grain size, δ18Olakewater, and quartz). This was done to test for a diversity-dependent decline in speciation rate and an increase in extinction rate that is independent or dependent on lake size, respectively. Diversity here refers to the sum of endemic and nonendemic species (fig. S3O) and had been rescaled to the range [0,1]. Before the analyses, all other covariates were scaled to a mean of zero and unit variance to compare their correlation strength γ. All unifactorial diversification models were ranked by their fit as quantified through Bayes Factors obtained by thermodynamic integration. This approach down-weighted the contribution of the data on posterior likelihoods via 10 stepping stones, each using 4,000,000 MCMC iterations, sampling every 10,000 iterations after discarding 20,000 iterations as burn-in.

The state of equilibrium diversity was inferred from the ecological limit to diversity (fig. S6, A and E) by predicting at which diversity level speciation and extinction rates equilibrate. Subsequently, we compared this level with the empirical diversity trajectory and calculated the 95% HPD.


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: The Hydrobiological Institute in Ohrid (S. Trajanovski and G. Kostoski), the Hydrometeorological Institute in Tirana (M. Sanxhaku and B. Lushaj), and the Faculty of Geology and Mineralogy, University of Tirana (A. Grazhdani) provided logistic support for the scientific drilling campaign. Drilling was carried out by Drilling, Observation, and Sampling of the Earth’s Continental Crust (DOSECC). Computational resources were provided by the de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI). G. L. Simpson contributed to the discussion on the paleoecological time series analyses, and W. Salzburger on evolutionary processes in ancient lakes. Funding: The SCOPSCO drilling project was funded by the International Continental Scientific Drilling Program (ICDP; 03-2009), the German Ministry of Higher Education and Research (BMBF; 03G0825A), the German Research Foundation (DFG; WI 1902/8, WI 1902/13 and WA 2109/11, WA 2109/13), the University of Cologne, the British Geological Survey (IP-1579-1115), the Italian Consiglio Nazionale delle Ricerche (CNR), the Swiss National Science Foundation (PCEFP3_187012; FN-1749), the Swedish Research Council (2019-04739), and the governments of the republics of North Macedonia and Albania. Author contributions: T.W., E.J., and T.H. designed the study. B.W. initiated and coordinated the SCOPSCO drilling project. T.H. conducted the analyses, supported by D.S. and K.E. A.F., H.V., E.J., A.C., Z.L., T.D., L.S., A.M., K.P., J.H.L., and M.J.L. contributed key datasets and helped with the interpretations of indicator data, supported by F.W.-C., J.M.R., N.O.-R., S.T., X.Z., J.H., N.L., and G.Z. C.A., B.V.B., K.E., T.A.N., C.R.M., B.S., F.P.W., and V.W. coordinated specific parts of the eco-evolutionary interpretations. S.K. and K.L. coordinated the seismic survey of Lake Ohrid. T.W. wrote the paper, with assistance from B.W., T.H., C.R.M., E.J., and B.V.B. All authors contributed to the discussion and interpretation of the data and provided comments and suggestions to the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The raw data generated for this study are provided as data S1 to S3. All other datasets generated and/or analyzed during the current study are available in the GitHub repository at All glass slides with the cleaned sediment samples containing diatoms have been deposited in the University of Giessen Systematics and Biodiversity Collection (UGSB), Giessen, Germany, and are available upon request. Voucher numbers are provided in data S1.

Stay Connected to Science Advances

Navigate This Article