Research ArticleEVOLUTIONARY BIOLOGY

Genomic evidence for two phylogenetic species and long-term population bottlenecks in red pandas

See allHide authors and affiliations

Science Advances  26 Feb 2020:
Vol. 6, no. 9, eaax5751
DOI: 10.1126/sciadv.aax5751

Abstract

The red panda (Ailurus fulgens), an endangered Himalaya-endemic mammal, has been classified as two subspecies or even two species – the Himalayan red panda (A. fulgens) and the Chinese red panda (Ailurus styani) – based on differences in morphology and biogeography. However, this classification has remained controversial largely due to lack of genetic evidence, directly impairing scientific conservation management. Data from 65 whole genomes, 49 Y-chromosomes, and 49 mitochondrial genomes provide the first comprehensive genetic evidence for species divergence in red pandas, demonstrating substantial inter-species genetic divergence for all three markers and correcting species-distribution boundaries. Combined with morphological evidence, these data thus clearly define two phylogenetic species in red pandas. We also demonstrate different demographic trajectories in the two species: A. styani has experienced two population bottlenecks and one large population expansion over time, whereas A. fulgens has experienced three bottlenecks and one very small expansion, resulting in very low genetic diversity, high linkage disequilibrium, and high genetic load.

INTRODUCTION

The delimitation of species, subspecies, and population is fundamental for insights into the biology and evolution of species and effective conservation management. Traditionally, species, subspecies, or population delimitation is based on reproductive isolation, geographic isolation, and/or morphological differences and does not consider the role of gene flow. The misclassification of basal taxa will result in erroneous or misleading conclusions about the species’ evolutionary history and adaptive mechanisms, and potentially inappropriate conservation management decisions for threatened species (1, 2).

The red panda (Ailurus fulgens), an endangered Himalaya-endemic mammal, was once widely distributed across Eurasia but is now restricted at the southeastern and southern edges of the Qinghai-Tibetan Plateau within an altitude range of 2200 to 4800 m (3). On the basis of differences in morphology (e.g., skull morphology, coat color, and tail ring) and geographic distribution (Fig. 1 and table S1), red pandas are classified into two subspecies, the Himalayan subspecies (A. f. fulgens Cuvier, 1825) and the Chinese subspecies (A. f. styani Thomas, 1902) (4, 5). Morphologically, the Chinese subspecies has much larger zygomatic breadth, the greatest skull length, stronger frontal convexity, more distinct tail rings, and redder face coat color with less white on it (Fig. 1) (5, 6). On the basis of these morphological differences, C. Groves even proposed that the two subspecies should be updated as two distinct species: the Himalayan red panda (A. fulgens) and the Chinese red panda (A. styani) (6). The Nujiang River is considered the geographic boundary between the two species (7). The Himalayan red panda is distributed in Nepal, Bhutan, northern India, northern Myanmar, and Tibet and western Yunnan Province of China, while the Chinese red panda inhabits Yunnan and Sichuan provinces of China. The subspecies or species classification has remained controversial largely due to the lack of genetic evidence, and their distribution boundary may also be inaccurate because of the morphological similarity of red pandas on both sides of the Nujiang River (6, 8, 9). For instance, the skull size and morphology of red pandas from southeastern Tibet were more similar to those of the Chinese red panda than the Himalayan red panda (6). Although previous studies attempted to use mitochondrial DNA or microsatellite markers to explore this problem, the very small sample size from the Himalayan red panda and the limited ability of the molecular markers resulted in failure to resolve the species delimitation (1012). Next-generation sequencing technology not only provides whole-genome data but also enables the identification of Y chromosome sequences in nonmodel animals, which were difficult to obtain previously (13, 14). Thus, it is now feasible to use whole genomes, Y chromosomes, and mitochondrial genomes to comprehensively delimit species, subspecies, and populations. Here, with sufficient sampling of the Himalayan red panda, we performed whole-genome resequencing, Y chromosome single-nucleotide polymorphism (SNP) genotyping, and mitochondrial genome assembly of wild red pandas covering most of the distribution ranges of the two species, aiming to clarify species differentiation, population divergence, demographic history, and the impacts of population bottlenecks on genetic evolutionary potential.

Fig. 1 Distinguishing morphological differences between two red panda species.

(A and C) The Chinese red panda. (B and D) The Himalayan red panda. (A and B) The face coat color of the Chinese red panda is redder with less white on it than that of the Himalayan red panda. (C and D) The tail rings of the Chinese red panda are more distinct than those of the Himalayan red panda, with the dark rings being more dark red and the pale rings being more whitish. Photo credit: (A) Yunfang Xiu, Straits (Fuzhou) Giant Panda Research and Exchange Center, China; does not require permission. (B) Arjun Thapa, Institute of Zoology, Chinese Academy of Sciences. (C) Yibo Hu, Institute of Zoology, Chinese Academy of Sciences. (D) Chiranjibi Prasad Pokheral, Central Zoo, Jawalkhel, Lalitpur, Nepal; does not require permission.

RESULTS AND DISCUSSION

Genomic evidence of two phylogenetic species in red pandas

We performed whole-genome resequencing for 65 wild red pandas, with an average of 98.7% genome coverage and 13.9-fold sequencing depth for each individual based on the red panda reference genome (belonging to the Chinese red panda) of 2.34 Gb (15). Using the SNP-calling strategy of the Genome Analysis Toolkit (GATK), we identified a total of 4,932,036 SNPs for further analysis (table S4). On the basis of the whole-genome SNPs, the phylogenetic tree, principal components analysis (PCA), and ADMIXTURE results revealed substantial genetic divergence between the two species, providing the first genomic evidence of species differentiation (Fig. 2, B to D). The middle Himalaya population (MH) belonging to the Himalayan red panda was first divergent from the populations of the Chinese red panda (Fig. 2, B and D). Furthermore, four distinct genetic populations were identified: MH (n = 18), eastern Himalaya-Gaoligong (EH-GLG, n = 3 and 13, respectively), Xiaoxiangling-Liangshan (XXL-LS, n = 12 and 8, respectively), and Qionglai (QL, n = 10) (Fig. 2, B to D; fig. S1; and table S5). The individual SLL1 is the only sampled red panda from the Saluli Mountains (SLL), and its genetic assignment implied gene flow between the SLL population and its adjacent XXL and GLG populations (Fig. 2C). Because of the very small sample size, SLL1 was excluded in any population-level analyses. Traditionally, MH, EH, and the GLG individuals at the western side of the Nujiang River were classified as the Himalayan red panda, while the GLG individuals at the eastern side of Nujiang River, XXL, LS, and QL belonged to the Chinese red panda (7). Our results did not support the Nujiang River as the species distribution boundary because the EH and part of the GLG population at the western side of the Nujiang River clustered into a genetic population with other GLG individuals at the eastern side (Fig. 2, B to D). This EH-GLG genetic clustering was supported by morphological evidence that the morphology of red panda skulls from southeastern Tibet (namely, the EH population in this study) was more similar to that of the Chinese red panda than the Himalayan red panda (6). In addition, two individuals from Myanmar (GLG5 and GLG6) also clustered within the EH-GLG genetic cluster, suggesting that the Myanmar population belongs to the Chinese red panda. Thus, we infer that the Yalu Zangbu River, the largest geographic barrier to dispersal between the two species, may be the potential boundary for species distribution (Fig. 2A), although additional samples need to be collected from Bhutan and India to verify this inference.

Fig. 2 Population genetic structure based on whole genomes, Y chromosome SNPs, and mitochondrial genomes of red pandas.

(A) The geographic distribution of wild red panda samples under the background of habitat suitability. Red, QL population; purple, XXL-LS population; blue, SLL population; pink, EH-GLG; dark red, MH. (B) Maximum likelihood phylogenetic tree based on whole-genome SNPs, with the ferret as the outgroup. The values on the tree nodes indicate the bootstrap support of ≥50%. (C) ADMIXTURE result based on whole-genome SNPs with K = 2 to 7. (D) PCA result based on whole-genome SNPs. (E) Network map based on eight Y chromosome SNP haplotypes. (F) Network map based on 41 mitochondrial genome haplotypes.

Within the Chinese red panda, we further found population genetic differentiation. EH-GLG first diverged with XXL-LS-QL and then QL separated from XXL-LS (Fig. 2, B and C). Notably, we did not detect genetic substructure within EH-GLG spanning the famous Three Parallel Rivers (Nujiang River, Lancangjiang, and Jinshajiang), suggesting that the three large rivers did not hinder the gene flow of red pandas. This result is consistent with data from microsatellite markers (12).

Our Y chromosome SNP and mitochondrial genome results also supported the substantial divergence between the two species (Fig. 2, E and F; figs. S2 and S3; and tables S6 to S8). The haplotype networks and phylogenetic trees of both eight Y chromosome SNP (Y-SNP) haplotypes from 49 male individuals and 41 mitochondrial genome haplotypes from 49 individuals showed that the MH haplotypes (Himalayan red panda) clustered together and separated from the haplotypes of the Chinese red panda, highlighting the notable genetic divergence between the two species. In summary, regardless of the whole-genome SNPs, Y-SNPs, or mitochondrial genomes, notable genetic differentiation was found between the two species. Our comprehensive investigations reveal two evolutionarily significant units in red pandas. Under the phylogenetic species concept (16), it is reasonable to propose two species: the Himalayan red panda (A. fulgens) and the Chinese red panda (A. styani). This phylogenetic species classification was supported by their morphological differences (6).

The Y chromosome SNP and mitochondrial genome results revealed a female-biased gene flow pattern in red pandas (Fig. 2, E and F). Within the Chinese red panda, we observed different phylogeographic patterns between the mitochondrial genome and Y chromosome. The distribution of mitochondrial haplotypes was mixed and was not associated with the geographic sources of the individuals. By contrast, the distribution of Y-SNP haplotypes demonstrated an obvious phylogeographic structure: The haplotypes of EH-GLG were separated from those of XXL-LS-QL, and no shared Y-SNP haplotypes were found. These contrasting phylogeographic patterns reflected a female-mediated historical gene flow, implying female-biased dispersal and male-biased philopatry in red pandas. This dispersal pattern differs from the male-biased dispersal found in most mammals (17) but is similar to that of another bamboo-eating mammal, the giant panda (18, 19).

Species/population demographic and divergence history

The pairwise sequentially Markovian coalescent (PSMC) analysis results showed that the demographic history of red panda could be traced back to approximately 3 million years (Ma) ago, and the two red panda species experienced obviously different demographic histories (Fig. 3A). The Chinese red panda from EH-GLG, XXL-LS, and QL experienced similar demographic trajectories: two population bottlenecks and one large population expansion. This species suffered from an obvious population decline approximately 0.8 Ma ago, which coincided with the occurrence of the Naynayxungla Glaciation (0.78 to 0.5 Ma ago). The population decline resulted in the first bottleneck approximately 0.3 Ma ago, mostly likely caused by the Penultimate Glaciation (0.3 to 0.13 Ma ago) (20). After the glaciations, the populations started to expand and reached a climax approximately 50 thousand years (ka) ago. Then, the arrival of the last glaciations again resulted in rapid population decline, and the second bottleneck occurred during the Last Glacial Maximum (~20 ka ago) (20).

Fig. 3 Demographic history, divergence, and admixture of two red panda species and their populations.

(A) PSMC analysis revealed different demographic histories of the two species, with a generation time (g) of 6 years and a mutation rate (μ) of 7.9 × 10−9 per site per generation. The time axis is logarithmic transformed. (B) Fastsimcoal2 simulation reconstructed the divergence, admixture, and demographic history of red panda species and populations. The time axis is logarithmic transformed, and the number of migrants per year between two adjacent populations is shown beside each arrow. (C) TreeMix analysis detected significant gene flow from the EH-GLG to XXL-LS populations. s.e., standard error.

The Himalayan red panda from MH underwent a demographic history differing from that of the Chinese red panda: three population bottlenecks and one small expansion (Fig. 3A). The difference began with the first population bottleneck approximately 0.25 Ma ago. In contrast to the subsequent population recovery of the Chinese red panda, the Himalayan red panda continued to decrease and then went through a second bottleneck approximately 90 ka ago. Afterward, the population started to increase very slowly, but soon the population again decreased due to the last glaciations. The PSMC results showed that even at the climax of population growth (~50 ka ago), the effective population size of the Himalayan red panda was only approximately 35% that of the Chinese red panda. In addition, the Bayesian skyline plot (BSP) analyses based on mitochondrial genomes indicated that both species experienced recent population declines most likely caused by the Last Glacial Maximum, supporting the PSMC results (fig. S4). The different demographic trajectories may result from geographic and climate differences. The Chinese red panda was mainly distributed in the Hengduan Mountains rather than the platform or adjacent edges of the Qinghai-Tibetan Plateau and thus might have suffered less impact of the Pleistocene glaciations. The interglacial warm climate and the vast region of the Hengduan Mountains might have facilitated the rapid population expansion of the Chinese red panda (3). By contrast, the Himalayan red panda lived in the adjacent southern edge of the Qinghai-Tibetan Plateau and might have suffered severe impact of the Pleistocene glaciations. Even during the interglacial period, the geographic proximity to glaciers and limited potential habitat might have restricted this species’ population recovery (21). In Holocene, the climate might have less impact on red panda populations (21), while increasing human activities became the main factor driving recent red panda population declines, which have been detected by microsatellite marker-based Bayesian population simulations (12).

We further uncovered the species/population divergence history using Fastsimcoal2 simulation. On the basis of the comparison of alternative population divergence models, we determined the best-support divergence/demography model (Fig. 3B, fig. S5, and table S9). The divergence between the Himalayan (MH) and Chinese red pandas (EH-GLG, XXL-LS, and QL) occurred 0.22 Ma ago, coincident with the first population bottleneck of the two species caused by the Penultimate Glaciation. Next, EH-GLG and XXL-LS-QL diverged 0.104 Ma ago. The divergence may have resulted from the widely unsuitable habitat located in the Daxueshan and SLL Mountains (21). Last, XXL-LS and QL diverged 26 ka ago, which was most likely caused by the Last Glacial Maximum. After the population divergence, MH, EH-GLG, and QL suffered from population decline, whereas XXL-LS experienced population growth. Asymmetrical gene flow was detected between adjacent divergent populations (Fig. 3B). After the early divergence between the two species, more gene flow occurred from the Chinese red panda to the Himalayan red panda. Regardless of historical or current gene flow, EH-GLG seemed to be the source population of gene flow with more gene flow into other adjacent populations, among the four genetic populations (Fig. 3B). This implies that EH-GLG might be the historical dispersal source of red pandas. TreeMix analysis also detected significant gene flow from EH-GLG to XXL-LS (Fig. 3C and fig. S6), consistent with the Fastsimcoal2 result.

Genomic variation, linkage disequilibrium, and genetic load

Whole-genome variation analysis revealed that EH-GLG had the highest genetic diversity (π = 6.994 × 10−4, θw = 5.271 × 10−4), whereas the Himalayan red panda (MH) had the lowest genetic diversity (π = 3.523 × 10−4, θw = 2.428 × 10−4) (Fig. 4A and table S10). Y-SNP and mitochondrial genomic variations also showed that the Himalayan red panda (MH) had the lowest genetic variations (Fig. 4A and table S10). Genome-wide linkage disequilibrium (LD) analysis demonstrated that the Himalayan red panda (MH) had higher level of LD and slower LD decay with a reduced R2 correlation coefficient becoming stable at a distance of approximately 100 kb, whereas the populations of the Chinese red panda exhibited rapid LD decay with a reduced R2 becoming stable at a distance of approximately 40 kb (Fig. 4B). The genomic variations and LD patterns imply different demographic histories of the two species and, in particular, reflect the genetic impacts of long-term population bottlenecks in the Himalayan red panda.

Fig. 4 Genetic variation, genetic load, and natural selection of red pandas.

(A) Genetic variations (nucleotide diversity) of different species and populations based on whole-genome SNPs, mitochondrial genomes, and Y chromosome SNPs. (B) LD of the four populations. (C) Ratios of homozygous derived deleterious or LoF variants to homozygous derived synonymous variants for different populations. The horizontal bars denote population means. (D) Distribution of θπ ratios (non-MH/MH) and Z(FST) values. Data points located to the left of the left vertical dashed lines and the right of the right vertical dashed lines (corresponding to the 5% left and right tails of the empirical θπ ratio distribution, respectively) and above the horizontal dashed line [the 5% right tail of the empirical Z(FST) distribution] were identified as selected regions for the MH (the Himalayan red panda, green points) and non-MH (the Chinese red panda, blue points) populations.

We further analyzed the relationship between demographic history and genetic loads carried by different red panda populations, as deleterious variations should be removed more efficiently in larger populations (22, 23). We investigated the distributions of four types of variations [loss of function (LoF), deleterious, tolerated, and synonymous mutations] in protein-coding genes. We found that the ratios of homozygous derived deleterious or LoF variants to homozygous derived synonymous variants were higher in the Himalayan red panda (MH) than in the Chinese red panda; by contrast, the ratios of nonhomozygous derived deleterious or LoF variants to nonhomozygous derived synonymous variants were comparable between the two species (Fig. 4C). This genetic load pattern showed that the Himalayan red panda experiencing long-term population bottlenecks carried more homozygous LoF and deleterious mutations and thus suffers a higher risk of continuing population decline.

Genomic signatures of selection and local adaptation

Considering that the two red panda species live in different geographic ranges and climate environments and experienced long-term genetic divergence, we mainly focused on the identification of genomic signatures of selection and local adaptation between the two species. Using FST and θπ methods, we identified 146 genes with top 5% maximum FST values and top 5% minimum θπ1π2 values in the Himalayan red panda (MH) (Fig. 4D and table S11). The functional enrichment found that some genes were enriched in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of vascular smooth muscle contraction (ko04270, P = 1.18 × 10−8) and melanogenesis (ko04916, P = 2.36 × 10−4) and the gene ontology (GO) term of positive regulation of endothelial cell proliferation (GO:0001938, P = 0.0197) (tables S12 and S13). The selection of these genes might be related with the distinct coat color of the Himalayan red panda and the adaptation to hypoxia and microclimate in high-elevation habitat (6).

In the Chinese red panda (EH-GLG, XXL-LS, and QL), we identified 178 genes under selection (Fig. 4D and table S14), which were partly enriched in the nonhomologous end-joining pathway (ko03450, P = 9.89 × 10−3) and the GO terms of regulation of response to DNA damage stimulus (GO:2001020, P = 3.35 × 10−3), cellular response to x-ray (GO:0071481, P = 3.69 × 10−3), double-strand break repair via nonhomologous end joining (GO:0006303, P = 0.0189), endothelial cell differentiation (GO:0045446, P = 0.0187), and regulation of response to oxidative stress (GO:1902882, P = 0.021) (tables S15 to S16). These selected genes were most likely involved in the adaptation to high ultraviolet radiation and hypoxia and microclimate in the Hengduan Mountains where the Chinese red panda mainly lives. Considering the recent divergence (0.22 Ma ago) between the Himalayan and Chinese red pandas, the ancestor of the two species should have adapted to a high-elevation environment before divergence because the latest and most significant uplift of the Qinghai-Tibetan Plateau have occurred 1.1 to 0.6 Ma ago and caused the altitude to increase up to 3000 m (24). Finding their common genetic mechanisms for high-elevation adaptation proved to be difficult based on our comparison of population genome data. The above functional enrichment results more likely reflected the adaptation of both red pandas to the local microclimate and habitat environment. Recent study showed that the two red panda species have separate climatic spaces dominated by precipitation-associated variables in the Himalayan red panda and by temperature-associated variables in the Chinese red panda (21).

CONCLUSIONS

Our analyses of whole genomes, Y chromosomes, and mitochondrial genomes revealed substantial genetic differentiation between the Himalayan and Chinese red pandas and provide the most comprehensive genetic evidence of species delimitation. When combined with previously identified morphological differences (6), the classification of two phylogenetic species is well defined. Our genomic evidence rejected the previous viewpoint of the Nujiang River as the species distribution boundary and revealed that the red pandas living in southeastern Tibet and northern Myanmar belong to the Chinese red panda, while the red pandas inhabiting southern Tibet belong to the Himalayan red panda together with the Nepalese individuals. We infer that the Yalu Zangbu River is most likely the geographic boundary for species distribution because this river is the largest geographic barrier between the two species. However, further verification with samples from Bhutan and India is needed. The delimitation of two red panda species has crucial implications for their conservation, and effective species-specific conservation plans could be formulated to protect the declining red panda populations (25). For a long time, the unclear status of species classification and distribution boundary hindered the scientific design of conservation measures. Because of the wrong distribution boundary, the EH-GLG population was split to belong to two species, which could result in inappropriate conservation measures for EH-GLG population and possibly detrimental interbreeding between the two species in captivity. Within the Chinese red panda, our results revealed three genetic populations: EH-GLG, XXL-LS, and QL, suggesting three management units for scientific conservation. In particular, the EH-GLG population spans southeastern Tibet and northwestern Yunnan of China, northern Myanmar, and northeastern India, which needs transboundary international cooperation for effective conservation. The QL population has the lowest genomic diversity and thus needs more attention to the conservation of its genetic evolutionary potential.

Our findings uncover the genetic impacts of long-term population bottlenecks in the Himalayan red panda, thus providing critical insights into the genetic status and evolutionary history of this poorly understood species. The long-term population bottleneck severely impaired its genetic evolutionary potential, resulting in the lowest genetic diversity but higher genetic load. The Himalayan red panda was estimated to have a small population size (26), and thus maintaining and increasing this species’ population size and genetic diversity are critical for their long-term persistence. In particular, the Himalayan red panda population spans southern Tibet of China, Nepal, India, and Bhutan, which needs urgent transboundary international cooperation to protect this decreasing species.

Our findings reveal that in addition to Pleistocene glaciations and recent human activity, female-biased gene flow has played an important role in shaping the demographic trajectories and genetic structure of red pandas. As a Himalaya-endemic species, our findings will also help understand the phylogeographic patterns of fauna distributed in the Himalaya-Hengduan Mountains biodiversity hotspot.

MATERIALS AND METHODS

Sample collection

We collected blood, muscle, and skin samples of 65 wild red pandas from seven main geographic populations for whole-genome resequencing. Of the 65 individuals, 18 individuals were from the middle Himalayan Mountains (MH), 3 from the eastern Himalayan Mountains (EH), 13 from the Gaoligong Mountains (GLG), 1 from the Saluli Mountains (SLL), 12 from the Xiaoxiangling Mountains (XXL), 8 from the Liangshan Mountains (LS), and 10 from the Qionglai Mountains (QL) (Fig. 2A and table S2). For Y chromosome SNP genotyping, we first used red panda–specific sex determination primers (27) to identify the sexes of the available wild samples. As a result, 49 wild male red pandas were used, including 13 from the MH population, 2 from EH, 10 from GLG, 8 from XXL, 5 from LS, and 11 from QL (table S2). For mitochondrial genome assembly, we successfully assembled 49 complete mitochondrial genomes from the whole-genome resequencing data for 49 of 65 wild red pandas, including 13 from MH, 2 from EH, 9 from GLG, 12 from XXL, 4 from LS, and 9 from QL (table S2).

Whole-genome resequencing and SNP calling

We extracted genomic DNA from blood, muscle, and skin samples using the QIAGEN DNeasy Blood & Tissue Kit. Then, we constructed genomic libraries of insert size 200 to 500 base pairs and performed genome resequencing of the average 10× for each individual using the Illumina HiSeq 2000 and X Ten sequencing platforms (table S3). To identify population-level SNPs, the Illumina sequencing reads were aligned to the red panda reference genome (15) with Burrows-Wheeler Alignment (BWA) tool v0.7.8 (28), and polymerase chain reaction (PCR) duplicates were removed by SAMtools v0.1.19 (29). The UnifiedGenotyper method in GATK v3.1-1-g07a4bf8 software (30) was used for SNP calling with default parameters across the 65 individuals. To obtain reliable SNP, we performed a filtering step with the following set of parameters: depth ≥ 4, MQ ≥ 40, FS ≤ 60, QD ≥ 4, maf ≥ 0.05, and miss ≤ 0.2.

Identification and genotyping of Y chromosome SNPs

Previously, we de novo sequenced a wild male red panda genome (15), which enabled us to develop Y chromosome SNPs. Using a genome synteny searching strategy and the female dog genome (boxer breed) and the dog male-specific Y chromosome sequences (Doberman breed) as the reference, Fan et al. recently identified a set of nine male-specific Y chromosome scaffolds with a total length of 964 kb from the male red panda genome assembly (table S5) (31). Using the 964-kb male-specific Y chromosome scaffolds as the reference, we aligned the whole-genome resequencing reads of 18 male red pandas to the reference genome using BWA and then performed SNP calling using SAMtools and GATK. As a result, a total of 63 Y-SNPs were identified. Furthermore, we screened 22 Y-SNPs with confirmed polymorphism and good PCR/sequencing performance. Then, we genotyped these Y-SNPs for a total of 49 male red pandas. With the genotyping of more individuals, we found five additional Y-SNPs. As a whole, the dataset of 49 male red pandas with 27 Y-SNPs was used for subsequent paternal population genetics analysis (tables S2, S6, and S7).

Mitochondrial genome assembly

We used the Assembly by Reduced Complexity method (32) to assemble mitochondrial genome with the published red panda mitochondrial genome as a reference (33) (GenBank accession: AM711897). First, the sequencing reads of each of the 65 red pandas were mapped onto the mitochondrial genome reference. Second, the mitochondrial genome reference was classified into multiple bins, and the alignment results were used to distribute reads into specific bins. Third, assembly was performed for each bin to produce contigs. Last, the initial reference was replaced with assembled contigs, and the above processes were iterated until stopping criteria have been met (32). The mitochondrial genome sequence used lastly excluded the highly repetitive sequences within the D-loop region.

Population genetic structure based on whole genomes, Y-SNPs, and mitochondrial genomes

We conducted PCA for whole-genome SNPs using the program GCTA v1.24.2 (34). A maximum likelihood phylogenetic tree was constructed by RAxML software (35) with the GTRGAMMA model and 100 bootstraps, and the ascertainment bias correction was performed to correct for the impact of invariable sites in the data. Ferret was used as the outgroup (36). Population genetic structure was inferred by ADMIXTURE v1.23 software (37) with default settings. We did not assume any prior information about the genetic structure and predefined the number of genetic clusters (K) from two to seven. We used POPART v1.7 (38) to construct a median-joining network for the Y-SNP haplotypes and mitochondrial genome haplotypes. We constructed the phylogenetic tree based on mitochondrial genomes of 15,238 bp (excluding the D-loop region) using BEAST v1.8.2 (39) with ferret as the outgroup. The best substitution model of GTR + I was selected on the basis of the Bayesian Information Criterion by ModelGenerator v0.85 (40). A strict clock rate was selected on the basis of the assessment of coefficient of variation. A total of 8 × 108 iterations were implemented with 10% burn-ins. The BEAST running results were assessed by Tracer v1.5 and were annotated by TreeAnnotator v1.10. We constructed the phylogenetic tree based on Y-SNPs data using the maximum likelihood method implemented in RAxML (35), with the ascertainment bias correction and ferret as the outgroup.

Species/population demographic and divergence history

To reconstruct the detailed demographic history of each red panda population, we applied the simulation PSMC v0.6.4-r49 (41) to the whole diploid genome sequences, with the following set of parameters: -N 30 –t 15 –r 5 -p 4 + 25*2 + 4 + 6. We excluded sex-chromosome sequences of the red panda genome by aligning the red panda genome with the dog genome. We selected two to three high-depth sequenced individuals from each population for PSMC analysis (table S3). We estimated the nucleotide mutation rate of red panda using ferret as the comparison species and the following formula: μ = D × g/2T, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species (42). In this study, the generation time (g) was set to 6 years (26), the estimated divergence time was set to 39.9 Ma ago (15), and D was estimated to be 0.10558. On the basis of the above formula and the corresponding values, a mutation rate of 7.9 × 10−9 mutations per site per generation was estimated for the red panda. In addition, we performed BSP analyses based on mitochondrial genomes of 15,994 bp for two species separately, using BEAST v1.8.2. The best substitution model of HKY + I was selected by ModelGenerator v0.85. A strict clock rate was selected with a nucleotide substitution rate (43) of 1.9 × 10−8. A total of 8 × 108 iterations were implemented with 10% burn-ins. The BEAST running results were assessed, and the BSP plots were produced by Tracer v1.5.

We used the flexible and robust simulation-based composite-likelihood approach implemented in Fastsimcoal2 v2.5.2.21 (44) to infer species/population divergence and demographic history with the following parameters: -n 100000 -N 100000 -d -M 0.001 -l 10 -L 40 -q --multiSFS -C10 -c8. Because of the memory limit of Fastsimcoal2 running, we selected 55 individuals among 65 red pandas for simulation analysis (table S2). Four alternative population divergence and demographic models were explored. For each model, we ran the program 50 times with varying starting points to ensure convergence and retained the fitting with the highest likelihood. The best model was selected through the maximum value of the likelihoods. Parametric bootstrap estimates were obtained on the basis of 100 simulated data sets (table S9). In addition, we performed population-level admixture analysis for detecting gene flow among genetic populations using the TreeMix method (45) with the following running parameters: treemix –bootstrap –k 1000 –se –noss –m 1~5.

Estimate of genetic variations and LD

For whole-genome data, the nucleotide diversity (π) (46) and Watterson’s estimator (θw) (47) of each genetic population were calculated using VariScan v2.0.3 (48). A sliding window approach was used with a 50-kb window sliding in 10-kb steps. We estimated the genetic diversity for the mitochondrial genome data of 15,994 bp and Y-SNPs data using DNASP v5.10.01 (49). To assess the LD pattern in red pandas, the correlation coefficient (R2) between any two loci in each genetic population was calculated using vcftools v0.1.14 (50). Parameters were set as follows: --ld –window -bp 500000 –geno -r2. Average R2 values were calculated for pairwise markers with the same distance.

Assessment of genetic load

We used ANNOVAR (51) to annotate and classify the effects of SNP variants on protein-coding gene sequences. Then, the coding sequence variants were classified as LoF, missense, and synonymous variants. LoF variant denoted variants with gain of a stop codon. The missense variants were further categorized as deleterious and tolerated missense mutations by SIFT 4G (52). We determined the ancestral allele at each SNP position through comparison with the ferret genome (36). To detect the genetic load of each red panda population, for each individual, we counted the relative proportions of homozygous ancestral, heterozygous, and homozygous derived alleles for LoF, deleterious, tolerated, and synonymous variants, respectively. Furthermore, we calculated the ratio of homozygous derived LoF variants (or deleterious variants) to homozygous derived synonymous variants and the ratio of nonhomozygous derived LoF variants (or deleterious variants) to nonhomozygous derived synonymous variants for each individual.

Genomic signatures of selection and local adaptation

In general, positive selection gives rise to lower genetic diversity within populations and higher genetic differentiation between populations (53). The genetic differentiation index FST (54) and the average proportion of pairwise mismatches over all compared sequences θπ (55) have been widely used to detect selection (53). To detect selection signals possibly associated with local adaptation, we used a sliding-window method (50-kb windows with 25-kb increments) to calculate the genome-wide distribution of FST values and θπ ratios for the two species, implemented in vcftools v0.1.14. We applied z transformation for FST values and log2 transformation for θπ ratios and considered the windows with the top 5% Z(FST) and log2π ratio) values simultaneously as the candidate outliers under strong selection. All outlier windows were assigned to corresponding SNPs and genes. We used the GeneTrail2 method (56) to perform KEGG pathway and GO term enrichment analyses for selected genes located in specific regions. Each significantly enriched category included at least two genes, and the hypergeometric test was used to estimate significance (P < 0.05).

SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/6/9/eaax5751/DC1

Fig. S1. PCA plot of red panda whole-genome SNPs data, with PC1, PC2, and PC3 explaining 28.5, 4.1, and 3.6% of the observed variations, respectively.

Fig. S2. Phylogenetic tree based on 41 mitochondrial genome haplotypes, showing two significant species lineages (A. fulgens and A. styani).

Fig. S3. Phylogenetic tree based on eight Y chromosome SNPs haplotypes, showing two significant species lineages (A. fulgens and A. styani).

Fig. S4. Bayesian skyline plot (BSP) analysis results based on mitochondrial genomes.

Fig. S5. Four alternative population divergence models for Fastsimcoal2 simulations, with the maximum estimated likelihood values shown.

Fig. S6. Residual fit from the maximum likelihood tree estimated by TreeMix.

Table S1. Summary of the morphological differences between the Himalayan and Chinese red pandas.

Table S2. Sample information for whole-genome resequencing, Y chromosome SNP genotyping, mitochondrial genome assembly, and Fastsimcoal2 analysis.

Table S3. Summary of whole-genome resequencing data for 65 red panda individuals that include the individuals for PSMC analysis.

Table S4. Summary of SNP calling based on 65 red panda individuals.

Table S5. Cross-validation error result for varying values of K in the ADMIXTURE analysis.

Table S6. PCR primer information for validating the six male-specific Y-scaffolds of red pandas.

Table S7. PCR primer information for amplifying the SNPs on the male-specific Y-scaffolds.

Table S8. Eight Y-SNP haplotypes identified from 27 Y-SNPs of 49 male red panda individuals.

Table S9. Confidence intervals of key parameters for the best population divergence and demographic model estimated by Fastsimcoal2.

Table S10. Genetic diversity of whole genome, Y chromosome, and mitochondrial genome for different species and populations of red pandas.

Table S11. The 146 genes under selection with top 5% maximum FST values and top 5% minimum θπ1π2 values in the Himalayan red panda (MH).

Table S12. Significantly enriched KEGG pathways for the 146 genes under selection in the Himalayan red panda (MH).

Table S13. Significantly enriched GO terms of biological processes for the 146 genes under selection in the Himalayan red panda (MH).

Table S14. The 178 genes under selection with top 5% maximum FST values and top 5% minimum θπ1π2 values in the Chinese red panda (EH-GLG, XXL-LS, and QL).

Table S15. Significantly enriched KEGG pathways for the 178 genes under selection in the Chinese red panda (EH-GLG, XXL-LS, and QL).

Table S16. Significantly enriched GO terms of biological processes for the 178 genes under selection in the Chinese red panda (EH-GLG, XXL-LS, and QL).

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.

REFERENCES AND NOTES

Acknowledgments: We thank Qing You for help with genomic data analysis. We also thank the Department of National Park and Wildlife Conservation for granting research permit in protected areas in Nepal and Center for Biotechnology of Agriculture and Forestry University for providing laboratory space and equipment for molecular sample process. We also thank Dunwu Qi and Peng Chen for help with sample collection. Funding: This study was funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB31000000), the National Natural Science Foundation of China (31821001, 31470441, 31822050, and 91531302), the Biodiversity Survey, Monitoring and Assessment Project of Ministry of Ecology and Environment of China (2019HB2096001006), the Second Tibetan Plateau Scientific Expedition and Research Program (2019QZKK0501), and the Youth Innovation Promotion Association, CAS (2016082). Author contributions: F.W. conceived and supervised the project. Y.H., A.T., and T.M. performed the sample collection and DNA preparation. Y.H., A.T., H.F., Q.W., and S.M. performed the data analyses. Y.H., D.Z., B.W., M.L., and L.Y. performed the molecular experiments. Y.H. wrote the manuscript with input from F.W. and A.T. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The raw genome resequencing data and mitochondrial genome assemblies reported in this paper have been deposited in the Genome Sequence Archive in Beijing Institute of Genomics (BIG) Data Center, BIG, Chinese Academy of Sciences, under accession number CRA001419 (raw genome resequencing data) and GWHAAGL00000000–GWHAAHZ00000000 (mitochondrial genome assemblies) that are publicly accessible at http://bigd.big.ac.cn/gsa. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
View Abstract

Stay Connected to Science Advances


Editor's Blog

Navigate This Article