Skip to main content

Genomic evidence for low genetic diversity but purging of strong deleterious variants in snow leopards

Abstract

Background

Long-term persistence of species with low genetic diversity is the focus of widespread attention in conservation biology. The snow leopard, Panthera uncia, is a big cat from high-alpine regions of Asia. However, its subspecies taxonomy, evolutionary history, evolutionary potential, and survival strategy remain unclear, which greatly hampers their conservation.

Results

We sequence a high-quality chromosome-level genome of the snow leopard and the genomes of 52 wild snow leopards. Population genomics reveal the existence of two large genetic lineages in global snow leopards, the northern and southern lineages, supported by the biogeography. The Last Glacial Maximum drove the divergence of two lineages. Microclimate differences and large rivers between the western and central Himalayas likely maintain the differentiation of two lineages. EPAS1 is positively selected in the southern lineage with almost fixed amino acid substitutions and shows an increased allele frequency with elevation. Compared to the southern lineage, the northern lineage exhibits a lower level of genomic diversity and higher levels of inbreeding and genetic load, consistent with its recent population decline. We find that snow leopards have extremely low genomic diversity and higher inbreeding than other Carnivora species; however, strong deleterious mutations have been effectively purged in snow leopards by historical population bottlenecks and inbreeding, which may be a vital genetic mechanism for their population survival and viability.

Conclusions

Our findings reveal the survival strategy of a species with low genetic diversity and highlight the importance of unveiling both genetic diversity and genetic burden for the conservation of threatened species.

Peer Review reports

Background

The snow leopard (Panthera uncia), an elusive and vulnerable feline mammal, is endemic to the Qinghai–Tibet Plateau and surrounding mountain ranges that span 12 countries. It is also the largest felid inhabiting the highest elevations on Earth. The global population size of wild snow leopards is estimated at 4678–8745 [1], with approximately 50% of them residing in China [2]. Despite being downlisted from “Endangered” to “Vulnerable” by the IUCN in 2017, snow leopards continue to face significant threats from anthropogenic impacts, including climate change [3], habitat fragmentation [4], and human activities [5, 6].

Clear intraspecific taxonomy is the basis of scientific conservation measures. However, there has been a long controversy about the intraspecific taxonomy of snow leopards. Kitchener et al. [7] proposed that the global snow leopards belong to one population without subspecies classification. Janecka et al. [8] classified the global snow leopards into three subspecies: northern subspecies (the Altai–Sayan regions), western subspecies (Tian Shan, Pamir, trans-Himalaya regions), and central subspecies (core Himalaya and Tibetan Plateau regions) after analyzing 33 microsatellite loci from 70 individuals. The unclear intraspecific taxonomy would hinder the design of scientific conservation measures, such as classification of management units, animal reintroduction, and captive breeding. Furthermore, the genetic status of wild snow leopards on a global scale remains largely unknown. Although some regional conservation genetic studies and the assessment of genomic heterozygosity based on a single resequenced individual have been performed [9, 10], the level and pattern of global-scale population genetic diversity remain unclear. Moreover, as the largest felid that is distributed at the highest elevation, the adaptive mechanisms of snow leopards to extremely high-elevation environments have been a focus in evolutionary biology [9, 11, 12].

In this study, we performed a conservation genomics study of global snow leopards and aimed to clarify their population genetic structure, endangerment history, and potential molecular mechanisms of plateau adaptation. Furthermore, we explored their genetic evolutionary potential and population viability from the perspectives of genomic variation, inbreeding, and genetic load through intra- and interspecies comparative analyses, which advances our understanding beyond the previous microsatellite-based conservation genetics study [8].

Results

Genomic evidence of two large genetic lineages in global snow leopards

Based on our chromosome-scale de novo genome assembly of a wild female snow leopard (see Additional file 1), we performed population-level genome resequencing of 52 wild snow leopard samples, including 34 samples with geographic locations and 18 samples with unknown sources (Additional file 2: Table S1). The 34 samples with geographic sources came from four countries: China (n = 22), Mongolia (MG, n = 5), Russia (RU, n = 3), and Tajikistan (TA, n = 4). Chinese snow leopard samples were collected from six provinces, including Qinghai (QH, n = 13), Gansu (GS, n = 1), Sichuan (SC, n = 1), Tibet (TB, n = 4), Xinjiang (XJ, n = 2), and Inner Mongolia (IM, n = 1) (Fig. 1A, Additional file 2: Table S1). In addition, the published genome resequencing data of one snow leopard from western Mongolia [9] was used together with our own data set. Totally, we collected 53 snow leopard samples including 50 modern samples and 3 historical samples (Additional file 2: Table S1). Among the 35 samples with known geographic locations, 19 belonged to the central subspecies proposed by Janecka et al. [8], 10 belonged to the northern subspecies, and 6 belonged to the western subspecies (Additional file 2: Table S1). Based on 50 modern samples, we identified ~ 1.48 million whole-genome single nucleotide polymorphisms (SNPs) after stringent quality filtering, and the average genome sequencing depth of modern samples was 21.9 × (Additional file 2: Tables S2–S3).

Fig. 1
figure 1

Population genetic structure of the global snow leopards based on whole-genome SNPs. A Geographic distribution of wild snow leopard samples in this study. Herein, only samples with known geographical sources were presented (Table S1). Triangles indicate the southern lineage, while circles represent the northern lineage. Location abbreviations are as follows: RU (Russia), MG (Mongolia), TA (Tajikistan), IM (Inner Mongolia, China), XJ (Xinjiang, China), TB (Tibet, China), QH (Qinghai, China), GS (Gansu, China), and SC (Sichuan, China). B Neighboring-joining phylogenetic tree of snow leopards based on whole-genome SNPs, with the tiger as an outgroup. Samples marked in brown represent the northern lineage, while those in blue represent the southern lineage. The values on the tree nodes indicate the bootstrap support. C ADMIXTURE analysis of snow leopards based on whole-genome SNPs, with K values from 2 to 4. The model with K = 2 exhibited the lowest cross-validation error rate. D Principal component analysis (PCA) of snow leopards based on whole-genome SNPs

Subsequently, using whole-genome SNPs, the results of the phylogenetic tree, ADMIXTURE, and principal component analysis (PCA) clearly revealed two distinct large genetic lineages (Fig. 1B–D). Specifically, the phylogenetic tree showed that the snow leopards from Tajikistan, Mongolia, Russia, and Xinjiang of China were clustered as one genetic lineage (northern lineage), whereas individuals from Tibet, Qinghai, Sichuan, and Gansu of China were clustered as the other genetic lineage (southern lineage) (Fig. 1B, Additional file 3: Fig. S1). In ADMIXTURE analysis, we found that the clustering pattern was similar to that of the phylogenetic tree (Fig. 1C), showing two significant genetic clusters that were also supported by the lowest cross-validation error rate when K = 2 (Additional file 3: Fig. S2). The PCA results also demonstrated two distinct genetic clusters (Fig. 1D, Additional file 3: Fig. S3). Notably, the phylogenetic trees of 50 modern and 3 historical samples showed that Tajikistan individuals from the Pamir Plateau first clustered, and Mongolian and Russian individuals from the Altai Mountains clustered with Xinjiang individuals from the Tianshan Mountains, which were different from the “western subspecies” and “northern subspecies” proposed by Janecka et al. [8] (Fig. 1B, Additional file 3: Fig. S4). Moreover, we performed ADMIXTURE analyses for the two distinct genetic lineages separately and found four sub-lineages: two (Tibet–Sichuan–southern Qinghai and Northern Qinghai–Gansu) within the southern lineage and two (Tajikistan and Mongolia–Russia) within the northern lineage were observed, although they were not supported by cross-validation error rate analyses (Additional file 3: Figs. S5–S6). In addition, we also assessed the relatedness among these samples and removed one of the first-degree close relatives. As a result, the above genetic clustering results did not change. Interestingly, snow leopards from the Tianshan Mountains (Xinjiang, China) demonstrated a recent admixture between Tajikistan and Mongolia–Russia populations (Additional file 3: Fig. S5); this was consistent with the result by Korablev et al. [13], who showed that in the Kyrgyzstan snow leopard population, a recent admixture occurred between the Mongolia–Russia and Tajikistan populations.

Furthermore, FST analysis based on whole-genome SNPs estimated that FST between the two genetic lineages was 0.126 (Additional file 2: Table S4), suggesting a middle-level genetic differentiation that was lower than the differentiation levels of closely related species measured in other studies [14]. In our study, the northern lineage included the “northern subspecies” and the “western subspecies” recognized by Janecka et al. [8], whereas the southern lineage corresponded to the “central subspecies” recognized by Janecka et al. [8]. The geographic distribution of these two genetic lineages broadly aligns with known biogeographic barriers in the region. The most likely geographic boundary of the two lineages, based on the geographic locations of all 35 samples and published related studies [8, 12], was from the Gobi desert between Mongolia and China and the Taklimakan desert to the adjacent region between the western Himalayas (Pamir–Karakorum) and the central Himalayas (core Qinghai–Tibet Plateau/Himalayas) (Additional file 3: Fig. S7). Notably, this boundary was an important biogeographical boundary for multiple local species/subspecies, such as between Himalayan brown bear (Ursus arctos isabellinus) and Tibetan brown bear (U. a. pruinosus) [15] (Additional file 3: Fig. S8), and between the blue sheep (Pseudois nayaur) and Siberian ibex (Capra sibirica) (Additional file 3: Fig. S8), the main foods of snow leopards from the two genetic lineages separately, indicating the strong barrier role of this biogeographical boundary.

Demography, divergence histories and habitat dynamics

To reconstruct the demographic history of snow leopards from ancient times to the present, three methods including pairwise sequential Markovian coalescent (PSMC) [16], stairway-plot2 [17], and GONE [18] were performed. We found that the demographic histories of the two snow leopard lineages were generally similar and can be traced back to ~ 700 thousand years ago (Kya) to 10 Kya (Fig. 2A). Specifically, the effective population sizes (Ne) of both genetic lineages have been decreasing historically. The stairway-plot2 analysis revealed the demographic history of snow leopards from ~ 100 Kya to 200 years ago, and both lineages experienced a severe population bottleneck of approximately ~ 13 Kya (Fig. 2B), which coincided with the end of Last Glacial Maximum (LGM). Notably, the northern lineage experienced a greater degree of population bottleneck than the southern lineage, and the northern lineage declined again by approximately 1 Kya after the population rebounded, whereas the southern lineage maintained a relatively stable population size after its population recovery (Fig. 2B). Furthermore, the recent demographic histories of two lineages simulated by GONE revealed that the Ne of the southern lineage was approximately twice that of the northern lineage between 100 and 20 generations ago (from ~ 500 to 100 years ago), a pattern largely consistent with the recent demographic results of the stairway-plot2 analysis (Additional file 3: Fig. S9).

Fig. 2
figure 2

Demography, divergence histories, and habitat dynamics of snow leopards. A PSMC analysis revealed the demographic history of snow leopards from approximately 700 to 10 Kya with a generation time (g) of 5 years and a mutation rate (μ) of 4.94 × 10−9 per site per generation. The two yellow blocks represent the Penultimate Glaciation (PG) and the Last Glacial Maximum (LGM), respectively. B Stairway-plot2 analysis showed distinct recent demographic trends for the two snow leopard lineages. The Ne of the northern lineage continued to decline after the population rebounded, whereas the southern lineage maintained a relatively stable population after its population recovery. The black dashed line represents the approximate divergence time of the two lineages. C The divergence histories constructed by Fastsimcoal27 indicated that the two snow leopard lineages diverged approximately 25 Kya, coinciding with the appearance of the LGM. DG Simulation analysis of the habitat dynamics for four periods (Current, Middle-Holocene, LGM, and Last Interglacial period)

To estimate the population divergence history of the two genetic lineages, Fastsimcoal2 simulation [19, 20] was used. The divergence between the southern and northern lineages occurred at ~ 25 Kya (Fig. 2C), which coincided with the appearance of the LGM, suggesting that the LGM may be the main factor driving the divergence between the two lineages. The present estimated Ne of the southern lineage was 1368, which is approximately twice that of the northern lineage (599) (Fig. 2C, Additional file 2: Table S5), which is also consistent with the results of stairway-plot2 and GONE. Furthermore, the gene flow from the southern to the northern lineage was approximately twice that from the northern to the southern lineage.

To reconstruct the habitat dynamics of snow leopards in response to historical climate changes, MaxEnt simulation [21] was performed based on 719 occurrence record data (Additional file 2: Table S6). The simulation results showed that the high-suitability habitat of snow leopards during the LGM period was only approximately 50% of that in the current period (Fig. 2D and F). Compared to the other three periods (Current, Middle-Holocene, and Last Interglacial Glaciation), the habitat fragmentation of the Qinghai–Tibet Plateau region during the LGM period was the severest, with obvious habitat loss in the northwestern part of the region (Fig. 2D–G). These results indicate that habitat loss during the LGM period may have contributed to the divergence of the two lineages, which was consistent with the results of the Fastsimcoal2 simulation.

Adaptation to high elevation in snow leopards

The southern lineage is located mainly in the alpine regions of the Qinghai–Tibet Plateau and the Hengduan Mountains, whereas the northern lineage is mainly distributed in the Altai Mountains, Tianshan, Pamir, and western Himalayas. Altogether, the southern lineage lives at higher elevations than the northern lineage [22, 23]. In order to identify the signatures of selection and local adaptation in the two lineages, we performed natural selection analysis using FST and θπ methods. Following the screening criteria for the top 5%, we identified 267 and 552 genes positively selected in the northern and southern lineages, respectively (Additional file 2: Tables S7–S8). For the southern lineage, the functional enrichment results of 552 positively selected genes were mainly involved with HIF-1 signaling pathway, blood circulation, hypertrophic cardiomyopathy, protein phosphorylation regulation, and lipid catabolism (Fig. 3A). In particular, seven genes (EPAS1, CAMK2B, CAMK2D, GAPDH, PDHA1, PRKCA, and MAPK1) were identified within the HIF-1 signaling pathway which is known to be related with hypoxic adaptation. For the northern lineage, the 267 positively selected genes were functionally enriched in pathways mainly related to DNA repair regulation, osteoblast signaling, protein ubiquitination, secretion, synapse structure, neuron migration, locomotion, and behavior (Fig. 3B).

Fig. 3
figure 3

Positively selected gene (EPAS1) associated with plateau adaptation in snow leopards. A Functional enrichment results of 552 positively selected genes in the southern lineage. B Functional enrichment results of 267 positively selected genes in the northern lineage. C Two linked missense mutations of EPAS1 among snow leopard, tiger, lion, leopard, jaguar, clouded leopard, and cat. Multi-species sequence alignment of EPAS1 showed that the derived alleles A and C occur only in snow leopards. D Allele frequency map of two EPAS1 missense variants in global snow leopards. The derived allele frequencies of both mutations were positively correlated with the altitude distribution of snow leopards. Specifically, the frequency pattern of two derived alleles (A and C) was Qinghai–Tibet Plateau > Tianshan–Pamir Plateau > Altai–Mongolian Plateau, suggesting that EPAS1 was most likely involved in adaptation to low oxygen in the high-altitude regions

Among the seven genes identified in the HIF-1 signaling pathway, EPAS1 (endothelial PAS domain protein 1, approximately 85.5 kb in length with 16 exons in snow leopards) is a well-known gene involved in hypoxia adaptation [24, 25]. We further confirmed that this gene was under positive selection through two additional selection analysis methods, XP-CLR and XP-EHH (Additional file 2: Table S9). We identified 158 variants in EPAS1, including 2 missense SNPs, 2 synonymous SNPs, 7 UTR SNPs, and 147 intron SNPs. We also found that the mutation ratio of EPAS1 intron variants was significantly higher than that of other genes in the entire genome (Additional file 2: Table S10, Additional file 3: Fig. S10), suggesting that this gene was under strong positive selection in the southern lineage. Two fully linked missense mutations were located in exons 12 and 15, respectively, and the corresponding amino acid substitutions were 671G/A-Val/Ile and 804T/C-Cys/Arg, which were also found in two previous studies [9, 12]. The multi-species sequence alignment of EPAS1 revealed that the derived alleles A and C occur only in snow leopards (Fig. 3C). We found that the two missense mutations (A and C) have almost fixed in the southern lineage with the allele frequency being 0.969, but not in the northern lineage with the allele frequency being only 0.344 (Additional file 2: Table S11). Overall, the frequencies of alleles A and C were positively correlated with the altitude distribution of snow leopards. Specifically, the frequency pattern of both alleles was Qinghai–Tibet Plateau > Tianshan–Pamir Plateau > Altai–Mongolian Plateau (Fig. 3D, Additional file 2: Table S11), regardless of the lineage distribution, highlighting that EPAS1 was most likely involved in adaptation to low oxygen in the high-altitude regions. Furthermore, in contrast to the balance selection hypothesis of Janecka et al. [12], our results did not support the balancing selection hypothesis of EPAS1 but detected positive selection of this gene in the southern lineage.

Genetic diversity, linkage disequilibrium, runs of homozygosity and genetic load in snow leopards

Our demographic inference analysis suggested a long-term population decline for the two snow leopard lineages (Fig. 2A) and the population continued to decline in the northern lineage after its bottleneck, but not in the southern lineage, which still maintained a relatively stable population size (Fig. 2B). To investigate whether different demographic trajectories of the two lineages led to different genetic diversity patterns, we explored the genome-wide genetic diversity, linkage disequilibrium (LD), runs of homozygosity (ROH), and genetic load of the two lineages. We noted that the average genetic diversity of the northern lineage (θπ = 1.66E − 04; genome-wide heterozygosity = 1.74E − 04) was significantly lower than that of the southern lineage (θπ = 1.88E − 04; genome-wide heterozygosity = 1.94E − 04) (Fig. 4A–B, Additional file 2: Tables S12–S13). The northern lineage had a higher level of LD than the southern lineage (Fig. 4C), which is consistent with the expectation of higher LD in small populations [26].

Fig. 4
figure 4

Comparison of genetic evolutionary potential between the two snow leopard lineages. AD The northern lineage exhibited lower genome-wide heterozygosity and population-level genomic diversity, along with higher levels of linkage disequilibrium (LD) and inbreeding coefficients (Froh) compared to the southern lineage. E Statistics information for five types of ROH (50 ~ 100 kb; 100 ~ 500 kb; 500 kb ~ 1 Mb; 1 ~ 3 Mb; 3 ~ 5 Mb). Overall, the northern lineage displayed a greater number of ROHs across all categories compared to the southern lineage. F The northern lineage showed a higher level of historical inbreeding (ROH length < 1 Mb), while recent inbreeding (ROH length ≥ 1 Mb) levels were comparable between the two lineages. G A negative correlation was observed between Froh and genome-wide heterozygosity in snow leopards. HK The northern lineage harbored a higher genetic load compared to the southern lineage. This was demonstrated by a higher ratio of loss-of-function (LOF) to synonymous variants (H), a higher ratio of deleterious to synonymous variants (I), a higher ratio of the absolute counts of four homozygous-derived genotypes (LOF, deleterious, tolerated, and synonymous variants) (J), and a lower ratio of absolute counts of four heterozygous-derived genotypes (K)

ROH is generally used to detect inbreeding events. Long ROHs (≥ 1 Mb) typically result from recent inbreeding, whereas shorter ROHs indicate earlier inbreeding [27]. We observed that overall, the northern lineage harbored significantly higher inbreeding levels than the southern lineage (Fig. 4D, Additional file 2: Table S14). Specifically, for the shorter ROH tract lengths (50–100 kb, 100–500 kb, and 500 kb–1 Mb), the ROH numbers in the northern lineage were higher than those in the southern lineage, while for the longer ROH tract lengths (1–3 Mb, and 3–5 Mb), the ROH numbers were similarly small for both lineages (Fig. 4E, Additional file 2: Table S14). Similarly, both the mean cumulative fraction of individual genomes in longer ROH (≥ 1 Mb, 1–5 Mb, inbreeding that occurred approximately 45–225 years ago) were lower and statistically not significant for two lineages, while the mean cumulative fraction of individual genomes in shorter ROH (< 1 Mb, 50 kb–1 Mb, inbreeding that occurred approximately 225–4545 years ago) was significantly higher in northern lineage than southern lineage (Fig. 4F, Additional file 2: Table S14). These results demonstrated that the extent of inbreeding for the two snow leopard lineages was a consequence of their historically small population sizes rather than recent inbreeding. A significant increase of historical inbreeding in the northern lineage was consistent with its demographic history of multiple population bottlenecks. Overall, the northern lineage harbored a higher inbreeding coefficient (Froh) and a lower level of genome-wide heterozygosity compared to the southern lineage (Fig. 4G).

To further assess the genetic burden of two snow leopard lineages, we quantified the degree of harmfulness of the derived alleles and classified them into four mutation types: loss-of-function (LOF), deleterious nonsynonymous (DEL), tolerated nonsynonymous (TOL), and synonymous (SYN). We used the ratio of putatively deleterious to synonymous variants (LOF/SYN variants and DEL/SYN variants) to evaluate the genetic load of the two lineages [14, 28]. We found that for LOF/SYN variants, the ratio of heterozygous-derived genotypes was higher in the northern lineage than in the southern lineage, whereas the ratio of homozygous-derived genotypes was comparable in the two lineages (Fig. 4H, Additional file 2: Table S15). For DEL/SYN variants, the ratios of both homozygous and heterozygous-derived genotypes were higher in the northern lineage than in the southern lineage (Fig. 4I, Additional file 2: Table S15). Furthermore, we used the counts of all four mutation types to assess the genetic load for the two lineages [29]. In general, we found that the number of homozygous-derived genotypes was higher in the northern lineage for all mutation types, especially for the DEL mutation type being the most significant (Fig. 4J, Additional file 2: Table S15). However, the number of heterozygous-derived genotypes for all four mutation types was significantly depleted in the northern lineage (Fig. 4K, Additional file 2: Table S15), consistent with its lower genome-wide heterozygosity. Overall, these results suggested that the northern lineage demonstrated a higher genetic load of deleterious variants compared to the southern lineage.

In summary, the northern lineage displays a lower level of genome-wide heterozygosity and θπ as well as a higher level of LD, inbreeding, and genetic load than the southern lineage (Fig. 4A–K), which coincides with its demographic history of smaller population size.

Comparison of genomic diversity, inbreeding and deleterious variants among Carnivora species

To explore the genetic status of snow leopards, we performed comparative analyses of genome-wide heterozygosity, Froh, and deleterious variants with other 11 Carnivora species or subspecies (Additional file 2: Table S16). We observed that snow leopards exhibited extremely lower genomic heterozygosity (1.87E − 04), which is comparable to that of cheetahs (1.85E − 04), and harbored a relatively higher level of inbreeding (Froh = 0.217) than the other 11 Carnivora species or subspecies (Fig. 5A–C, Additional file 2: Table S17). A significantly negative correlation was observed between genome-wide heterozygosity and Froh among Carnivora species or subspecies (Fig. 5A). The “threatened-rank” group had a significantly higher inbreeding level and a significantly lower genome-wide heterozygosity level than the “nonthreatened-rank” group (Fig. 5B–C, Additional file 3: Fig. S11).

Fig. 5
figure 5

Genome-wide heterozygosity, inbreeding, and deleterious variation in snow leopards and other Carnivora species or subspecies. A A negative correlation was observed between Froh and genome-wide heterozygosity across 12 Carnivora species or subspecies. BC Snow leopards harbored an extremely low genome-wide heterozygosity, comparable to that of cheetahs, along with a moderately high level of inbreeding. Brown points represent the “Threatened group,” while blue points represent the “Non-threatened group.” DF Compared to other 11 Carnivora species or subspecies, snow leopards displayed a lower ratio of strongly deleterious variants (deleterious and LOF mutations) and a moderate ratio of weakly deleterious variants (tolerated mutations). G Strongly deleterious variants (DEL + LOF mutations) were to some extent purged in snow leopards by historical population bottlenecks and inbreeding. In snow leopards, the number of deleterious variants (DEL, LOF, and TOL mutations) was significantly lower in the ROH region compared to non-ROH region. Additionally, the number of strongly deleterious variants was lower than that of weakly deleterious variants (TOL mutations). H Snow leopards had significantly fewer deleterious variants (DEL, LOF, and TOL mutations) compared to the 11 other Carnivora species or subspecies, particularly for strongly deleterious variants within ROH regions

For a small population, the purifying selection of strongly deleterious mutations is an important way to maintain population health, such as in the cases of island fox [30], India tiger [31], and Iberian lynx [32]. Our reconstruction of the demographic history revealed that the snow leopard population has been decreasing since the Pleistocene glaciations (~ 0.7 Mya), especially in the northern lineage (Fig. 2A–C). To explore whether snow leopards have similar purifying patterns of deleterious mutations, we compared the mutational load of strongly deleterious variants (LOF and DEL mutations) and weakly deleterious variants (TOL mutation) in snow leopards with those in the other 11 Carnivora species or subspecies (Additional file 2: Table S18). Here, we used the ratio of deleterious to synonymous heterozygous variants as a proxy for the efficacy of purifying selection and genome-wide heterozygosity as a proxy for Ne [33]. For all three types of mutations, the ratio of deleterious to synonymous heterozygous variants was negatively correlated with Ne (Fig. 5D–F). For the DEL and LOF mutation types, snow leopards displayed a lower proportional burden of deleterious alleles among the 12 species or subspecies (Fig. 5D, E). In contrast, for the TOL mutation type, snow leopards had a higher proportional genetic burden among all species or subspecies (Fig. 5F, Additional file 2: Table S18). Similar trends also existed for homozygotes of the three types of mutations, which included variants that may be fixed in the species (Additional file 3: Fig. S12, Additional file 2: Table S18). These results suggested that strongly deleterious mutations (DEL and LOF mutations) have been substantially eliminated by purifying selection in snow leopards, but not weakly deleterious variants (TOL mutations).

To detect whether inbreeding (ROH) promoted the elimination of deleterious mutations in snow leopards, we counted the number of deleterious mutations in ROH and non-ROH regions. We observed that within snow leopards, the number of deleterious variants (DEL, LOF, and TOL mutations) was significantly lower in the ROH region than in the non-ROH region (Fig. 5G, Additional file 2: Table S19). For the ROH region, the counts of strongly deleterious variants (DEL and LOF mutations) were significantly lower than those of weakly deleterious variants (TOL mutations) (Fig. 5G, Additional file 2: Table S19). Furthermore, we observed that the number of deleterious variants (DEL, LOF, and TOL mutations) was significantly lower in snow leopards when compared with other 11 Carnivora species or subspecies, especially for the strongly deleterious variants of the ROH region (Fig. 5H, Additional file 2: Table S20). These results suggested that the strongly deleterious variants (DEL and LOF mutations) were partially purged in snow leopard by its historical inbreeding.

In summary, the above results demonstrated that although snow leopards had very low genome-wide heterozygosity and moderately high levels of inbreeding compared with the other 11 Carnivora species or subspecies in our study, strongly deleterious variants such as DEL and LOF mutations have been effectively purged by their historical population bottlenecks and inbreeding, which may be an important genetic mechanism for their population survival and viability.

Discussion

The snow leopard is the flagship and umbrella species of alpine mountain ecosystems in Asia. Its elusive habit and rare population size attract much more attention from conservationists and the public than most of the other felids. For a long time, it has remained controversial if there is subspecies differentiation in snow leopards. In our study, based on whole-genome analyses including population genetic structure and FST estimation, we identified two large genetic lineages (the southern lineage and the northern lineage) in snow leopards, which was supported by their biogeography. The two lineages diverged at approximately 25 Kya, which was similar to those between the tiger subspecies (27.60–33.83 Kya) [34]. However, because our sampling did not cover some important regions, such as the distribution ranges in Pakistan, India, and Kyrgyzstan, we did not define the two genetic lineages as two subspecies, which warrants further investigation with more comprehensive samples.

Considering the high dispersal ability and relative continuity of habitat distribution [35,36,37], determining the accurate boundary of the two genetic lineages is challenging. In our study, the four individuals from Tajikistan were genetically clustered with the individuals from Mongolia, Russia, and Xinjiang of China (Tianshan Mountains), together forming the northern lineage. Conversely, the southern lineage included snow leopard individuals from Tibet, Qinghai, Gansu, and Sichuan provinces of China. Therefore, we proposed that the biogeographic boundary of the two lineages was from the Gobi desert between Mongolia and China and the Taklimakan desert to the adjacent region between the Pamir and Kunlun Mountains and the adjacent region between the western Himalayas (Pamir–Karakorum) and central Himalayas (core Qinghai–Tibet Plateau/Himalayas) (Additional file 3: Fig. S7). Combining the distribution of suitable habitat (Fig. 2D–G) and results of Janecka et al. [8], the snow leopards from Pakistan and northwest India most likely belonged to the northern lineage, whereas individuals from Nepal most likely belonged to the southern lineage. The proposed distribution boundary for the two lineages was supported by other evidence. First, Li et al. [35, 36] and Li et al. [37] predicted the snow leopard habitat distribution and showed that our proposed distribution boundary was the region of obvious decrease in habitat suitability. Furthermore, our modeling of historically suitable habitats yielded similar results (Fig. 2D–G). Second, although our results did not support the viewpoint of the three subspecies proposed by Janecka et al. [8], our proposed distribution boundary matched well with the boundary between the “western subspecies” and the “central subspecies” proposed by Janecka et al. [8], consistently reflecting the high genetic differentiation on both sides of the boundary. Third, our proposed distribution boundary is also the boundary between the two subspecies of brown bear, the Himalayan brown bear and Tibetan brown bear [15] (Additional file 3: Fig. S8), and the boundary between the blue sheep and Siberian ibex (Additional file 3: Fig. S8), the main foods of snow leopards from the two lineages separately. However, additional samples around the distribution boundary, as well as detailed geographic and ecological data, are required to support conclusions regarding the distribution boundary.

The differentiation of the two lineages and maintenance of their current distribution pattern may be explained by two factors. On the one hand, there are some large rivers between the western and central Himalayan mountains that flow from north to south. Because snow leopards often dwell in alpine mountains, they are less likely to descend to low-altitude valleys and cross large rivers. In low-altitude valleys, snow leopards may suffer from the risks of unsuitable habitat and potential competition from Bengal tigers and leopards. On the other hand, historically, under the influence of the uplift of the Qinghai–Tibet Plateau, the Pamir–Karakorum region was mainly affected by westerlies, with a dry/cold climate and low precipitation, while the core Qinghai–Tibet Plateau/Himalayas was mainly affected by the South-Asia and East-Asia monsoons, with a relatively humid climate and high precipitation [38]. The primary importance of precipitation parameters during certain periods of the year was demonstrated when modeling habitats for the Russian part of the range [39]. Because of the effects of the last glaciations, especially LGM, the ancestral populations of snow leopards in these two regions might have experienced isolation and divergence at approximately 25 Kya. After the LGM, there were significant precipitation and temperature differences between the two regions affected by different monsoons. The differences in the local microclimate further hindered the exchange between the two lineages because of their already-established local adaptations to their respective environments.

In this study, we also found that the Xinjiang individuals of the Tianshan Mountains clustered with the Mongolian and Russian individuals from the Altai Mountains, rather than the Tajikistan snow leopards (Fig. 1B, Additional file 3: Fig. S4), which was different from the result of Janecka et al. [8] (the Tianshan individuals clustered with Tajikistan individuals). Furthermore, we observed two sublineages within the northern lineage: Tajikistan and Mongolia–Russia; however, the snow leopards of the Tianshan Mountains demonstrated recent admixture between Tajikistan and Mongolia–Russia populations (Additional file 3: Fig. S5), which was consistent with the result of Korablev et al. [13], who reported that Kyrgyzstan snow leopards showed a recent admixture between Tajikistan and Mongolia–Russia populations. These results suggest that the Tianshan population might be genetically attributed to the recent admixture of the relatively older Tajikistan and Mongolia–Russia populations. This also explains the inconsistency that in our study, the Tianshan individuals clustered with Mongolian–Russian individuals, whereas in Janecka et al. [8], the Tianshan individuals clustered with Tajikistan individuals. Provided that the genetic divergence between the northern and southern lineages was driven by LGM, the Himalayas–Hengduan Mountains, Pamir Plateau, and the Altai Mountains were likely glacial refugia for snow leopards, and the Tianshan Mountains were the postglacial dispersal and admixture area of snow leopards from the Pamir Plateau and Altai Mountains. Also, it is important to note that although our samples were from four countries and covered a large part of the snow leopard’s distribution ranges, some distribution ranges remained uncovered. We believe that future studies with comprehensive sampling in these gaps regions will provide more insights into the population genetic structure and even subspecies classification of global snow leopards.

Our results demonstrated that snow leopards exhibited an exceedingly low genetic diversity and a high level of inbreeding; meanwhile, those strongly deleterious variants (DEL and LOF mutations) were effectively purged by their historical population bottlenecks and inbreeding. The results were opposite to those of cheetahs whose genetic diversity was the lowest, but the strongly deleterious variants have not been effectively eliminated and remained at a high level (Fig. 5H). The purging of strongly deleterious variants in snow leopards may be an important genetic mechanism for its population survival and viability despite its very low genetic diversity. However, snow leopards still face high survival risks because of increasing impacts of global climate change, human disturbance, and habitat fragmentation, and require urgent and continuous protection.

Conclusions

Our study is the most comprehensive population genomics study of snow leopards up to date, providing novel insights into the genetic evolutionary potential and population viability of this vulnerable and enigmatic big cat. The findings reveal the genetic divergence of the two lineages and the most likely distribution boundary. We observed extremely lower genomic diversity and higher inbreeding levels in snow leopards compared to other Carnivora species or subspecies, highlighting that snow leopards have low genetic evolutionary potential. Notably, we also discovered that compared to other Carnivora species or subspecies, those strongly deleterious variants in snow leopards have been largely purged because of their historical population bottlenecks and inbreeding, which may be an important genetic mechanism for their population survival and persistence. These findings advance our understanding of evolution and conservation of snow leopards beyond the previous microsatellite-based conservation genetics study.

Our findings have crucial implications for conservation. First, considering the high dispersal capacity of snow leopards, the recognition of two large genetic lineages suggests two large conservation management units, which is different from the prediction of the habitat patches. The habitat restoration and corridor construction should be implemented preferentially within each large conservation management unit. Because of the low genetic diversity and fragmented habitat patches, enhancing habitat connectivity is critical to facilitating gene flow, recovering the genetic diversity of wild populations, and avoiding the formation of isolated small populations. Second, it is feasible to perform “genetic rescue” through the translocation of rescued wild snow leopards to other isolated habitat patches, aiming to increase the genetic diversity of isolated populations. However, since the southern lineage or individuals living in the high-elevation area have evolved the adaptation to high altitude, possibly through EPAS1 mutations, care should be taken not to translocate the rescued snow leopards between the high-elevation and low-elevation habitats of the two lineages. Meanwhile, translocation should be implemented within each large genetic lineage, rather than between the two lineages, to avoid the impacts on the local adaptation of each lineage. Third, compared with other felids, the extremely low genomic diversity and the higher inbreeding level in snow leopards show lower genetic evolutionary potential when encountering future severe environmental changes, highlighting that the “threatened rank” of snow leopards should be updated from “Vulnerable” to “Endangered” on the IUCN Red List, with the integration of genetic diversity into the assessment of species Red List. Finally, because the snow leopard habitats involve 12 countries, transboundary international cooperation is required for protection and recovery of fragmented habitats and populations [40]. In addition, the identification of two lineages and their distribution boundaries provide further insight into the biogeography of the Qinghai–Tibet Plateau, suggesting the important roles of Pleistocene glaciations and differences in local climate in intraspecies genetic differentiation of mammals in this biodiversity hotspot.

Methods

Sample collection and genome sequencing

We collected tissue samples from a dead adult wild female snow leopard in 2021 from Qinghai Province. In order to obtain a high-quality reference genome from the snow leopard, we performed three sequencing technologies, including Illumina HiSeq X10, Nanopore PromethION, and Hi-C. Specifically, for Illumina HiSeq X10 sequencing, we generated clean read data with a depth of ~ 116 × . For Nanopore PromethION sequencing, we constructed seven long-read libraries and obtained read data passed with a depth of ~ 112 × . For Hi-C sequencing, we generated clean paired-end-read data with a depth of ~ 254 × (Additional file 2: Table S21). For detailed information on genome sequencing, genome assembly, genome annotation, and assembly evaluation (Additional file 2: Tables S21–S27) [41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56], see the Extended Results section of Additional file 1.

During the past nearly 10 years, we collected 52 wild snow leopard blood, muscle, and skin samples, including 34 samples with geographical locations and 18 samples from unknown sources (Additional file 2: Table S1). The 34 samples with geographic sources were collected from four countries: China (n = 22), Mongolia (MG, n = 5), Russia (RU, n = 3), and Tajikistan (TA, n = 4). The 22 Chinese samples were collected from six provinces, including Qinghai (QH, n = 13), Gansu (GS, n = 1), Sichuan (SC, n = 1), Tibet (TB, n = 4), Xinjiang (XJ, n = 2), and Inner Mongolia (IM, n = 1) (Fig. 1A, Additional file 2: Table S1). Among all 52 samples, there were 49 modern samples and 3 historical samples (Additional file 2: Table S1). Genomic DNA from 49 modern samples was extracted using the DNeasy Blood & Tissue Kit (Qiagen, Valencia, CA, USA). For each individual, we constructed genomic libraries with insert sizes from 200 to 500 bp and performed genome resequencing at an average depth of 21.9 × using the Illumina Nova-PE150 sequencing platform (Additional file 2: Table S2). The published whole-genome resequencing data of a modern snow leopard from west Mongolia [9] was also used in this study. The genome data of 50 modern snow leopard individuals were used for downstream analysis. For three historical samples, the procedures for DNA extraction, library construction, and genome sequencing followed the methods of Dabney et al. [57] and Wang et al. [58]. The detailed experiment procedures for historical samples were presented in the Extended Materials & Methods section of Supplementary Materials.

Whole-genome SNP calling and filtering

To identify population-level SNPs of the snow leopard, raw data for each of 50 modern snow leopard individuals were filtered with fastp [59]. Then, using BWA-MEM v0.7.8 [60], the filtered high-quality reads were aligned with our snow leopard reference genome. Samtools v1.9 was used for classification of the mapped reads [61], and “MarkDuplicates” in Picard v2.2 (http://broadinstitute.github.io/picard) was used for marking and removing potential PCR duplications. Variant calling of 50 individuals was performed using HaplotypeCaller in GATK v4.1.2.0 [62] with the option “-ERC GVCF.” We combined all 50 individual GVCF files and obtained the raw candidate SNP sets using modules of CombineGVCFs, GenotypeGVCFs, and SelectVariants in GATK. GATK generated the gatk-filtered SNP set with the criteria of “QD < 2.0, MQ < 40.0, FS > 60.0, SOR > 3.0, MQRankSum < − 12.5, and ReadPosRankSum < − 8.0” based on the raw candidate SNP sets. To obtain high-quality population-level SNP sets of snow leopards, we further filtered the gatk-filtered SNP set with options of “minDP < 4, maf < 0.05, and max-missing > 0.2” using VCFtools v0.1.17 [63]. Those non-biallelic SNPs were also removed. Following the above mentioned strict filtering, the remaining high-quality SNP sets were used for downstream population genomics analysis.

Population genetic structure analyses

We reconstructed the genetic structure of snow leopards based on the whole-genome SNP set. For PCA, we used the genome-wide Complex Trait Analysis tool v1.93.0 [64] to calculate the eigenvalues and eigenvectors that were further visualized in R. To analyze the genetic components, we calculated and assumed genetic groups K = 1 to K = 5 using ADMIXTURE v1.3.0 [65], and the case with the minimum cross-validation error was considered as the optimal one. We constructed the phylogenetic tree of snow leopards using the neighbor-joining and maximum likelihood methods implemented in MEGA v10 [66] and RAxML (v 8.2.12) [67] with 1000 iterations and the tiger as the outgroup. The phylogenetic tree results were further polished using iTOL v5 [68].

Analysis of historical samples

In this study, we collected five historical snow leopard samples from the museums (1960–1978). Because genomic DNA degradation might cause the genotype calling bias, for these historical samples, we used pileupCaller (GitHub-stschiff/sequenceTools) to perform random allele haploid calling with the option of “-majorityCall -downSampling -minDepth 3” based on the “high confidence” SNP set of 50 modern individuals. To test the reliability of the haploid calling method, we made diploid calls with GATK (see above variants calling pipeline) for one modern sample (SC1) and found 99.7% similarity in homozygous genotype calls between the two approaches. After the data quality control, two historical samples were removed due to very low data quality and quantity. We merged the SNP set of three historical samples (average coverage: 0.13 ~ 0.55) with those of 50 modern samples (Additional file 2: Tables S1–S2) and constructed a maximum likelihood tree using MEGA v10 to further assess their phylogenetic relationship (Additional file 3: Fig. S4).

Population genetic divergence (FST) between two lineages

We used a sliding window method to calculate the genome-wide mean FST values between two snow leopard lineages in VCFtools with the parameters of 50 kb window size and 25 kb window step.

Demographic history simulation

We used PSMC to estimate the changes in the effective population size with the parameters of “ − N 30 –t 15 –r 5 − p 4 + 25*2 + 4 + 6.” The individuals selected for the PSMC analysis come from different geographical populations as much as possible. We selected no less than 10 individuals for both southern and northern lineages, and the sequencing depth of the selected individuals was no less than 18 × (Additional file 2: Table S2). The nucleotide mutation rate (μ) of the snow leopard was set to 4.94 × 10−9 mutations per site per generation which calculated from the whole-genome comparison results of snow leopard and cat [69]. The generation time was set as 5 years that was twice of the sexual maturity age of snow leopards [1, 14]. Additionally, we used stairway-plot2 to simulate the relative recent demographic history for the two lineages. Individuals from each lineage that were subjected to PSMC analysis were used to calculate the folded SFS by easySFS (Additional file 2: Table S2). The SFS files of two lineages were further used for the estimation of Ne with the parameters (μ and g) identical to those of PSMC.

Divergence histories

We used Fastsimcoal27 [19, 20] to infer the divergence history of the two lineages with the following parameters: -m -n 100,000 −0 -L 150 -s 0 -M -q -C -B60 -c 10. The divergence model of the two lineages was simulated 500 times to ensure convergence, and the run with the highest likelihood was considered the best model. For the input SNP set, we only retained SNPs in the intergenic regions and those sites with depth < 10 and max-missing > 0.2 were filtered. Individuals from two lineages and the parameters of μ and g were the same as the PSMC analysis (Additional file 2: Table S2).

Habitat dynamics of snow leopards

We used MaxEnt v3.4.4 to simulate the habitat dynamics of snow leopards based on 719 occurrence record data derived from field surveys, literature sources, the Global Biodiversity Information Facility database, and the iNaturalist database. To ensure the reliability of data sources, we selected only accurate GPS location points (error < 15 m) from various data sources and excluded other points within a 5-km radius [36]. A stringent screening process was applied, including limiting the time frame to post-2010, focusing on potential snow leopard distribution areas, removing the locations of specimens and outliers beyond the potential distribution areas, and retaining only survey or sample-related occurrence points. To control potential biases, duplicate records within individual grids were deleted. Subsequently, a kernel density distribution layer was created for each occurrence point using a distance of 30 km as the bias layer in the MaxEnt model [35], thereby reducing sampling bias based on sampling intensity. We selected five bioclimatic variables (elevation, ruggedness, bio10, bio11, and bio12) for modeling [35]. Gridded bio10, bio11, bio12, and elevation variables (2.5 arc minutes) for contemporary conditions were downloaded from WorldClim 2.1. Ruggedness was derived from elevation data using the VRM tool in ArcMap v10.8. To mitigate sampling bias, we calculated the kernel density of occurrence points within a 30-km radius in ArcMap 10.8, generating a bias layer for the MaxEnt model. We applied the species distribution model to multiple general circulation models (GCMs) of the LGM (22 Kyr ago), the Mid Holocene (6 Kyr ago), and the Last Interglacial (LIG, 120–140 Kyr ago). For improved climate prediction accuracy, we used nine Mid Holocene GCMs and three LGM GCMs from different institutions in WorldClim 1.4, and to prevent overfitting, we selected regularization multipliers of 0.25, 0.5, 1, 2, and 4 for modeling. The model performance was evaluated using the area under the receiver operating characteristic curve (AUC). The results obtained from different GCMs and regularization multipliers were subsequently averaged to construct the ensemble model. We used the “equal training sensitivity and specificity” as the habitat suitability [70] and the equal interval method to classify habitat suitability zones. Distribution models were plotted using ArcMap v10.8.

Genomic selection signatures of local adaptation between two lineages

To detect local adaptation selective signals between two snow leopard lineages, a sliding window method (FSTπ) was employed with a genome-wide 50 kb sliding window size and the 25 kb window step implemented in VCFtools. We applied the Z transformation for the FST values and log2 transformation for θπ ratios. Overlapping windows between the top 5% of Z(FST) and the top 5% log2π ratio) were considered as candidate selective windows that were further assigned to the corresponding SNPs and genes. Functional enrichment analysis of candidate genes was performed using Metascape v1.0 [71]. To further validate the positive selection of the EPAS1 gene in the southern lineage, we conducted additional selection analyses using two linkage-based estimation methods: XP-CLR [72] and XP-EHH [73].

Estimation of genomic diversity, LD and ROH in snow leopards

In order to calculate the genomic heterozygosity of snow leopards, we first genotyped and filtered the SNP set for each modern individual’s gvcf file using GATK. After obtaining the gatk-filtered SNP set for each individual, we counted the number of heterozygotes (0/1) for each snow leopard individual, and the ratio of the number of heterozygotes to genome size was used to represent the genome-wide heterozygosity. For population-level genetic diversity estimation, we calculated the nucleotide diversity (θπ) with parameters of “–window-pi 50 Kb –window-pi step 25 kb” using VCFtools. The genome-wide LD of each lineage was assessed using PopLDdecay [74] with default settings.

For the inbreeding coefficient estimation of snow leopards, we first obtained the gatk-filtered SNP set for each individual. Then, we calculated ROH per individual based on the gatk-filtered SNP set with options of “–homozyg –homozyg-window-snp 20 –homozyg-snp 20” using Plink v1.9 [75]. Here, we counted five different length types of ROH (50 ~ 100 kb, 100 ~ 500 kb, 500 kb ~ 1 Mb, 1 ~ 3 Mb, and 3 ~ 5 Mb) for quantitative analysis of inbreeding coefficient. A ROH length greater than 1 Mb represents recent inbreeding, while a ROH length less than 1 Mb represents long-term earlier inbreeding. The approximate generation time (g) of ROH occurrence was estimated using the following formula [76]: g = 100/(2rLROH), where r represents the recombination rate which was set to 1.1 cM per Mb [77], and LROH is the length of ROH. The genomic inbreeding coefficient (Froh) [78], which is the total length of ROH divided by the snow leopard genome size, was used to evaluate the inbreeding level of each snow leopard individual.

Assessment of the genetic load in snow leopards

We used SnpEff v5.1d [79] to annotate the whole-genome SNP set of all 50 modern individuals and obtained three types of coding variants including LOF, missense, and synonymous (SYN) variants. The LOF mutations here include introducing premature stop codons and disrupting splice sites. The missense mutations were further classified as deleterious (DEL) or tolerated (TOL) missense mutations using SIFT 4G [80]. To determine the ancestral and derived alleles of each variant, we first aligned our snow leopard reference genome to the cat genome (GCA_016509815.2) using LAST v802 [81]. Specifically, variant alleles same to cat were considered putatively ancestral alleles, while alleles different from cat were considered putatively derived alleles. For the genetic load assessment of snow leopards, we used two statistical methods for each individual. One method counted the number of homozygous and heterozygous-derived genotypes of DEL and LOF, respectively [29, 82], whereas another method calculated the ratio of the number of homozygous and heterozygous-derived genotypes of DEL and LOF to the number of homozygous and heterozygous-derived genotypes of SYN [14, 28].

Comparative analyses among Carnivora species

For comparative analyses of the genomic heterozygosity, inbreeding, and genetic load between snow leopards and other carnivores, we collected the genome data of 11 Carnivora species or subspecies (Additional file 2: Table S16) from NCBI and CNGBdb (China National GeneBank DataBase). Specifically, the data set for each species or subspecies should include an assembled reference genome, a genome annotation GFF file, and at least one whole-genome resequencing SRA data. We then performed whole SNP calling using the same methods as those used for snow leopards and obtained the gatk-filtered SNP set for each species. Based on the gatk-filtered SNP set, we further calculated the genomic heterozygosity and inbreeding coefficient (ROH length ≥ 50 kb) for each species or subspecies using the same statistical method as those used for snow leopards. For some species or subspecies with SRA data from multiple individuals, the value used for these species or subspecies was the average value calculated from all individuals. For snow leopards, the average value used for the comparison analysis was calculated from all 50 modern individuals.

For comparative analysis of genetic load between snow leopards and 11 other Carnivora species or subspecies, we first used SIFT4G with the UniRef90 protein database to develop custom SIFT databases for each species, following the developer’s recommendations. Meanwhile, we used SnpEff v5.1d to build the SnpEff database for each species based on their reference genome and genome annotation file. Based on these two databases, we further annotated the gatk-filtered SNP set for each species and identified four variant types of coding regions: DEL, LOF, TOL, and SYN mutations. The putative LOF mutations here also include introducing premature stop codons and disrupting splice sites.

To quantify the genetic load, we used the number of synonymous variants as a “neutral” baseline to compare the relative genetic burden of different types of variants between species [33]. Specifically, we calculated the ratio of three variant types (DEL, LOF, and TOL mutations) to synonymous variants (Fig. 5D–F). This method accounted for different evolutionary rates among species and corrected for branch length variation because of varying degrees of divergence between species, making it an appropriate approach for assessing the effects of long-term Ne on the relative burden of different variant types among species [33]. To detect whether inbreeding (ROH) may promote the elimination of deleterious mutations for snow leopards, for all 12 Carnivora species or subspecies, we calculated the ratios of number of deleterious mutations in ROH or non-ROH regions to the length of ROH or non-ROH.

Data availability

Raw data associated with this study has been deposited and released in National Genomics Data Center, China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (https://ngdc.cncb.ac.cn), with the Bioproject accession of PRJCA022406 [83] (https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA022406). Specifically, the reference genome raw sequencing data including Nanopore, Hi-C, and Illumina reads have been deposited in the Genome Sequence Archive (GSA: CRA022084) [84]. The whole genome resequencing raw data of snow leopard samples reported in this study have been deposited in the Genome Sequence Archive (GSA: CRA021509) [85]. The reference genome assembly of snow leopard has been deposited in the NCBI with BioProject of PRJNA1234234 under the accession GCA_048603155.1 [86]. In this study, no other scripts and softwares were used besides those mentioned in the Methods section.

References

  1. McCarthy T, Mallon D, Jackson R, Zahler P, McCarthy K. Panthera uncia. The IUCN Red List of Threatened Species. 2017;2017.e.T22732A50664030.

  2. Riordan P, Cushman SA, Mallon D, Shi K, Hughes J. Predicting global population connectivity and targeting conservation action for snow leopard across its range. Ecography. 2016;39:419–26.

    Article  Google Scholar 

  3. Bhatti UA, Nizamani MM, Mengxing H. Climate change threatens Pakistan’s snow leopards. Science. 2022;377:585–6.

    Article  CAS  PubMed  Google Scholar 

  4. Mahmood T, et al. Range contraction of snow leopard (Panthera uncia). PLoS One. 2019;14:e0218460.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  5. Farrington JD, Tsering D. Human-snow leopard conflict in the Chang Tang region of Tibet China. Biol Conserv. 2019;237:504–13.

    Article  Google Scholar 

  6. Ahmad S, Nabi G, Hacker CE, Strelnikov II, Luan X. Increasing threats to snow leopard survival in Pakistan. Front Ecol Evol. 2022;10:818798.

    Article  Google Scholar 

  7. Kitchener A, et al. A revised taxonomy of the Felidae: the final report of the Cat Classification Task Force of the IUCN Cat Specialist Group. Cat News. 2017;11:1–80.

  8. Janecka J, et al. Range-wide snow leopard phylogeography supports three subspecies. J Hered. 2017;108:597–607.

    Article  PubMed  Google Scholar 

  9. Cho Y, et al. The tiger genome and comparative analysis with lion and snow leopard genomes. Nat Commun. 2013;4:1–7.

    Article  CAS  Google Scholar 

  10. Armstrong E, et al. Genome report: chromosome-level draft assemblies of the snow leopard, African leopard, and tiger (Panthera uncia, Panthera pardus pardus, and Panthera tigris). G3. 2022;12:p.jkac277.

    Article  Google Scholar 

  11. Figueiró H, et al. Genome-wide signatures of complex introgression and adaptive evolution in the big cats. Sci Adv. 2017;3:e1700299.

    Article  PubMed  PubMed Central  Google Scholar 

  12. Janecka J, et al. Noninvasive genetics and genomics shed light on the status, phylogeography, and evolution of the elusive snow leopard. Conserv Genet Mammals Integr Res Novel Approach. 2020;5:83–120.

    Article  Google Scholar 

  13. Korablev M, et al. Large-scale and fine-grain population structure and genetic diversity of snow leopards (Panthera uncia Schreber, 1776) from the northern and western parts of the range with an emphasis on the Russian population. Conserv Genet. 2021;22:397–410.

    Article  CAS  Google Scholar 

  14. Yang L, et al. Evolutionary conservation genomics reveals recent speciation and local adaptation in threatened takins. Mol Biol Evol. 2022;39:msac111.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Lan T, et al. Evolutionary history of enigmatic bears in the Tibetan Plateau-Himalaya region and the identity of the yeti. Proc R Soc B. 2017;284:1868.

  16. Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475:493–6.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Liu X, Fu YX. Stairway Plot 2: demographic history inference with folded SNP frequency spectra. Genome Biol. 2021;21:1–9.

    CAS  Google Scholar 

  18. Santiago E, et al. Recent demographic history inferred by high-resolution analysis of linkage disequilibrium. Mol Biol Evol. 2020;37:3642–53.

    Article  CAS  PubMed  Google Scholar 

  19. Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M. Robust demographic inference from genomic and SNP data. PLoS Genet. 2013;9:e1003905.

    Article  PubMed  PubMed Central  Google Scholar 

  20. Excoffier L, et al. fastsimcoal2: demographic inference under complex evolutionary scenarios. Bioinformatics. 2021;37:4882–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  21. Phillips SJ, Anderson RP, Schapire RE. Maximum entropy modeling of species geographic distributions. Ecol Model. 2006;190:231–59.

    Article  Google Scholar 

  22. McCarthy TM, Chapron G. Snow leopard survival strategy. International Snow Leopard Trust and Snow Leopard Network: Seattle; 2003. p. 105.

    Google Scholar 

  23. Network SL. Snow leopard survival strategy. Seattle Washington USA. 2014;1:145.

    Google Scholar 

  24. Peng Y, et al. Down-regulation of EPAS1 transcription and genetic adaptation of Tibetans to high-altitude hypoxia. Mol Biol Evol. 2017;34:818–30.

    CAS  PubMed  PubMed Central  Google Scholar 

  25. Hu Y, et al. Molecular mechanisms of adaptive evolution in wild animals and plants. Sci China Life Sci. 2023;66:453–95.

    Article  PubMed  PubMed Central  Google Scholar 

  26. Kaessmann H, et al. Extensive linkage disequilibrium in small human populations in Eurasia. Am J Human Genet. 2002;70:673–85.

    Article  CAS  Google Scholar 

  27. Kirin M, et al. Genomic runs of homozygosity record population history and consanguinity. PLoS One. 2010;5:e13996.

    Article  PubMed  PubMed Central  Google Scholar 

  28. Hu Y, et al. Genomic evidence for two phylogenetic species and long-term population bottlenecks in red pandas. Sci Adv. 2020;6:eaax5751.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Nigenda-Morales S, et al. The genomic footprint of whaling and isolation in fin whale populations. Nat Commun. 2023;14:5465.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Robinson JA, Brown C, Kim BY, Lohmueller KE, Wayne RK. Purging of strongly deleterious mutations explains long-term persistence and absence of inbreeding depression in island foxes. Curr Biol. 2018;28:3487–94.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  31. Khan A, et al. Genomic evidence for inbreeding depression and purging of deleterious genetic variation in Indian tigers. Proc Natl Acad Sci. 2021;118:e2023018118.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  32. Kleinman-Ruiz D, et al. Purging of deleterious burden in the endangered Iberian lynx. Proc Natl Acad Sci. 2022;119:e2110614119.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Robinson J, et al. The critically endangered vaquita is not doomed to extinction by inbreeding depression. Science. 2022;376:635–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Liu Y, et al. Genome-wide evolutionary analysis of natural history and adaptation in the world’s tigers. Curr Biol. 2018;28:3840–9.

    Article  CAS  PubMed  Google Scholar 

  35. Li J, et al. Climate refugia of snow leopards in High Asia. Biol Conserv. 2016;203:188–96.

    Article  Google Scholar 

  36. Li J, et al. Defining priorities for global snow leopard conservation landscapes. Biol Conserv. 2020;241:108387.

    Article  Google Scholar 

  37. Li X, et al. Potential range shift of snow leopard in future climate change scenarios. Sustainability. 2022;14:1115.

    Article  Google Scholar 

  38. Yao T, et al. Different glacier status with atmospheric circulations in Tibetan Plateau and surroundings. Nat Clim Change. 2012;2:663–7.

    Article  Google Scholar 

  39. Kalashnikova YA, Karnaukhov AS, Dubinin MY, Poyarkov AD, Rozhnov VV. Potential habitat of snow leopard (Panthera uncia, Felinae) in South Siberia and adjacent territories based on the maximum entropy distribution model. Зooлoгичecкий жypнaл. 2019;98:332–42.

    Google Scholar 

  40. Poyarkov A, et al. Assurance of the existence of a trans-boundary population of the snow leopard (Panthera uncia) at Tsagaanshuvuut-Tsagan-Shibetu SPA at the Mongoli-Russia border. Integr Zool. 2020;15:224–31.

    Article  PubMed  Google Scholar 

  41. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Seppey M, Manni M, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness. Gene Predict Methods Protoc. 2019;1962:227–45.

    CAS  Google Scholar 

  43. Wang X, Wang L. GMATA: an integrated software package for genome-scale SSR mining, marker development and viewing. Front Plant Sci. 2016;7:1350.

    PubMed  PubMed Central  Google Scholar 

  44. Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–80.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  45. Han Y, Wessler SR. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res. 2010;38:e199.

    Article  PubMed  PubMed Central  Google Scholar 

  46. Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35:W265–8.

    Article  PubMed  PubMed Central  Google Scholar 

  47. Bedell JA, Korf I, Gish W. MaskerAid: a performance enhancement to RepeatMasker. Bioinformatics. 2000;16:1040–1.

    Article  CAS  PubMed  Google Scholar 

  48. Sirén J, Välimäki N, Mäkinen V. Indexing graphs for path queries with applications in genome research. IEEE/ACM Trans Comput Biol Bioinf. 2014;11:375–88.

    Article  Google Scholar 

  49. Kovaka S, et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 2019;20:1–13.

    Article  Google Scholar 

  50. Haas B, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8:1494–512.

    Article  CAS  PubMed  Google Scholar 

  51. Haas B, et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003;31:5654–66.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Stanke M, et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 2006;34:W435–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  53. Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997;268:78–94.

    Article  CAS  PubMed  Google Scholar 

  54. Keilwagen J, et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 2016;44:e89.

    Article  PubMed  PubMed Central  Google Scholar 

  55. Haas B, et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 2008;9:1–22.

    Article  Google Scholar 

  56. Wu C, et al. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 2009;34:D187–91.

    Article  Google Scholar 

  57. Dabney J, et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc Natl Acad Sci. 2013;110:15758–63.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  58. Wang G, et al. Genomic approaches reveal an endemic subpopulation of gray wolves in Southern China. iScience. 2019;20:110–8.

    Article  PubMed  PubMed Central  Google Scholar 

  59. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–90.

    Article  PubMed  PubMed Central  Google Scholar 

  60. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Li H, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.

    Article  PubMed  PubMed Central  Google Scholar 

  62. McKenna A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  63. Danecek P, et al. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  64. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88:76–82.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  65. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19:1655–64.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35:1547.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  67. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–3.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  68. Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49:W293–6.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  69. Kondrashov AS, Crow JF. A molecular approach to estimating the human deleterious mutation rate. Hum Mutat. 1993;2:229–34.

    Article  CAS  PubMed  Google Scholar 

  70. Liu C, White M, Newell G. Selecting thresholds for the prediction of species occurrence with presence-only data. J Biogeogr. 2013;40:778–89.

    Article  Google Scholar 

  71. Zhou Y, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10:1523.

    Article  PubMed  PubMed Central  Google Scholar 

  72. Chen H, Patterson N, Reich D. Population differentiation as a test for selective sweeps. Genome Res. 2010;20:393–402.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  73. Sabeti P, et al. Genome-wide detection and characterization of positive selection in human populations. Nature. 2007;449:913–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  74. Zhang C, Dong SS, Xu JY, He WM, Yang TL. PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files. Bioinformatics. 2019;35:1786–8.

    Article  CAS  PubMed  Google Scholar 

  75. Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–75.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  76. Thompson EA. Identity by descent: variation in meiosis, across genomes, and in populations. Genetics. 2013;194:301–26.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. Dumont BL, Payseur BA. Evolution of the genomic rate of recombination in mammals. Evolution. 2008;62:276–94.

    Article  CAS  PubMed  Google Scholar 

  78. Zhang Q, Calus MP, Guldbrandtsen B, Lund MS, Sahana G. Estimation of inbreeding using pedigree, 50k SNP chip genotypes and full sequence data in three cattle breeds. BMC Genet. 2015;16:1–11.

    Article  Google Scholar 

  79. Cingolani P, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly. 2012;6:80–92.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  80. Vaser R, Adusumalli S, Leng SN, Sikic M, Ng PC. SIFT missense predictions for genomes. Nat Protoc. 2016;11:1–9.

    Article  CAS  PubMed  Google Scholar 

  81. Kiełbasa SM, Wan R, Sato K, Horton P, Frith MC. Adaptive seeds tame genomic sequence comparison. Genome Res. 2011;21:487–93.

    Article  PubMed  PubMed Central  Google Scholar 

  82. Robinson J, et al. Genomic flatlining in the endangered island fox. Curr Biol. 2016;26:1183–9.

    Article  CAS  PubMed  Google Scholar 

  83. Yang, L. et al. Genomic evidence for low genetic diversity but purging of strong deleterious variants in snow leopards. Dataset associated with this study. 2025. https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA022406.

  84. Yang, L. et al. Genomic evidence for low genetic diversity but purging of strong deleterious variants in snow leopards. Reference genome raw sequencing data including Nanopore, Hi-C, and Illumina reads. Genome Sequence Arch. 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA022084.

  85. Yang, L. et al. Genomic evidence for low genetic diversity but purging of strong deleterious variants in snow leopards. Whole genome resequencing raw data of 52 snow leopards. Genome Sequence Arch. 2025. https://ngdc.cncb.ac.cn/gsa/browse/CRA021509.

  86. Yang, L. et al. Genomic evidence for low genetic diversity but purging of strong deleterious variants in snow leopards. Reference genome assembly of snow leopard. NCBI. 2025. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1234234.

Download references

Acknowledgements

We thank Tingting Yin, Weikang Yang, Feng Xu, Ziling Gong, Puchi Dawa, Duifang Ma, Sheng Li, Shiping Mao, Yonglei Lv, Yanlin Liu, Xinming Lian, Wei Wei, Song Li, Dongdong Wu, Guowen Wang and Wei Wang for help with sample collection, Huiwen Liu and Yubo Hao for help with DNA extraction, Xinhai Li for providing the part of occurrence records of snow leopards, and Juan Li for discussing the population genetic structure together.

Peer review information

Wenjing She was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team. The peer-review history is available in the online version of this article.

Funding

This study was supported by the National Key Program of Research and Development of Ministry of Science and Technology (2023YFF1304800 and 2022YFF130150), the National Natural Science Foundation of China (32325010 and U24A20361), the Science and Technology Project of Xizang Autonomous Region, China (XZ202402ZD0005), the Third Xinjiang Scientific Expedition Program (2021xjkk0600), the Second Tibetan Plateau Scientific Expedition and Research Program (2019QZKK0501), Qinghai Science and Technology Major Program “Sanjiangyuan National Park Animal Genome Project,” Joint Grant from Chinese Academy of Sciences—People’s Government of Qinghai Province on Sanjiangyuan National Park (LHZX-2020–01), and Russian Geographical Society Grant (No. 49_2022-I_RGO). This study was also supported by the Animal Branch of the Southwest China Germplasm Bank of Wildlife, Chinese Academy of Sciences (the Large Research Infrastructure Funding), the National Animal Collection Resource Center, Institute of Zoology, Chinese Academy of Sciences, and the Joint Russian-Mongolian complex biological expedition of the RAS and ASM.

Author information

Authors and Affiliations

Authors

Contributions

Y.H. and F.W. conceived and supervised the project. Q.Y., A.P., M.K., V.R., J.A.H-B., X.Z., L.Y., D.A., Bar.M., Bay.M., L.M., S.M., S.H., Y.J., T.Z., G.W., Y.S. and Y.H. collected the samples and provided the occurrence records of snow leopards. L.Y., H.J., J.S., W.C. and X.D. performed the data analyses. Q.F. and Q.D. performed the DNA extraction and library construction of historical samples. Y.H., F.W., L.Y., and J.S. discussed the analysis results. Y.H., L.Y., H.J. and A.P. wrote and revised the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Fuwen Wei or Yibo Hu.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Institutional Animal Care and Use Committee, Institute of Zoology, Chinese Academy of Sciences, China (IOZ-IACUC-2020–116).

Competing interests

Bariushaa Munkhtsog is a member of Irbis Mongolia Center and declares no competing interests in this study. The other authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

13059_2025_3555_MOESM1_ESM.docx

Additional file 1. Contains extended results and methods for snow leopard genome assembly and annotation, as well as the DNA extraction and library construction of snow leopard historical samples

Additional file 2. Contains supplementary tables S1–S27 and their legends

Additional file 3. Contains supplementary figures S1–S13 and their legends

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yang, L., Jin, H., Yang, Q. et al. Genomic evidence for low genetic diversity but purging of strong deleterious variants in snow leopards. Genome Biol 26, 94 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13059-025-03555-0

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13059-025-03555-0

Keywords