- Research
- Open access
- Published:
Genomic signals of ecogeographic adaptation in a wild relative are associated with improved wheat performance under drought stress
Genome Biology volume 26, Article number: 35 (2025)
Abstract
Background
Prioritizing wild relative diversity for improving crop adaptation to emerging drought-prone environments is challenging. Here, we combine the genome-wide environmental scans (GWES) in wheat diploid ancestor Aegilops tauschii (Ae. tauschii) with allele testing in the genetic backgrounds of adapted cultivars to identify diversity for improving wheat adaptation to water-limiting conditions.
Results
We evaluate the adaptive allele effects in Ae. tauschii-wheat introgression lines phenotyped for multiple traits under irrigated and water-limiting conditions using both unmanned aerial system-based imaging and conventional approaches. The GWES show that climatic gradients alone explain more than half of genomic variation in Ae. tauschii, with many alleles associated with climatic factors in Ae. tauschii being linked with improved performance of introgression lines under water-limiting conditions. We find that the most significant GWES signals associated with temperature annual range in the wild relative are linked with reduced canopy temperature in introgression lines and increased yield.
Conclusions
Our results suggest that introgression of climate-adaptive alleles from Ae. tauschii has the potential to improve wheat performance under water-limiting conditions, and that variants controlling physiological processes responsible for maintaining leaf temperature are likely among the targets of adaptive selection in a wild relative. Adaptive variation uncovered by GWES in wild relatives has the potential to improve climate resilience of crop varieties.
Background
The wild relatives of modern crops are a valuable source of adaptive diversity for developing improved varieties [1,2,3]. However, only a small fraction of wild relative diversity from the germplasm collections is utilized in crop improvement programs. The size of these collections, which may include thousands of accessions, complicates the selection of the most relevant genotypes for improving traits of interest. The prioritization of genebank germplasm for breeding climate adapted varieties is especially challenging due to the polygenic nature of adaptation to environments [4, 5]. Therefore, the development of effective strategies, which are aimed at prioritizing wild relative germplasm for breeding varieties adapted to local climates, remains critical [6].
The allopolyploid bread wheat, the second most important crop worldwide, originated by the hybridization of three wild grass species from the Triticum and Aegilops genera [7,8,9,10,11,12,13]. Since its origin 10,000 years ago, wheat was disseminated by human migration and trade to diverse geographic regions with distinct climatic conditions [14]. Archeological records and analyses of ancient DNA samples suggest that wheat reached Britain about 8000 years ago (YA) [15] and China and Africa about 3000 YA [16]. Selection for performance in these diverse environments enriched local wheat populations for alleles contributing to adaptation to new climatic conditions [17, 18]. However, the future climate change scenarios predict that climatic conditions in many wheat growing areas could be outside of the adaptive range of existing genotypes and lead to severe yield reduction [19]. The results of climate modeling suggest that nearly 40% of crop growing areas might require new varieties better adapted to new environmental conditions to sustain crop production [20, 21]. Leveraging the genetic diversity of multiple wild ancestors of wheat, which evolved to grow in diverse environments, is one of the promising strategies for broadening the climate adaptive potential of modern wheat.
The direct ancestors of wheat, Aegilops tauschii and Triticum turgidum ssp. dicoccoides (wild emmer), are two most broadly distributed species among the wheat wild relatives [7, 10, 11, 13, 22, 23]. The signatures of extensive ancient introgression from these wild ancestors were detected across the wheat genome [17, 24, 25]. These ancient introgressions increased diversity genome-wide and contributed to environmental adaptation [17]. Both Ae. tauschii and T. turgidum are also among the most represented in germplasm collections, some of which host thousands of accessions of these species [26]. Because these wild ancestors of wheat share homologous genomes, their chromosomes could easily recombine, facilitating introgression of allelic diversity from Ae. tauschii and wild emmer into wheat [27, 28]. The direct introduction of allelic diversity from these wild ancestors into international wheat breeding programs [29, 30] performed using the synthetic hexaploid wheat (SHW) lines, which are hybrids of tetraploid wheat and Ae. tauschii, demonstrated their potential to improve adaptation to water limiting conditions and heat, and increase biomass and harvest index [31, 32]. However, considering the broad adaptive range of these species reflected in their wide geographic distribution, the question remained of how effective these prior efforts were at capturing adaptive diversity represented in Ae. tauschii and wild emmer.
The prioritization of wild relative accessions for developing climate resilient crops remains challenging. Wild relatives can be phenotypically pre-screened for target traits [33, 34]. However, this screening could be performed only for simple traits and has limited utility for complex adaptive traits if phenotypic evaluation was not performed in the genetic background of adapted cultivars. Another approach to prioritize accessions is the development of “core collections” assembled from genotypes selected to maximize the genetic diversity of the sample [35]. While this strategy could effectively reduce the number of accessions, its major disadvantage in application to large collections is that it targets only common adaptive alleles, removing locally adaptive alleles or allelic complexes. The third approach is based on selection of wild relative accessions based on environmental parameters at the site of accession’s origin [28, 36, 37]. However, while this strategy could capture alleles contributing to an adaptive phenotype of small number of accessions collected at specific location, its ability to recover adaptive genetic diversity at species-wide level would be limited.
A combination of genome diversity analyses with the geographic patterns of environmental variation for detecting adaptive diversity is another strategy that so far had limited usage in crop breeding. The cost-efficiency of next-generation sequencing (NGS) genotyping approaches made possible generating genome-wide variation for geographically diverse populations. By combining genomic data with ecogeographic (climatic, bioclimatic and geographic) variables, it became possible to identify alleles associated with adaptive phenotypes. These approaches, referred to as genome-wide environmental scans (GWES), identify loci involved in local adaptation based on a high correlation between allele frequencies and eco-geographic variables. In an early GWES study, a number of climate-associated alleles (CAA) were mapped in Arabidopsis by using 13 climatic variables, among others including extremes and seasonality of temperature and precipitation [38]. The CAA identified by GWES allowed for accurate prediction of the relative fitness of Arabidopsis accessions in local environments [38,39,40]. The GWES in sorghum and Mexican white oak detected adaptive variants that also produced reliable phenotypic predictions [37, 41]. These studies suggest that adaptive alleles identified using the GWES have potential to predict agronomic phenotypes in target environments.
Though GWES were shown to be effective at identifying loci contributing to environmental adaptation, it remains unclear whether these loci could be used to prioritize wild relative accessions for introgression into modern crop varieties to improve their adaptive potential in extreme environments. To address this question, we used a diverse collection of Ae. tauschii to conduct GWES and identified variants contributing to climatic adaptation. We specifically focused on those variants that correlate with precipitation and temperature gradients during growth season. Then, we selected a geographically diverse set of Ae. tauschii accessions to develop introgression populations by crossing them with the adapted wheat varieties. The developed introgression population was grown for several seasons across diverse environments, and agronomic performance of introgression lines (ILs) was assessed by measuring agronomic and physiological traits. The physiological status and growth of ILs were evaluated using the UAS-based phenotyping with the RGB and thermal cameras. The wheat productivity was assessed by measuring yield and yield component traits (thousand grain weight, grain area, grain width, and grain length). The relationship between phenotypic data and climate adaptive alleles introgressed from Ae. tauschii was investigated to better understand the value of GWES in wild relatives as a tool for selecting wild relative accessions to improve the adaptive potential of wheat varieties.
Results
Genome-wide scans for environmental adaptation signals in Aegilops tauschii
The genetic basis of Ae. tauschii adaptation to diverse climatic conditions across a broad geographic range extending from Eastern Europe to China remains poorly understood. To identify genetic loci contributing to adaptation, we conducted genotype-environment association analyses using a subset of 109,627 LD-pruned SNPs (r2 < 0.5) identified in a geo-reference panel of 137 accessions (Additional File 1: Table S1). This set of SNPs was selected from a larger set including 6,365,631 genotyped and imputed SNPs. Using this data, we assigned each accession to the previously identified four main lineages (L1W, L1E, L2E, L2W) of Ae. tauschii [23] (Fig. 1A), where L1 was composed of Ae. tauschii ssp. tauschii and L2 included accessions of Ae. tauschii ssp. strangulata, the closest ancestor of the wheat D genome (Additional File 2: Fig. S1A). The first two principal components, separating L1 and L2 lineages, accounted for 78.4% of the variation in our population (Additional File 2: Fig. S1B).
Ecogeographic distribution of 137 diverse Ae. tauschii accessions. A Map shows the geographic locations and the ancestry coefficients of 137 accessions. Twenty-one accessions used to develop the Ae. tauschii-wheat introgression population are shown in blue. B Venn diagram showing the proportions of SNP variance explained by climate (Clim) and geographic (Geo) factors. C Redundancy analysis (RDA) plot showing the 11 best explanatory variables for SNP variance in Ae. tauschii accessions including temperature annual range (bio7), precipitation of driest month (bio14), precipitation in May (prec_5), minimum temperature in April, November and December (tmin_4, tmin_11 and tmin_12, respectively), maximum temperature in May, June, July, and September (tmax_5, tmax_6, tmax_7 and tmax_9, respectively), and longitude (lon). D Heatpoint map showing the variation in temperature annual range (bio7) at the sampling locations of the Ae. tauschii accessions. E Boxplots showing the variation in lon, bio7, bio14, tmax_5, tmax_6, tmax_7, tmax_9, tmin_4, tmin_11, and prec_5 in the ecogeographic locations of different Ae. tauschii lineages. Lineage 1 East (L1E) and Lineage 1 West (L1W) belong to Ae. tauschii ssp. tauschii whereas Lineage 2 East (L2E) and Lineage 2 West (L2W) belong to Ae. tauschii ssp. strangulata
To identify variants contributing to local adaptation, we modeled the relationship between response variables (SNPs) and explanatory variables (climatic and bioclimatic factors, and geographic distance) using redundancy analysis (RDA) [37, 42, 43]. For this purpose, we used historical data for the bioclimatic and climatic factors estimated for the geographic locations at the accession collection sites. The total SNP variance explained by both geographic distance and climatic and bioclimatic factors was 85.2% (Fig. 1B), with the adjusted R2 value being 69.6%. Climate alone accounted for 57.8% of the SNP variation in Ae. tauschii, whereas geographic distance between accessions and the interaction between geographic distance and climate accounted for 5.6 and 11% of SNP variation, respectively. These results indicate that the distribution of SNP variation among Ae. tauschii accessions is primarily driven by gradient in climatic and bioclimatic factors rather than by Ae. tauschii geographic dispersal.
Depending on their impact on adaptive traits, individual climatic factors could have distinct effects on SNP variation [37, 38]. A triplot with two RDAs shows separation of the population into four distinct groups (Additional File 2: Fig. S1C) coinciding with the previously detected split between the L1E, L1W, L2E, and L2W sub-lineages. The first two RDAs based on the 11 best explanatory ecogeographic variables accounted for 57.45% of SNP variance in Ae. tauschii accessions (Fig. 1C). Among the geographic variables, longitude and altitude showed the strongest effect on SNP distribution between the two subspecies of Ae. tauschii followed by latitude. The first two RDAs based on geographic variables alone accounted for 22.58% of the total SNP variation in the population (Additional File 2: Fig. S1D). The ecogeographic variables contributing most to SNP variation were determined using the “ordiR2step” function in R package “vegan” using the forward selection method and 10,000 permutations [44]. A total of 11 variables were detected, including temperature annual range (bio7), precipitation of driest month (bio14), precipitation in May (prec_5), minimum temperature in April, November, and December (tmin_4, tmin_11 and tmin_12, respectively), and maximum temperature in May, June, July, and September (tmax_5, tmax_6, tmax_7 and tmax_9, respectively), and longitude (lon) (Table 1, Fig. 1C). The tmin_11, tmin_12, bio7, and bio_14 contributed most to RDA1 that explains most of the genetic differentiation between the two subspecies of Ae. tauschii. The prec_5, tmin_4, tmax_5, tmax_6, tmax_7, and tmax_9 factors contributed to RDA2 that explains most of the genetic differentiation between the Eastern (L1E, L2E) and Western (L1W, L2W) populations of the L1 and L2 Ae. tauschii lineages. These results indicate that temperature and precipitation gradients during the growth periods coinciding with flowering, grain filling, and maturation were the main factors that shaped SNP diversity in Ae. tauschii and likely contributed to genetic differentiation among the four sub-lineages.
Two subspecies of Ae. tauschii, ssp. tauschii (L1 lineage) and ssp. strangulata (L2 lineage) appear to show different levels of adaptation to distinct ecogeographic conditions. We compared the distribution of the main ecogeographic factors (lon, bio7, bio14, tmax_5, tmax_6, tmax_7, tmax_9, tmin_4, tmin_11, tmin_12, and prec_5) between the two subspecies of Ae. tauschii (Fig. 1E). Analysis of variance showed significant differences between the accessions from these subspecies (Table 2). Results suggest that L1E sub-lineage is adapted to warmer and drier conditions of Eastern Iran, Afghanistan, Turkmenistan, Uzbekistan, Tajikistan, and Kyrgyzstan (Additional File 1: Table S1), indicating that these Ae. tauschii accessions could be a good source of drought and heat stress tolerance. The lowest precipitation of driest month characterized by high maximum temperature from May up to September was one of the major differentiating ecogeographic factors for L1E. In contrast, L1W is represented by accessions mostly from Eastern Turkey and Northwestern Iran where a high precipitation is recorded in May and significantly lower maximum temperature from May to September. The factors that contribute most to the genetic differentiation of this sub-lineage are tmin_4, tmax_5 and prec_5.
Generally, Ae. tauschii ssp. strangulata sub-lineages L2E and L2W are found around the Caspian Sea with some accessions found in Syria and Turkey. L2E is adapted to a relatively uniform precipitation and mild temperature which are characteristic of the Southern Caspian Sea in Northern Iran. The L2W accessions are mostly found near Western Caspian Sea in Azerbaijan and parts of Northwestern Iran. The region is characterized by moderate to high variation in climatic and bioclimatic factors. Among the main variables differentiating L2W from other sub-lineages are tmin_4 and prec_5 (Table 2).
Mapping adaptive SNPs in Ae. tauschii
The results of both redundancy analysis (RDA) and genome-wide association mapping (GWAS) were combined to identify SNPs that are significantly associated with variation in ecogeographic variables across the Ae. tauschii sampling locations. The first three RDA loadings for each SNP were extracted from the RDA model, transformed to Z-scores, and climate associated alleles (CAA) were defined as outlier SNPs with three standard deviations from the mean Z-score. After removing duplicate SNPs, a total of 10,149 D-genome SNPs showed a positive correlation (mean = 0.58, range 0.14–0.82) with 51 out of 58 variables analyzed in our study (Additional File 3: Table S2, Additional File 4: Table S3).
Temperature annual range (bio7) was ranked as the most significant variable accounting for 27.7% of the SNP variation in Ae. tauschii population based on the adjusted R2 values (P < 0.001). It had the highest number of correlated SNPs (747) among other most significant variables. Chromosomes 1D and 7D had the highest number of SNPs showing significant correlation with ecogeographic variables (Fig. 2A). Genome-wide association mapping is another approach that was previously used for studying genome-by-environment interactions [45]. By using a compressed mixed linear model, we identified a total of 10,569 D-genome SNPs significantly associated with 42 out of 58 variables (Fig. 2B, Additional File 3: Table S2). Most of these variants were located on chromosomes 1D and 5D (Fig. 2A and Additional File 5: Table S4). Unlike in RDA analysis, where each SNP was assigned to a single highly correlated variable, in GWAS, many SNPs were associated with more than one variable at FDR ≤ 0.05. Combined, RDA and GWAS identified 18,096 SNPs with significant association to ecogeographic variables (Additional File 6: Table S5). Among the SNPs identified, a set of 2622 SNPs (14.5%) were detected using both methods (Fig. 2B, Additional File 7: Table S6). The functional annotation of these SNPs using SnpEff [46] showed that only 29 of them were stop codon gain, missense, synonymous, intronic, or splice region variants (Additional File 7: Table S6). Most SNPs (2316) were intergenic variants, and 277 SNPs were located 5 kb upstream or downstream of gene models.
Number of climate associated SNPs per chromosome identified by the redundancy analysis (RDA) and genome-wide association analysis (GWAS), and the geographical distribution of SNP alleles associated with precipitation of driest month (bio14). A Chromosome distribution of SNPs identified through RDA and GWAS that were associated with different ecogeographic variables. B Venn diagram showing the total number of climate associated SNPs identified by RDA and GWAS. C Manhattan plot showing GWAS for four of the most significant variables: minimum temperature in April (tmin_4), maximum temperature in May and June (tmax_5 and tmax_6) and bio14. The black line shows an FDR threshold of 0.05. D Ae. tauschii (AeT) specific allele (yellow) on chromosome 1D showing adaptation to areas with high precipitation of driest month near the Caspian Sea. E Ae. tauschii-specific allele (yellow) on chromosome 1D showing adaptation to areas with a wide range of reduced precipitation in the driest month. The purple color shows the reference allele in cv. Chinese Spring (CS)
Consistent with a pioneering study in Arabidopsis that investigated genomic basis of adaptation to climate [38], the geographic extent of CAAs in Ae. tauschii can primarily be explained by the distribution of climatic and bioclimatic factors. For example, among SNPs significantly associated with bio14, there is an Ae. tauschii allele (chr1D_322068171, here and henceforth, SNP positions are shown relative to Chinese Spring RefSeq v1.0) that was found only in accessions from the region near the Caspian Sea (Fig. 2D), which shows a high precipitation of driest month (Table 1). Another bio14-associated SNP (chr1D_322557976) had an allele identified in accessions from a large geographic region characterized by low precipitation of driest month and experiencing extreme drought stress due to high temperature from May up to September (Fig. 2E, Table 1). Our results indicate that Ae. tauschii is a source of adaptive alleles to a broad range of climatic factors that could be useful for mitigating the impact of climate change on wheat productivity.
Evaluation of the adaptive potential of Ae. tauschii CAAs in winter wheat
To evaluate the ability of CAAs identified in Ae. tauschii to improve the adaptive potential of bread wheat, we assessed their effects on agronomic performance traits in the genetic backgrounds of wheat cultivars adapted to grow in the Great Plains of North America. For this purpose, we developed a wild relative introgression population using a set of 21 diverse Ae. tauschii accessions that were selected to capture the ecogeographic and allelic diversity of species [27, 47]. To facilitate comparison with parental lines, introgression lines (ILs) were selected to match development and phenology of hexaploid wheat parents [27]. Out of 18,096 climate adaptive SNPs identified by RDA and GWAS, 31.4% (5675 CAA SNPs) were present in the introgression population. The loss of some of the CAAs in introgression population could be attributed to their linkage with alleles that were either deleterious or strongly affected plant development and phenology and, therefore, were selected against during population development (Fig. 3). Among the introgressed CAAs, a total of 1089 SNPs were detected using in both RDA and GWAS.
Frequency spectra of climate associated alleles (CAA) in the tails of phenotype distribution relative to the allele frequency spectra in the whole population (WP). A The left and right panels show the 5th and 95th percentiles of the yield, spikelet number per spike (SNS), deviation in days to heading (DDTH) from the control mean DTH, thousand grain weight (TGW), grain length (GL), grain width (GW), canopy temperature (CT), and normalized difference vegetation index (NDVI). B Chromosome distribution of allele counts for CAA differentiated in frequency between the upper and lower tails of phenotype distribution. Differentiation in allele frequency was determined by calculating the fold change (FC) in allele frequency between WP and ILs from the tails. CAAs with FC ≥ 2 were considered to be strongly differentiated. The CAAs detected for most significant ecogeographic variables were included into the analyses. Each chromosome arm was split into three bins with each bin representing 33.3% of arm length from centromere (Cent). The mean counts of CAAs in each bin across all chromosomes in the wheat genome were combined. The left and right panels show chromosome distributions for differentiated CAAs in the 5th and 95th percentile tails of phenotype distribution, respectively
Introgression of adaptive alleles can occur in either high or low recombining regions of the genome. While introgressed segments in high recombining regions get shortened after a few generations of recombination, those in the low recombining regions tend to persist across many generations as large linkage blocks. The adaptive SNPs from the large introgression segments in the pericentromeric regions of chromosomes could be linked with deleterious alleles or alleles involved in epistatic interaction with genetic background of a parental line. This linkage can impact the efficiency of selection for traits associated with adaptive SNPs [48]. When trait selection is applied, the frequency of adaptive alleles in the high recombining regions usually increases, whereas the frequency of adaptive alleles in low recombining regions may either (i) increase if SNP effect on an adaptive trait is stronger than the combined negative effects of alleles linked with adaptive SNP or (ii) decrease if the combined negative effects of linked alleles are stronger than the effects of adaptive SNPs. In the populations of individuals, differences in allele frequency for trait-associated SNPs are most pronounced in the tails of trait distributions. To assess association of CAAs with variation in wheat agronomic performance traits, we compared the frequency spectra of CAAs derived from Ae. tauschii in the tails of phenotype distributions in the introgression population. Our expectation was that if CAAs affect wheat performance, the upper and lower tails of distribution for yield and yield component, canopy temperature (CT), and deviation in days to heading (DDTH) traits should show distinct CAA frequency spectra. If CAAs are not associated with wheat performance, the CAA frequency spectra in the tails of trait distributions should not be different from that in the whole population (WP). For this purpose, we ranked the ILs based on trait values and compared the CAA frequency spectrum in WP with the CCA frequency spectra in the lower and upper 5th percentile tails of trait distributions (Fig. 3A). The frequency spectra were generated for the 5675 CAAs in the introgression population by counting Ae. tauschii alleles in the 9 allele frequency bins. Lines in the tails of the trait distribution were filtered to retain only those that ranked in the same percentile group for at least two traits.
A significant shift from the mean CAA frequency (0.226) in the WP was observed in the tails of phenotype distributions for many traits (Fig. 3A). The CAA frequency spectrum in the grain yield distribution tails was significantly different from WP mean (Kolmogorov–Smirnov test: lower tail P < 2.2e − 16; upper tail P < 2.2e − 16). In the lower tail of yield distribution, the CAA frequency was shifted to higher values relative to mean, whereas in the upper tail with high yielding ILs the majority of CAAs had frequency below WP mean. These two opposing trends are likely associated with over- or under-representation of ILs with large introgression segments in the lower or upper tails, respectively. These large introgressions would carry multiple CAAs that are in LD with deleterious or epistatically interacting alleles having yield penalty, and therefore, over- or under-represented, respectively, in the tails of trait distribution. However, the presence of a second CAA frequency peak above the WP mean in the upper tail of yield distribution indicates that some CAAs are over-represented in high-yielding ILs and associated with improved wheat performance.
Yield is a complex trait modulated by changes in days to heading (DTH) and yield component traits, such as spikelet number per spike (SNS), thousand grain weight (TGW), grain area (GA), grain width (GW), and grain length (GL). Previous studies have shown a positive relationship between yield and SNS [49, 50]. Likewise, increase in DTH positively correlated with SNS and yield [47, 51]. The patterns of CAA frequency shifts in the tails of distributions for these traits were consistent with the established relationships between DTH, SNS, TGW, and yield. The ILs with high SNS showed significant increase in the frequency of many CAAs (Fig. 3A), whereas lines with low spikelet number per spike carried mostly low-frequency CAAs (Kolmogorov–Smirnov test: lower tail P < 2.2e − 16; upper tail P < 2.2e − 16). The shift in the CAA frequency spectrum for the DDTH, TGW, and GW distributions from the WP mean followed the same pattern as that observed for SNS (Kolmogorov–Smirnov test: lower tail P < 2.2e − 16; upper tail P < 2.2e − 16). Combined, these results suggests that a substantial portion of CAAs is associated to positive changes in yield component traits and heading date, and likely contribute to increased yield in ILs.
Canopy temperature (CT) is a physiological trait that can be used to identify genotypes adapted to grow in water-limiting environments [52, 53]. CT can be linked to QTLs that control root architecture necessary for improved water use efficiency and maintenance of transpiration rate [54]. In our data, CT negatively correlated with yield under drought stress (r = − 0.45, P ≤ 0.01), which agreed with the previous studies that showed the importance of CT depression for increasing yield in wheat [32, 55, 56]. While in the low CT tail, most CAAs had lower than average allele frequency, there were still many CAAs that significantly increased in frequency compared to population mean, suggestive of their association with low CT.
We also measured normalized difference vegetation index (NDVI), which often is used to assess wheat growth vigor and predict yield [57]. The significant shifts in CAA frequency relative to WP mean were detected in the NDVI distribution tails (Fig. 3A). The ILs with high NDVI tended to carry CAAs whose frequency was above the WP mean, indicating that some CAAs could be associated with the positive impact on this vegetation index. The finding of CAAs showing strong shift in the lower tails of both traits suggest that some CAAs or linked introgression variants could be associated with the negative impact on NDVI too.
As mentioned above, the negative impact of linkage drag on selected alleles and associated traits should decrease with an increase in the recombination rate [48]. To understand the potential impact of linkage drag on CAAs in introgression lines, we plotted the counts of CAAs that show significant shifts in alleles between the trait tails and the WP mean along the wheat chromosomes (Fig. 3B). For CAAs linked with trait variation, we expected to observe higher frequency differentiation between the lines in the phenotypic tails in the high-recombining terminal regions of chromosomes rather than in the low-recombining pericentromeric regions. Consistent with this expectation, by plotting CAAs identified for the eleven most significant ecogeographic variables along the chromosomes, we show that differentiated CAAs are enriched in the high-recombining regions of chromosomes (Fig. 3B). These results suggest that phenotypic differentiation in the tails of trait distribution is associated with differentiation in CAA frequency and likely contributed to improved performance of some ILs.
CAAs are linked with variation in adaptive traits in the introgression population
Association analyses were performed in the introgression population using a set of 5675 climate-adaptive SNPs to determine variants that contribute to improved performance of introgression lines under the water-limiting and irrigated conditions. The CT, DDTH, yield, and yield component traits were used as plant performance metrics. Significant CAA-trait associations with DTH, SNS, GW. and GL were observed in the introgression population (Additional File 8: Table S7, Additional File 2: Figs. S2 and S3), suggesting that Ae. tauschii haplotypes harboring adaptive variants are also associated with variation in flowering time and yield component traits. Considering the importance of CT for assessing the physiological response of plants to drought stress [53, 54], more in-depth analyses were performed for CAAs and CT.
The CT data were collected using unmanned aerial system (UAS)-based thermal imaging from both irrigated and non-irrigated field trials at multiple time points during the growing season in 2018 and 2019. Variation in CT was influenced by both genotype and environment (Table 3). In the 2018 growing season, narrow sense heritability (h2) for CT varied between 0.54 and 0.85 whereas in 2019 it ranged from 0.24 to 0.78. In the 2019 growing season, residual variance in CT was much higher than due to genotype effect. This could be linked to the fact that in 2019, location in Colby (KS) experienced high precipitation and low temperature conditions during the growing season reducing difference between irrigated and non-irrigated trials. Best linear unbiased predictors for spatially corrected CT varied from − 0.57 to 1.8 suggesting that introgression from Ae. tauschii reduced CT relative to checks and population mean (Fig. 4A). A comparison of yield and CT showed a strong negative relationship with CT accounting for 30% of yield variation in the Ae. tauschii introgression population (Fig. 4B). This result was confirmed by performing phenomic predictions using the random forest model with CT and yield component traits as predictors of yield (Additional File 2: Fig. S4). These analyses showed that CT is the most significant factor for predicting yield followed by grain length and thousand grain weight in this population, consistent with previous observations [58].
Relationship between canopy temperature (CT) and yield performance of the introgression lines, the genomic loci associated with CT and origin of Ae. tauschii providing CT-lowering alleles in hexaploid wheat background. A CT distribution for the introgression population at Colby in 2018 and 2019 growing seasons. The shaded tails represent the 5th and 95th percentiles. B Regression of yield on CT. C Yield of ILs showing low CT (CTL) and high CT (CTH) relative to recurrent parents CT (CTRP). Different letters on top of the box plots indicate significant differences at 95% confidence level. D–F Manhattan plots showing the quantitative trait nucleotides (QTN) associated with CT, where D and E are based on CAA SNPs tested using the MLMM and BLINK models, respectively, and F is based on LD-pruned SNPs tested using the MLMM model. The red line in the Manhattan plot corresponds to FDR threshold of 0.05. F–H Heatpoint maps showing temperature annual range (bio7), min temperature in the coldest month (bio6) and mean temperature in coldest quarter (bio11) in the geographic origin of Ae. tauschii accessions. Accessions in red circles are the parents for introgression lines with low CT. I–K Distribution of bioclimatic variables (bio7, bio6, and bio11) in geographical origin of 137 Ae. tauschii accessions. Orange dots represent Ae. tauschii accessions used to generate introgression lines that rank in the 5th percentile for CT distribution
To further understand the impact of introgressed alleles from Ae. tauschii into hexaploid wheat background on yield, we identified ILs in the 5th and 95th percentiles of CT distribution and compared them to recurrent parents. The average yield for ILs in the 5th percentile of CT was 50 bpa, which was higher but not significantly different from the recurrent parents (48 bpa, P = 0.825). The lack of significant difference could be attributed to high variation in yield in ILs showing low CT, suggesting that other genetic factors could contribute to final yield. The introgression lines in the 95th percentile, however, showed significant yield reduction (39 bpa) relative to the recurrent parents (Tukey HSD, P < 0.008), confirming the importance of CT trait for predicting grain yield in wheat.
To identify the genomic loci associated with CT depression, we performed GWAS using genotypes at CAAs [59, 60]. The multiple-locus mixed linear model (MLMM) and Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK) revealed significant SNP-trait associations on chromosomes 1D, 2D, 4D, and 7D (Fig. 4D–F, Additional File 9: Table S8). Some of the most significant SNPs associated with CT on chromosomes 1D and 4D (chr1D_265243957 and chr4D_154179120) showed the highest correlation with temperature annual range (bio7, r = 0.59). Based on the RDA analyses, bio7 was identified as the most significant variable shaping SNP variation in Ae. tauschii. Other SNPs significantly associated with CT in introgression population correlated with the minimum temperature in the coldest month (bio6, r = 0.67), mean temperature in the coldest quarter (bio11, r = 0.6), longitude, and precipitation in the driest months (August to October) (Additional File 9: Table S8).
Besides using only SNP sites with CAAs for GWAS, 5.3 million SNPs from Ae. tauschii introgression population were pruned based on LD resulting in 99,529 SNPs with r2 ≤ 0.5. When GWAS was performed using this set of SNPs, the MLMM model revealed three regions that were associated with CT, including one on chromosome 1D and two on 4D. The significant SNPs on 1D were chr1D_254646871 and chr1D_265399733 (Fig. 4F, Additional File 9: Table S8). Although these SNPs were not part of the SNP set detected in the environmental scans of Ae. tauschii accessions, they were located within the same genomic intervals identified by genome-wide association mapping in the introgression population using the climate-associated SNPs. Within the 254–266-Mb interval, there were 191 CAAs (Additional File 4: Table S3) that highly correlated with mean diurnal range (bio2), temperature annual range (bio7), mean temperature of driest quarter (bio9), minimum temperature in March (tmin_3), precipitation in September (prec_9), and longitude (lon). Similarly, the first QTL on 4D contains SNP chr4D_92689640 (Additional File 9: Table S8), and in the interval 90–94 Mb on 4D, four CAA SNPs identified in Ae. tauschii accessions were strongly correlated with bio7. The second QTL contains SNP chr4D_229986737 (Additional File 9: Table S8), and a search for CAA SNPs 5 Mb to the left and right of the significant QTN identified 31 SNPs that were correlated with prec_9, tmin3, and lon variables.
The source of alleles lowering CT in the introgression lines were mostly from Ae. tauschii ssp. tauschii accessions (TA2388, TA2536, TA2521, and TA10177), collected from areas such as Afghanistan, Iran, and Pakistan known for high bio7 (43), on average with nearly no precipitation in the driest quarter of the year (Fig. 4G, J). These results suggest that Ae. tauschii growing in high temperature and low precipitation conditions could improve wheat adaptation to water-limiting conditions when introgressed into adapted wheat background.
Discussion
The lineages of Ae. tauschii are spread over a large geographic area with a wide range of variation in climatic factors, including some of the locations with extremely dry and hot environments [7, 22, 23]. The existence of strong SNP-climate correlations reported here provides effective means for detecting climate adaptive variants in diploid Ae. tauschii using environmental genome scans. The range of geographic distribution for adaptive variants varied broadly with some alleles showing narrow geographic distribution, and other alleles showing broad distribution across large geographic areas. Consistent with prior studies, these spatial patterns of allele distribution can primarily be explained by the distribution of climatic factors [37, 38, 61]. In our analyses the climatic (climatic and bioclimatic) factors alone accounted for a substantial proportion (57.8%) of spatial genetic variation in Ae. tauschii with relatively small contribution from geographic dispersal (5.6%). These results suggest consistent environmental gradients across the Ae. tauschii distribution range likely shaped the spatial structure of genomic variation in this wild ancestor of wheat and contributed to genetic differentiation between its main lineages.
Correlations between genomic diversity and climatic factors indicate that the temperature and precipitation gradients during the growth periods coinciding with flowering, grain filling, and maturation contributed to genetic differentiation among the four main sub-lineages of Ae. tauschii and its two subspecies, strangulata and tauschii. The lowest precipitation in driest month was one of the major differentiating ecogeographic factors for L1E, whereas temperature and precipitation gradients in April and May contributed most to the genetic differentiation of L1W sub-lineage. The variation in precipitation and temperature in April and May were among the main ecogeographic factors explaining differentiation between the L2W and L2E sub-lineages of Ae. tauschii ssp. strangulata. The sub-lineage L2E of Ae. tauschii ssp. strangulata, which contributed 10,000 years ago to the origin of bread wheat [23], grows in a narrow geographic region south of the Caspian Sea with limited variation in climatic factors characteristic of the humid mild subtropical environments. As a result, adaptive diversity captured by the D genome of bread wheat is primarily restricted to those alleles that are represented in this region. The limited levels of gene flow detected between wheat and Ae. tauschii ssp. strangulata did not have dramatic impact on the genetic diversity of the D genome [17, 22,23,24]. Thus, the polyploidization bottleneck associated with wheat origin resulted in not only the overall loss of genetic diversity in the wheat D genome [17, 22, 62] but also in the massive loss of adaptive alleles represented in all four sub-lineages of Ae. tauschii. While the consequences of the loss of these alleles in wheat are hard to predict, we might expect that it had a negative impact on the adaptive potential of hexaploid wheat and offset progress with development of drought-resilient wheat varieties.
Introgression from Ae. tauschii into hexaploid wheat had positive effects on traits playing an important role in increasing crop productivity and improving adaptation to drought stress. In our previous study, we showed that 3.2% of introgression lines carrying Ae. tauschii haplotypes outperformed parental lines in drought trials [47]. Consistent with these results, several high-yielding drought tolerant cultivars have been derived from synthetic wheat lines created using Ae. tauschii as one of the parents in the past [32, 54, 63]. Our analyses suggest that improved performance of wheat introgression lines could be largely attributed to introduction of climate-adaptive alleles that show association with environmental variation in Ae. tauschii. Statistically significant shifts in allele frequency in the extreme tails of phenotypic trait distributions and significant associations detected for climate-adaptive alleles in GWAS for canopy temperature and productivity traits support this conclusion.
The signatures of adaptation detected by environmental scans in genome can be driven by complex historic gradients of environments varying in space and time, and therefore, associated with diverse adaptive mechanisms [61, 64]. As a result, usually it is difficult to establish relationships between adaptive alleles from environmental scans and specific phenotypic traits measured for introgression populations in field trials. However, in our study, the temperature annual range (bio7) was among the main bio-climatic factors that contributed most to shaping the spatial genomic variation in Ae. tauschii. The SNP locus located on chromosome 4D and showing strongest association with bio7 in environmental scans also showed strongest association with variation in canopy temperature in introgression lines. This result indicates that among the targets of selection imposed by variation in bio7 are variants associated with pathways controlling physiological processes responsible for maintaining canopy temperature under drought stress [65]. In the field trails, lines carrying Ae. tauschii alleles at loci associated with reduced canopy temperature were among the top yielding introgression lines suggesting that these Ae. tauschii alleles improve adaptation to drought stress and likely act as the main drivers of increased yield. Likewise, detection of Ae. tauschii introgression into chromosome 6D associated with reduction in canopy temperature and increased yield under dry conditions, confirms the importance of this adaptive mechanism for drought tolerance [32, 52]. These results indicate that environmental scans focusing on the relevant bio-climatic variables provide an effective means for uncovering variants in wild relatives that can improve wheat adaptation to water-limiting conditions and increase its yield potential.
Conclusions
Our study shows that whole genome sequencing of diverse collections of wild relatives integrated with environmental scans is an effective strategy for prioritizing wild relatives from germplasm banks for introgression into wheat. Continued reduction in the cost of genome sequencing and availability of reference genomes for the increasing number of wild relative species makes this strategy an attractive option for even large genebank collections including tens of thousands of lines [6, 66]. These approaches could quickly help to detect accessions enriched for alleles providing adaptation to target environments [64, 67]. As it was shown in our study, genome-wide introgression of prioritized diversity into adapted germplasm followed by fast high-throughput phenotyping using UAS-based imaging platforms could help to quickly identify promising germplasm for improving the adaptive potential of wheat. Expansion of these efforts from direct ancestors of bread wheat to include more distant Aegilops and Triticum species have potential to further broaden adaptive diversity accessible to wheat breeders for climate-proofing food production systems.
Methods
Plant materials
A diverse set of geo-referenced Ae. tauschii accessions collected over a geographic range of species distribution and representing locations with diverse historic climatic and bioclimatic characteristics was acquired from the USDA NSGC to identify the CAAs. This public germplasm collection had 137 accessions with precise GPS coordinates (Additional File 1: Table S1). A subset of 21 geographically diverse accessions (Additional File 1: Table S1) was selected from this population and crossed with hard red winter wheat varieties to generate Ae. tauschii-wheat amphiploids. The amphiploids were then crossed with six hard red winter wheat cultivars (KanMark, KS061406LN-26, KS061862M-17, KS061278M-4, Larry, Danby) adapted to grow in the US Great Plains to develop Ae. tauschii-wheat ILs [27]. A total of 351 BC1F3:5 introgression lines that had phenology similar to that of the recurrent parents were used to perform multi-year field trials and to investigate the effects of introgressed CAAs on the wheat performance traits (see below).
Genotyping and imputation
DNA was extracted from 2-week old seedling leaf tissues of the diverse Ae. tauschii accessions and the derived introgression population using DNeasy 96 Plant DNA extraction kit (Qiagen) following the manufacturer’s protocol. The quality and concentration of the DNA was assessed using PicoGreen dsDNA assay kit (Life Technologies). The extracted DNA was normalized to 400 ng (20 µl of 20 ng/µl) using the Qiagility robot (Qiagen). Genotyping was performed using GBS approach, which included a library size selection step performed using the Pippin Prep system (Sage Scientific) to enrich the library for 270–330 bp fragments, as described in Saintenac et al. [68, 69]. The prepared libraries were sequenced on Illumina NextSeq 500. Further, whole-genome sequencing was performed on a subset of 21 Ae. tauschii accessions and the six recurrent hexaploid wheat lines. PCR-free genomic libraries were constructed using Illumina protocol at the Integrated Genomic Facility (IGF) at Kansas State University. Paired-end sequences (2 × 150 bp) were generated using NovaSeq at Kansas University Medical Center and NextSeq 500 at IGF. Reads were mapped to the Chinese Spring IWGSC RefSeq v1.0 (IWGSC 2018) using the BWA aligner [70]. Variant calling was performed using the TASSEL v5.0 GBS v2 pipeline [71]. The genotyping datasets were combined and missing and ungenotyped SNPs in the Ae. tauschii diversity panel and the introgression population were imputed from the parental genotypes using Beagle v5.0 [72]. After imputation and filtering out SNPs with genotype probability below 0.7, we retained 6,365,631 SNPs in Ae. tauschii diversity panel and 5,208,054 SNPs in the introgression population.
Population structure and variance partitioning of SNP diversity in Ae. tauschii
To understand the level of genetic diversity within the Ae. tauschii population and how both geography and climate shaped the SNP variation in the population, we pruned the 6.3 million SNPs based on linkage disequilibrium (LD) using PLINK v1.9 and retained 109,627 SNPs that had r2 < 0.5 in a 50-kb sliding window with step size of 5 kb. The proportion of ancestry shared between accessions was estimated from the LD pruned SNPs and the geographical coordinates for the accessions’ collection sites using the tess3r R package [73]. The maximum number of ancestral populations tested was eight (K = 1:8). Each K was run 10 times for 200 iterations (rep = 10, max.iteration = 200) and the spatial projection of ancestral coefficients was based on least squares method (method = “projected.ls”). The optimal number of ancestral populations selected based on cross-validation scores was K = 4 consistent with the split of Ae. tauschii population into four sub-lineages. A plot showing the population admixture and the spatial distribution of accessions from different subspecies at the sites of sample collection was generated.
The proportion of SNP variance in the Ae. tauschii accessions was partitioned into those explained by geographic distance and climate using the “varpart” function in R package “vegan”. The geographic distances were calculated using the “distVicentyEllipsoid” function in R package “geosphere” using the GPS coordinates from the accessions’ collection sites and the results were presented in a Venn diagram.
Redundancy analysis
The diverse set of Ae. tauschii accessions used in this study came from a wide range of geographical locations with distinct climatic and bioclimatic conditions suggesting that certain genetic factors are involved in local adaptation. Based on this hypothesis, we modeled the relationship between response variables (SNPs) and explanatory variables (climatic and bioclimatic factors, and geographic distance) using the redundancy analysis (RDA) [37, 42] and mixed linear models to identify variants contributing to local adaptation. Variables were ranked to identify those that contribute most to SNP diversity in the Ae. tauschii population. To achieve this, the ordiR2step function was applied on the RDA results with adjusted R2 using the forward selection method from 10,000 permutations. Type 1 error was minimized during the selection of the most important factors contributing to SNP diversity by following the rules proposed by Blanchet et al. [44]. The full RDA model based on all ecogeographic (geographic, climatic, and bioclimatic) variables with the calculated adjusted R2 values was evaluated to determine variables that (1) significantly improved the explained variation of SNP diversity distribution in Ae. tauschii population at alpha 0.05, and (2) whose total adjusted R2 did not exceed the adjusted R2 value of the full model. Based on the aforementioned conditions, the most important variables identified were projected on the first two principal components as a triplot. To illustrate the variation of temperature annual range (BIO7) from the Ae. tauschii accessions’ collection sites, a heatmap was plotted using the “heat_point” function provided in R package “autoimage” and an overview of the variation of the most important variables at the collection sites for the sub-lineages was compared using boxplots. All variables were scaled to range between 0 and 1 by dividing with the highest value within the dataset and then squaring them to eliminate the negatives before generating the boxplots with ggplot2 in R.
Identification of climate associated alleles (CAAs) in Ae. tauschii
To determine the genetic basis of local adaptation in Ae. tauschii, we used both RDA and GWAS to identify SNPs that were significantly associated with ecogeographic variables. We extracted the first three RDA loadings for each SNP from the RDA model described above and transformed them to Z-scores. The mean Z-score was calculated from all SNPs and any SNP with three standard deviations from the mean was considered a candidate CAA SNP. Pearson’s correlation coefficients were used to determine the variable with the strongest association to each SNP, thus each CAA SNP was assigned to only one variable. To capture most CAA, we also performed a GWAS using a compressed mixed linear model in GAPIT [59]. The ecogeographic variables were used as phenotypes. The population structure was accounted for by including PCAs calculated from the marker data as covariates in the model. Multiple test correction was performed using the Benjamini-Hochberg’s method (FDR ≤ 0.05).
Phenotyping of Ae. tauschii introgression population
The population of ILs was phenotyped under field conditions for three seasons between 2018 and 2020 to evaluate the adaptive potential of CAAs introgressed in the winter wheat. In 2018 and 2019, phenotyping was done at Colby (Kansas, USA) under irrigated and non-irrigated conditions. In 2020, phenotyping was done at Ashland (Kansas, USA) under non-irrigated conditions. The experimental layout at all locations followed an augmented design with six recurrent hexaploid wheat parents and three additional winter wheat lines adapted to Kansas weather as controls. Experimental plots were 2.5 m × 0.5 m consisting of three rows separated by 18 cm. During planting, granular 18–46-0 diammonium phosphate (DAP) fertilizer was applied at a rate of 168.1 kg/ha and liquid 28–0-0 urea ammonium nitrate (UAN) was applied at a rate of 67.3 kg/ha in the spring to supply additional nitrogen to the plants. The lateral irrigation system was used to maintain the soil moisture in the irrigated block.
The ILs were phenotyped for yield and the component traits such as spikelet number per spike (SNS), thousand grain weight (TGW), grain width (GW), and grain length (GL). During the growing season, remote sensing data including NDVI and canopy temperature (CT) were collected at multiple time points during growth seasons using unmanned aerial system (UAS) mounted with specific sensors for each data type to evaluate the physiological status and growth trend of the introgression lines. The NDVI imagery data were processed in Agisoft software to generate orthomosaics and digital elevation models (DEM) whereas the thermal data were processed in Pix4D to generate the CT orthomosaics. The raster files generated by Agisoft and Pix4D were imported into QGIS v3.4 software for plot level data extraction. Shape files consisting of rectangular polygons that overlaid each plot in the experimental block were created and the mean pixel values for each color band within the polygon were calculated using raster zonal statistics tools and saved as a comma separated values (csv) file. NDVI was derived from near infrared and red color bands using the following equations:
where R and NIR are the mean pixel values for the red and near infrared color bands.
Heading data were collected in 2020 at Ashland and validated in 2022 at RockyFord, Manhattan, Kansas USA. Heading date was recorded when 50% of spikes fully emerged from the flag leaf. The number of days to heading (DTH) was calculated by subtracting the planting date from the heading date. To understand how much the heading date for the introgression lines varies from the controls, we calculated the mean DTH for the controls and subtracted the DTH for each introgression to generate the deviation in days to heading (DDTH).
Best linear unbiased predictions (BLUPs)
Best linear unbiased predictions for yield and yield component traits were obtained from a mixed linear model implemented in R package “sommer.” Given that the experimental layout followed an augmented design all controls were given a code 1, and the test introgression lines were assigned a 0. Each introgession line was assigned a unique numeric code which was used as a group identifier in all experiments whereas the controls were assigned a 999 regardless of the accession as the group identifier. A mixed linear model was run for different traits. For example, BLUPs for the number of days to heading, used in GWAS analysis, were estimated from the following model:
where DTH is number of days to heading, Loc is the field trial location, Check defined lines whether they are controls or test lines, and Acc is the accessions.
Canopy temperature data were collected over multiple time points (aka flights) in the 2 years. Spatial correction was performed using SpATS implemented in MrBean, a Shiny-based R package [74]. Variance due to genotype and environment were estimated as well as narrow sense heritability. After excluding outliers, BLUEs were predicted for the test lines based on the variance in the controls. All flights and blocks with heritability less than 0.2 were excluded from the linear mixed model. The remaining data were used to estimate genotype BLUPs across flights, treatment blocks, and years. In the model, experimental treatment and flights were considered as fixed effects whereas genotypes were considered as random effects.
Association between CAAs and phenotypic traits in introgression population
The frequency spectra of CAAs derived from Ae. tauschii was estimated for groups of ILs that had trait values falling into the tails of phenotype distributions. Our expectation was that if the CAAs are associated with variation in phenotypic traits in the introgression population, the phenotypic value tails should show CAA frequency spectra distinct from the CAA frequency spectra for the whole population. For this purpose, we ranked the ILs based on trait values and compared the CAA frequency spectrum in the whole population (WP) and the lower and upper 5th percentile of the phenotype distributions. Lines were considered to belong to the lower or upper tail groups if they were ranked as outliers in at least two trials. The frequency spectrum was built using 5675 CAAs in the introgression population by counting alleles in 9 allele frequency bins.
Allele frequency of all CAA sites was calculated for the whole population and for the introgression lines that ranked in the lower and upper 5% tails of phenotype distribution using vcftools. Differentiation in allele frequency was determined by calculating the fold change (FC) in allele frequency between the introgression lines in the tails and the whole population. SNPs with FC ≥ 2 were considered to be strongly differentiated. For some traits, where the allele frequency FC was less than two across all sites, the threshold was adjusted accordingly.
To determine the relationship between recombination rate gradient and allelic differentiation, we split the chromosome arms into three equal parts. The number of differentiated CAAs within each chromosome segment was counted using the “bedmap” function of BEDOPs tools. The redundant CAA sites showing the high levels of LD were removed using PLINK. We retained only those SNPs that had r2 < 0.5 within the 50-kb window with a step size of 5 kb. The total number of differentiated alleles was aggregated for all traits and chromosomes.
To confirm the contribution of CAAs to adaptation traits, a linear regression of yield on CT was performed to determine the proportion of variance in yield explained by the variation in CT. Introgression lines that ranked in the 5th and 95th percentiles of CT distribution were compared for yield performance relative to the recurrent parents. GWAS was performed on the traits phenotyped in the BC1F3:5 A. tauschii-wheat introgression population including CT, heading date, yield, and component traits to determine loci with significant associations. Multiple GWAS models were tested on each trait with varying number of principal components to correct for population structure. GWAS analysis was implemented in GAPIT v3.0. CAAs significantly associated with traits in the introgression population at the FDR value 0.05 were considered adaptive in the winter wheat background.
Data availability
Whole genome sequencing data for Ae. tauschii accessions and six adapted wheat cultivars is available under NCBI Bioproject PRJNA745927 [75]. GBS data is available from NCBI Bioprojects PRJNA637316 [76] and PRJNA637416 [77]. The phenotypic data could be accessed through the T3 database [78] under trial names: 18_AeTa_Colby_Dryland_Yield_Trials, 18_AeTa_Colby_Irrigated_Yield_Trials, 19_AeTa_Colby_Dryland_Yield_Trials, 19_AeTa_Colby_Irrigated_Yield_Trials, 20_AeTa_Colby_Dryland_Yield_Trials, 20_AeTa_Colby_Irrigated_Yield_Trials, 20_AeTa_Manhattan_Yield_Trials.
References
Sohail Q, Inoue T, Tanaka H, Eltayeb AE, Matsuoka Y, Tsujimoto H. Applicability of Aegilops tauschii drought tolerance traits to breeding of hexaploid wheat. Breed Sci. 2011;61:347–57.
Gill BS, Friebe B, Raupp WJ, Wilson DL, Cox TS, Sears RG, et al. Wheat genetics resource center: the first 25 years. Adv Agron. 2006;89:73–136.
Kishii M. An update of recent use of aegilops species in wheat breeding. Front Plant Sci. 2019;10:585.
Araus JL, Ferrio JP, Buxó R, Voltas J. The historical perspective of dryland agriculture: lessons learned from 10,000 years of wheat cultivation. J Exp Bot. 2007;58:131–45.
Exposito-Alonso M, Burbano HA, Bossdorf O, Nielsen R, Weigel D. Natural selection on the Arabidopsis thaliana genome in present and future climates. Nature. 2019;573:126–9.
Bohra A, Kilian B, Sivasankar S, Caccamo M, Mba C, McCouch SR, et al. Reap the crop wild relatives for breeding future crops. Trends Biotechnol. 2022;40:412–31.
Dvorak J, Luo MC, Yang ZL, Zhang HB. The structure of the Aegilops tauschii genepool and the evolution of hexaploid wheat. Theor Appl Genet. 1998;97:657–70.
Nesbitt M, Samuel D. From staple crop to extinction? The archaeology and history of hulled wheats. In: Padulosi S, Hammer K, Heller J, editors. Int Work Hulled Wheats. Rome: Italy International Plant Genetic Resources Institute; 1996;4:41–100.
Tanno K-I, Willcox G. How fast was wild wheat domesticated? Science. 2006;311:1886.
Luo M-C, Yang Z-L, You FM, Kawahara T, Waines JG, Dvorak J. The structure of wild and domesticated emmer wheat populations, gene flow between them, and the site of emmer domestication. Theor Appl Genet. 2007;114:947–59.
Avni R, Nave M, Barad O, Baruch K, Twardziok SO, Gundlach H, et al. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication. Science. 2017;97:93–7.
Kihara H. Discovery of the DD-analyser, one of the ancestors of Triticum vulgare. Agricuture Hortic. 1944;19:889–90.
Ozkan H, Willcox G, Graner A, Salamini F, Kilian B. Geographic distribution and domestication of wild emmer wheat (Triticum dicoccoides). Genet Resour Crop Evol. 2011;58:11–53.
Balfourier F, Bouchet S, Robert S, DeOliveira R, Rimbert H, Kitt J, et al. Worldwide phylogeography and history of wheat genetic diversity. Sci Adv. 2019;5: eaav0536.
Smith O, Momber G, Bates R, Garwood P, Fitch S, Pallen M, et al. Sedimentary DNA from a submerged site reveals wheat in the British Isles 8000 years ago. Science. 2014;347:998–1001.
Shewry PR. Wheat. J Exp Bot. 2009;60:1537–53.
He F, Pasam R, Shi F, Kant S, Keeble-Gagnere G, Kay P, et al. Exome sequencing highlights the role of wild relative introgression in shaping the adaptive landscape of the wheat genome. Nat Genet. 2019;51:896–904.
Zhao X, Guo Y, Kang L, Yin C, Bi A, Xu D, et al. Population genomics unravels the Holocene history of bread wheat and its relatives. Nat Plants. 2023;9:403–19.
Tack J, Barkley A, Nalley LL. Effect of warming temperatures on US wheat yields. Proc Natl Acad Sci. 2015;112:6931–6.
Schlenker W, Roberts MJ. Nonlinear temperature effects indicate severe damages to U.S. crop yields under climate change. Proc Natl Acad Sci U S A. 2009;106:15594–8.
Zabel F, Müller C, Elliott J, Minoli S, Jägermeyr J, Schneider JM, et al. Large potential for crop production adaptation depends on available future varieties. Glob Chang Biol. 2021;27:3870–82.
Gaurav K, Arora S, Silva P, Sánchez-Martín J, Horsnell R, Gao L, et al. Population genomic analysis of Aegilops tauschii identifies targets for bread wheat improvement. Nat Biotechnol. 2022;40:422–31.
Wang J, Luo M-C, Chen Z, You FM, Wei Y, Zheng Y, et al. Aegilops tauschii single nucleotide polymorphisms shed light on the origins of wheat D-genome genetic diversity and pinpoint the geographic origin of hexaploid wheat. New Phytol. 2013;198:925–37.
Zhou Y, Zhao X, Li Y, Xu J, Bi A, Kang L, et al. Triticum population sequencing provides insights into wheat adaptation. Nat Genet. 2020;52:1412–22.
Przewieslik-Allen AM, Wilkinson PA, Burridge AJ, Winfield MO, Dai X, Beaumont M, et al. The role of gene flow and chromosomal instability in shaping the bread wheat genome. Nat Plants. 2021;7:172–83.
Sharma S, Schulthess AW, Bassi FM, Badaeva ED, Neumann K, Graner A, et al. Introducing beneficial alleles from plant genetic resources into the wheat germplasm. Biology (Basel). 2021;10:1–36.
Nyine M, Adhikari E, Clinesmith M, Jordan KW, Fritz AK, Akhunov E. Genomic patterns of introgression in interspecific populations created by crossing wheat with its wild relative. G3 Genes Genomes Genet. 2020;10:3651–61.
Jones H, Gosman N, Horsnell R, Rose GA, Everest LA, Bentley AR, et al. Strategy for exploiting exotic germplasm using genetic, morphological, and environmental diversity: the Aegilops tauschii Coss. example. Theor Appl Genet. 2013;126:1793–808.
Pestsova E, Borner A, Roder M. Development of a set of Triticum aestivum- Aegilops tauschii intr ogr ession lines. Hereditas. 2001;135:139–43.
Börner A, Ogbonnaya FC, Röder MS, Rasheed A, Periyannan S, Lagudah ES. Aegilops tauschii Introgressions in Wheat. In: Molnár-Láng M, Ceoloni C, Doležel J, editors. Alien Introgression Wheat Cytogenet Mol Biol Genomics. Cham: Springer International Publishing; 2015;245–71.
Singh N, Wu S, Tiwari V, Sehgal S, Raupp J, Wilson D, et al. Genomic analysis confirms population structure and identifies inter-lineage hybrids in Aegilops tauschii. Front Plant Sci. 2019;10:1–13.
Molero G, Coombes B, Joynson R, Pinto F, Piñera-Chávez FJ, Rivera-Amado C, et al. Exotic alleles contribute to heat tolerance in wheat under field conditions. Commun Biol. 2023;6:6.
Rouse MN, Jin Y. Genetics of resistance to race TTKSK of Puccinia graminis f. sp. tritici in Triticum monococcum. Phytopathology. 2011;101:1418–23.
Brisco EI, Brown LK, Olson EL. Fusarium head blight resistance in Aegilops tauschii. Genet Resour Crop Evol. 2017;64:2049–58.
Frankel OH. Genetic perspectives of germplasm conservation. In: Arber WK, Llimensee K, Peacock WJ, Starlinger P, editors. Genet Manip impact man Soc. Cambridge: Cambridge University Press; 1984;161–70.
Bari A, Street K, Mackay M, Endresen DTF, de Pauw E, Amri A. Focused identification of germplasm strategy (FIGS) detects wheat stem rust resistance linked to environmental variables. Genet Resour Crop Evol. 2012;59:1465–81.
Lasky JR, Upadhyaya HD, Ramu P, Deshpande S, Hash CT, Bonnette J, et al. Genome-environment associations in sorghum landraces predict adaptive traits. Sci Adv. 2015;1:e1400218–e1400218.
Hancock AM, Brachi B, Faure N, Horton MW, Jarymowycz LB, Sperone FG, et al. Adaptation to climate across the Arabidopsis thaliana genome. Science. 2011;334:83–6.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 2010;42:260–3.
Frachon L, Bartoli C, Carrère S, Bouchez O, Chaubet A, Gautier M, et al. A genomic map of climate adaptation in arabidopsis thaliana at a micro-geographic scale. Front Plant Sci. 2018;9:1–15.
Martins K, Gugger PF, Llanderal-Mendoza J, González-Rodríguez A, Fitz-Gibbon ST, Zhao JL, et al. Landscape genomics provides evidence of climate-associated genetic variation in Mexican populations of Quercus rugosa. Evol Appl. 2018;11:1842–58.
van den Wollenberg AL. Redundancy analysis an alternative for canonical correlation analysis. Psychometrika. 1977;42:207–19.
McArdle BH, Anderson MJ. FITTING MULTIVARIATE MODELS TO COMMUNITY DATA: A COMMENT ON DISTANCE-BASED REDUNDANCY ANALYSIS. Ecology. 2001;82:290–7.
Blanchet FG, Legendre P, Borcard D. FORWARD SELECTION OF EXPLANATORY VARIABLES. Ecology. 2008;89:2623–32.
Wallace JG, Zhang X, Beyene Y, Semagn K, Olsen M, Prasanna BM, et al. Genome-wide association for plant height and flowering time across 15 tropical maize populations under managed drought stress and well-watered conditions in Sub-Saharan Africa. Crop Sci. 2016;56:2365–78.
Cingolani P, Platts A, Wang LLL, Coon M, Nguyen T, Wang LLL, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6:80–92.
Nyine M, Adhikari E, Clinesmith M, Aiken R, Betzen B, Wang W, et al. The Haplotype-Based Analysis of Aegilops tauschii Introgression Into Hard Red Winter Wheat and Its Impact on Productivity Traits. Front Plant Sci. 2021;12:1694.
Hill W, Robertson A. The effect of linkage on limits to artificial selection. Genet Res. 1966;8:269–94.
Kuzay S, Xu Y, Zhang J, Katz A, Pearce S, Su Z, et al. Identification of a candidate gene for a QTL for spikelet number per spike on wheat chromosome arm 7AL by high-resolution genetic mapping. Theor Appl Genet. 2019;132: https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s00122-019-03382–5.
Zhang J, Abate S, Eligio G, Joshua B, Tyson H, Arron H, et al. Identification and validation of QTL for grain yield and plant water status under contrasting water treatments in fall - sown spring wheats. Theor Appl Genet. 2018;131:1741–59.
Guo Z, Chen D, Röder MS, Ganal MW, Schnurbusch T. Genetic dissection of pre-anthesis sub-phase durations during the reproductive spike development of wheat. Plant J. 2018;95:909–18.
Still CJ, Rastogi B, Page GFM, Griffith DM, Sibley A, Schulze M, et al. Imaging canopy temperature: shedding (thermal) light on ecosystem processes. New Phytol. 2021;230:1746–53.
Kumar M, Govindasamy V, Rane J, Singh AK, Choudhary RL, Raina SK, et al. Canopy temperature depression (CTD) and canopy greenness associated with variation in seed yield of soybean genotypes grown in semi-arid environment. South African J Bot. 2017;113:230–8.
Pinto RS, Reynolds MP. Common genetic basis for canopy temperature depression under heat and drought stress associated with optimized root distribution in bread wheat. Theor Appl Genet. 2015;128:575–85.
Pinter PJ, Zipoli G, Reginato RJ, Jackson RD, Idso SB, Hohman JP. Canopy temperature as an indicator of differential water use and yield performance among wheat cultivars. Agric Water Manag. 1990;18:35–48.
Amani I, Fischer RA, Reynolds MP. Canopy Temperature Depression Association with Yield of Irrigated Spring Wheat Cultivars in a Hot Climate. J Agron Crop Sci. 1996;176:119–29.
Sultana SR, Ali A, Ahmad A, Mubeen M, Zia-Ul-Haq M, Ahmad S, et al. Normalized difference vegetation index as a tool for wheat yield estimation: A case study from Faisalabad, Pakistan. Sci World J. 2014;2014:1–8.
Wardlaw IF, Dawson IA, Munibi P, Fewster R. The tolerance of wheat to high temperatures during reproductive growth. I. Survey procedures and general response patterns. Aust J Agric Res. 1989;40:1–13.
Lipka AE, Tian F, Wang Q, Peiffer J, Li M, Bradbury PJ, et al. GAPIT: genome association and prediction integrated tool. Bioinformatics. 2012;28:2397–9.
Wang J, Zhang Z. GAPIT Version 3: Boosting Power and Accuracy for Genomic Association and Prediction. Genomics, Proteomics Bioinforma. 2021;19:629–40.
Lasky JR, Des Marais DL, McKay JK, Richards JH, Juenger TE, Keitt TH. Characterizing genomic variation of Arabidopsis thaliana: the roles of geography and climate. Mol Ecol. 2012;21:5512–29.
Cavalet-Giorsa E, González-Muñoz A, Athiyannan N, Holden S, Salhi A, Gardener C, et al. Origin and evolution of the bread wheat D genome. Nature. 2024;633:848–55.
Rosyara U, Kishii M, Payne T, Sansaloni CP, Singh RP, Braun HJ, et al. Genetic Contribution of Synthetic Hexaploid Wheat to CIMMYT’s Spring Bread Wheat Breeding Germplasm. Sci Rep. 2019;9:1–11.
Anderson JT, Song BH. Plant adaptation to climate change—Where are we? J Syst Evol. 2020;58:533–45.
Jackson RD, Idso SB, Reginato RJ, Pinter PJ. Canopy temperature as a crop water stress indicator. Water Resour Res. 1981;17:1133–8.
Mascher M, Schreiber M, Scholz U, Graner A, Reif JC, Stein N. Genebank genomics bridges the gap between the conservation of crop diversity and plant breeding. Nat Genet. 2019;51:1076–81.
Brunazzi A, Scaglione D, Talini RF, Miculan M, Magni F, Poland J, et al. Molecular diversity and landscape genomics of the crop wild relative Triticum urartu across the Fertile Crescent. Plant J. 2018;94:670–84.
Saintenac C, Jiang D, Wang S, Akhunov E. Sequence-based mapping of the polyploid wheat genome. G3 (Bethesda). 2013;3:1105–14.
Poland JA, Brown PJ, Sorrells ME, Jannink J-L. Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach. PLoS ONE. 2012;7: e32253.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
Glaubitz JC, Casstevens TM, Lu F, Harriman J, Elshire RJ, Sun Q, et al. TASSEL-GBS: A High Capacity Genotyping by Sequencing Analysis Pipeline. PLoS ONE. 2014;9: e90346.
Browning BL, Browning SR. Improving the accuracy and efficiency of identity-by-descent detection in population data. Genetics. 2013;194:459–71.
Caye K, Deist TM, Martins H, Michel O, François O. TESS3: Fast inference of spatial population structure and genome scans for selection. Mol Ecol Resour. 2016;16:540–8.
Aparicio J, Gezan SA, Ariza-Suarez D, Raatz B, Diaz S, Heilman-Morales A, et al. Mr.Bean: a comprehensive statistical and visualization application for modeling agricultural field trials data. Front. Plant Sci. 2023;14:1–13.
Nyine M, Adhikari E, Clinesmith M, Aiken R, Betzen B, Wang W, Davidson D, Yu Z, Guo Y, He F, Akhunova A, Jordan KW, Fritz AK, Akhunov E. Whole genome sequence reads for Ae. tauschii introgression population parents. 2021. https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA745927.
Nyine M, Adhikari E, Clinesmith M, A. Fritz, Akhunova A, Akhunov E. Ae. tauschii introgression patterns in hexaploid wheat. 2020. https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA637316.
Nyine M, Adhikari E, Clinesmith M, A. Fritz, Akhunova A, Akhunov E. Sequencing of eco-geographically diverse Ae. tauschii accessions. 2020. https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA637416.
Nyine M, Davidson D, Adhikari E, Clinesmith M, Wang H, Akhunova A, Fritz A, Akhunov E. T3/Wheat Database. 2024. https://wheat.triticeaetoolbox.org/
Funding
This research was supported by the Agriculture and Food Research Initiative Competitive Grants 2022–68013-36439 (WheatCAP) and 2020–68013-30905 (USDA NIFA WWBI Hub) from the USDA National Institute of Food and Agriculture and Bill and Melinda Gates Foundation grant INV-004430.
Author information
Authors and Affiliations
Contributions
MN—data collection, analysis and interpretation, manuscript writing; DD—field data collection, UAS based phenotyping; EAd—population development, data analysis and field data collection; MC—population development and field data collection; HW—UAS imaging data analysis; AA—next-generation sequencing; AF—population development, data collection, experimental design; EAk—data analysis and interpretation, conceptualization, conceiving idea, manuscript writing. All authors read and approved the final manuscript.
Peer review information
Wenjing She was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team. The peer-review history is available in the online version of this article.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
Eduard Akhunov is a Guest Editor for Genomics for crop improvement collection of Genome Biology but was not involved in the editorial process of this manuscript. Elina Adhikari is an employee at Bayer (Chesterfield, USA), and Marshall Clinesmith is an employee at Syngenta, (Junction City, USA).
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
13059_2025_3500_MOESM1_ESM.csv
Additional File 1: Table S1. Average historical climatic and bioclimatic data based on the geographic origin of the diverse Ae. tauschii.
13059_2025_3500_MOESM2_ESM.docx
Additional File 2: Figs S1-S4. Fig. S1. The effect of geographic, climatic and bioclimatic variables on the SNP diversity in Ae. tauschii population. Fig. S2. Relationship between the number of days to heading (DTH) and yield, and the genomic loci associated with days to heading of Ae. tauschii introgression lines. Fig. S3. Manhattan plots showing CAAs on chromosomes 1D, 2D, 6D and 7D significantly associated with yield and yield component traits in the Ae. tauschii introgression population. Fig. S4. Importance of canopy temperature and yield component traits for yield prediction based on random forest model in Ae. tauschii-wheat introgression population.
13059_2025_3500_MOESM3_ESM.docx
Additional File 3: Table S2. Number of climate associated SNPs for ecogeographic variables discovered through redundancy analysis and genome-wide association analysis in Ae. tauschii population.
13059_2025_3500_MOESM4_ESM.xlsx
Additional File 4: Table S3. SNPs showing the highest correlation with ecogeographic variables in Ae. tauschii population.
13059_2025_3500_MOESM5_ESM.xlsx
Additional File 5: Table S4. Effect of SNPs that are significantly associated with ecogeographic variables in Ae. tauschii population based on genome-wide association analysis.
13059_2025_3500_MOESM6_ESM.xlsx
Additional File 6: Table S5. SNPs identified by redundancy analysis (RDA) and/or genome-wide association (GWAS) to be involved in Ae. tauschii ecogeographic adaptation.
13059_2025_3500_MOESM7_ESM.xlsx
Additional File 7: Table S6. High confidence climate associated SNPs identified by both redundancy analysis (RDA) and genome-wide association analysis (GWAS) in Ae. tauschii population.
13059_2025_3500_MOESM8_ESM.docx
Additional File 8: Table S7. SNPs significantly associated with days to heading, yield and component traits in Ae. tasuchii introgression population.
13059_2025_3500_MOESM9_ESM.xlsx
Additional File 9: Table S8. SNPs significantly associated with canopy temperature in Ae. tauschii introgression population.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Nyine, M., Davidson, D., Adhikari, E. et al. Genomic signals of ecogeographic adaptation in a wild relative are associated with improved wheat performance under drought stress. Genome Biol 26, 35 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13059-025-03500-1
Received:
Accepted:
Published:
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13059-025-03500-1