- Research
- Open access
- Published:
Cohesin distribution alone predicts chromatin organization in yeast via conserved-current loop extrusion
Genome Biology volume 25, Article number: 293 (2024)
Abstract
Background
Inhomogeneous patterns of chromatin-chromatin contacts within 10–100-kb-sized regions of the genome are a generic feature of chromatin spatial organization. These features, termed topologically associating domains (TADs), have led to the loop extrusion factor (LEF) model. Currently, our ability to model TADs relies on the observation that in vertebrates TAD boundaries are correlated with DNA sequences that bind CTCF, which therefore is inferred to block loop extrusion. However, although TADs feature prominently in their Hi-C maps, non-vertebrate eukaryotes either do not express CTCF or show few TAD boundaries that correlate with CTCF sites. In all of these organisms, the counterparts of CTCF remain unknown, frustrating comparisons between Hi-C data and simulations.
Results
To extend the LEF model across the tree of life, here, we propose the conserved-current loop extrusion (CCLE) model that interprets loop-extruding cohesin as a nearly conserved probability current. From cohesin ChIP-seq data alone, we derive a position-dependent loop extrusion rate, allowing for a modified paradigm for loop extrusion, that goes beyond solely localized barriers to also include loop extrusion rates that vary continuously. We show that CCLE accurately predicts the TAD-scale Hi-C maps of interphase Schizosaccharomyces pombe, as well as those of meiotic and mitotic Saccharomyces cerevisiae, demonstrating its utility in organisms lacking CTCF.
Conclusions
The success of CCLE in yeasts suggests that loop extrusion by cohesin is indeed the primary mechanism underlying TADs in these systems. CCLE allows us to obtain loop extrusion parameters such as the LEF density and processivity, which compare well to independent estimates.
Background
Our knowledge of chromatin architecture has been transformed by sequencing-based chromatin-capture (Hi-C) techniques, which provide quantitative metrics of relative population-averaged contact probability between all pairs of genomic loci [1,2,3,4,5,6,7,8,9,10]. Hi-C data, typically presented as a contact map, together with theory and modeling have led to a new understanding of chromatin organization, based on three coexisting mechanisms that operate on largely different length scales: (1) on several-megabase scales, “checkerboard patterns” in Hi-C maps, encompassing distant contacts on the same chromosome and contacts on different chromosomes, have led to a block co-polymer-inspired picture of chromatin compartments, comprised of several types of epigenetically distinguished heterochromatin and euchromatin, which each exhibits preferential affinity for like-regions [11,12,13,14,15,16,17,18,19,20,21,22,23,24]. (2) At smaller length scales―tens of kilobases―overlapping squares of high contact probability within definite regions of the same chromosome, termed topologically associating domains (TADs), have led to the loop extrusion factor (LEF) model [25,26,27,28,29,30,31,32,33], in which LEFs first bind to the chromatin polymer, and then initiate ATP-dependent loop extrusion by moving their two anchor points away from each other. Loop extrusion at an anchor stalls when the anchor encounters either another anchor or a so-called boundary element (BE). LEFs can also dissociate from chromatin. These processes collectively give rise to a dynamic steady-state of chromatin loops, in turn leading to TADs in Hi-C maps [26, 27, 30, 34]. (3) At few kilobase scales, high-resolution Hi-C experiments reveal patterns of contacts [35, 36] that can be explained based on nucleosomal structures [37,38,39].
Depletion of cohesin, a member of the structural maintenance of chromosomes (SMC) complex family, leads to the disappearance both of TADs and of their accompanying enhanced contact probabilities across a variety of species, including human [40, 41], fission yeast [6], and budding yeast [42], thus identifying cohesin as the predominant LEF. Bolstering cohesin’s LEF identity, single-molecule experiments show that cohesin possesses ATP-dependent loop extrusion activity on DNA in vitro [43,44,45,46]. In vertebrates, the locations of TAD boundaries show a strong correlation with binding sites of the DNA-binding protein, CTCF [47, 48], while depletion of CTCF, causes loss of most TAD boundaries [41, 47]. These observations suggest that CTCF is the most important BE in vertebrates. Indeed, loop extrusion simulations using boundary elements, whose locations are defined by peaks in the CTCF chromatin immunoprecipitation sequencing (ChIP-seq) signal, are able to recapitulate many aspects of experimental vertebrate Hi-C maps [26, 27, 30, 34].
However, although TADs feature prominently in Hi-C maps across the tree of life, many non-vertebrate organisms either do not express CTCF orthologs, including yeasts (Schizosaccharomyces pombe and Saccharomyces cerevisiae [49]), plants (Arabidopsis thaliana [50] and Oryza sativa [51]), and Caenorhabditis elegans [52], or show only a limited number of TAD boundaries that correlate with CTCF binding sites, as in the case of Drosophila melanogaster [53]. In all of these organisms, even if the LEF model is applicable, which remains uncertain, the identities of the boundary elements are unknown, frustrating quantitative comparisons between Hi-C data and simulations.
With the goal of modeling TADs across the tree of life beyond vertebrates, here, we introduce a novel, physics-based version of the LEF model, named the conserved-current loop extrusion (CCLE) model, that should be applicable in any organism, in which loop extrusion is a major driver of TAD formation. Specifically, by interpreting loop-extruding LEFs as a probability current, that is approximately conserved at steady-state, we derive a position-dependent loop extrusion rate, using cohesin ChIP-seq data as input, which we then incorporate into loop extrusion simulations without explicit boundary elements. This model has intuitive appeal in that loop extrusion rates are small at genomic locations with high cohesin ChIP-seq signal, as if the LEFs are blocked there, while the rates are high at positions with low cohesin ChIP-seq signal, because LEFs spend little time in locations where they are not blocked. By design, CCLE is agnostic concerning the identities of BEs and other proteins that interact with cohesin. Indeed, CCLE allows for the boundary element concept to be extended, beyond localized barriers to loop extrusion, to include more widely distributed variations in the loop extrusion rate, that may occur in response to chromatin composition, for example.
A key inspiration for the development of CCLE was Reference [54], which successfully simulates chromatin organization in meiotic S. cerevisiae, using a version of the previous vertebrate-focused LEF models, except with cohesin-binding sites replacing CTCF binding sites. CCLE improves on this approach in principle, by eliminating both (1) the need to specify how to pick out cohesin-binding sites from ChIP-seq data and (2) the need to specify how such binding sites affect loop extrusion, both of which are accomplished in an ad hoc fashion in Ref. [54] (and its CTCF-based precursors [26, 27, 30, 34] regarding CTCF binding sites). Limitation (1) ignores the low-contrast cohesin peaks that could represent a weak yet essential blocking effect by barriers to loop extrusion due to more elaborate chromatin composition, while limitation (2) hinders the model’s ability to self-consistently reproduce the cohesin ChIP-seq data, thus rendering the model a weak physical basis. The CCLE model eliminates both limitations in principle.
To focus on the role of loop extrusion and avoid possible ambiguities associated with chromatin compartments, in this paper, we apply our model to meiotic budding yeast, following Ref. [54], mitotic budding yeast, and interphase fission yeast. None of the corresponding Hi-C maps shows a checkerboard pattern, characteristic of chromatin compartments. Using cohesin ChIP-seq data as input, CCLE seeks to describe the measured Hi-C maps of S. pombe quantitatively, with just four fitting parameters, namely the LEF processivity in the absence of obstructions, the chromatin persistence length, the linear density of loop-extruding cohesins (\(\rho\)), and the linear density of cohesive cohesins (\(\rho _c\)). For meiotic and mitotic S. cerevisiae, we use a fifth parameter to empirically account for the increased polymer volume exclusion in the more compact meiotic and mitotic chromosomes. Using this approach, we demonstrate that CCLE achieves excellent experiment-simulation agreement in all three cases on the 10–100-kb scales, despite major differences in their Hi-C features. Thus, CCLE transforms cohesin ChIP-seq data into an ensemble of fluctuating loop configurations that define three-dimensional chromosomal organization. Since CCLE does not incorporate genomic data on nucleosome positioning, it does not describe the kilobase-scale features in high-resolution Hi-C maps that originate from nucleosomes [37,38,39]. CCLE also provides corresponding values for loop extrusion parameters, such as the LEF density and processivity, in each case.
Results
Conserved-current loop extrusion (CCLE) model enables calculation of chromatin loop configurations from genomic distribution of LEF
In order to develop our approach, we envision loop extrusion as giving rise to probability currents of LEF anchors, flowing through chromatin lattice sites. Assuming two-sided loop extrusion, we are led to the following coarse-grained master equations for the probabilities, \(R_n\) and \(L_n\), that chromatin site n is occupied by the right-moving or left-moving anchor of a LEF:
and
where \(V_n\) is the rate at which right-moving LEF anchors step from site n to site \(n+1\), \(U_n\) is the rate at which left-moving LEF anchors step from site n to site \(n-1\), and \(P_n = R_n + L_n\) is the probability that site n is occupied by either a left- or right-moving LEF anchor. The first and second terms on the right hand sides of each of Eqs. 1 and 2 correspond to the current―i.e., the number per second―of LEF anchors to and from, respectively, site n by loop extrusion along the chromatin, while \(A_n\) and \(D_n\) (\(a_n\) and \(d_n\)) are the association and dissociation currents of right-moving (left-moving) LEF anchors at site n, respectively.
At steady state, assuming the difference between LEF binding and unbinding terms is small compared to the loop-extruding terms, and that the mean probabilities of right- and left-moving LEF anchors being at site n are equal (i.e., \(\left\langle R_n \right\rangle = \left\langle L_n \right\rangle = \frac{1}{2} \left\langle P_n \right\rangle\)), and neglecting correlations among the anchor probabilities, Eqs. 1 and 2 lead to (Additional File 1: Methods)
and
where \(\rho\) is the mean LEF density in units of kb\(^{-1}\), L is the mean LEF processivity in units of kb, and \(\tau\) is the mean LEF lifetime, estimated to be of order \(10^3\) s in Ref. [31]. The ratio, \(L/\tau\), can be interpreted as the loop extrusion rate of an isolated LEF along the chromatin polymer, i.e., twice the extrusion rate of an isolated LEF anchor.
The utility of Eqs. 3 and 4 becomes apparent, when we realize that cohesin ChIP-seq data specifies the n-dependence of \(\left\langle P_n \right\rangle\), assuming that cohesin is the LEF in question. Thus, we can use cohesin ChIP-seq to calculate the position-dependent loop extrusion rates. These rates are subsequently used as input for Gillespie-type loop extrusion simulations (the “Methods” section) that implement loop extrusion, LEF association and dissociation, and mutual LEF blocking, as for the existing LEF models, but for which there are no explicit boundary elements. These simulations generate time-dependent loop configurations, that we average over time and over multiple independent simulations to calculate contact probability maps (the “Methods” section). Our approach neatly sidesteps needing to know the identities and positions of BEs by exploiting the fact that the effect of BEs is encoded in \(\left\langle P_n \right\rangle\) and, therefore, is incorporated into the loop extrusion rates via Eqs. 3 and 4.
Functionally, there are two populations of cohesin complexes, namely trans-acting, cohesive cohesins, which give rise to cohesion between sister chromatids, and cis-acting, loop-extruding cohesins [32]. For S. pombe, both types are present in interphase and contribute to cohesin ChIP-seq. To determine \(P_n\), which is the probability that chromatin site n is occupied by loop-extruding cohesin, we assume that cohesive cohesin is randomly loaded along chromatin, giving rise to an n-independent contribution to the cohesin ChIP-seq, which we describe with a fitting parameter, \(\rho _{c}\), that represents the uniform density of cohesive cohesin.
Finally, to compare experimental and simulated Hi-C maps, we incorporate the polymer physics of self-contacts, within the confined volume of the nucleus, into simulated loop configurations, using a simple, albeit approximate, analytic approach, described in the “Methods” section.
CCLE quantitatively describes TADs and loop configurations in interphase S. pombe
CCLE simulations accurately reproduce experimental interphase S. pombe Hi-C maps
As a first application of CCLE, we sought to describe the interphase fission yeast Hi-C map from Ref. [55], on the basis of ChIP-seq data of the protein, Psc3, which is a component of the cohesin core complex, also from Ref. [55]. The right-hand side of Fig. 1A depicts a 1-Mb portion of the Hi-C map of S. pombe’s Chr 2 starting 0.3 Mb from the end of the left arm and extending to 1.3 Mb from the end of the left arm, using a logarithmic false-color intensity scale. The maximum interaction distance shown is 120 kb. Each pixel in the experimental Hi-C maps corresponds to 10 kb. The original resolution of the simulation is 1 kb, which is binned to 10 kb to match the experimental resolution. In comparison, the left-hand side of Fig. 1A presents the corresponding conserved-current simulated Hi-C map using the best-fit parameters (Table 2). Figure 1B magnifies the experiment-simulation comparison for three representative sub-regions, each 150 kb in size. In both Fig. 1A and B, visual inspection immediately reveals a high degree of left-right symmetry, corresponding to excellent agreement between the experimental and simulated Hi-C maps. Clearly, our simulations well reproduce the experimentally observed pattern of overlapping squares. By contrast, although the left- and right-hand sides of Fig. 1C and D look generally similar, clearly there is no left-right symmetry in these figures, which compare the experimental Hi-C maps of two non-overlapping regions of Chr 2 with each other, and which therefore are expected to be dissimilar. In comparison to other published comparisons between Hi-C experiments and simulations at the TAD scale [15, 21, 23, 26, 27, 40, 54], by eye, we judge the agreement displayed in Fig. 1A and B to be comparable or superior. To quantitatively compare the simulated and experimental contact maps, we examined the ratio of each pair of compared Hi-C maps in logarithmic scale, plotted as Fig. 1E and F, which further illustrate a good agreement between experiment and simulation of the same region and a mismatch between non-overlapping regions. Each pixel in the ratio maps shown in Fig. 1E and F is the ratio between the corresponding pixels, \(E_n\) and \(S_n\), whichever is larger, from the two compared Hi-C maps, respectively, i.e., \(e^{|\log (E_n/S_n)|}\). Therefore, all pixels in the ratio maps are always greater or equal to 1. Overall, Fig. 1E is lightly shaded, indicating relatively small discrepancies overall between the simulation and experiment of Fig. 1A and B. By contrast, Fig. 1F contains many more darker pixels, indicating relatively large differences between the experimental Hi-C maps of different genomic regions.
Conserved-current loop extrusion (CCLE) model recapitulates TAD-scale chromatin organization in interphase S. pombe using cohesin ChIP-seq data. A Comparison between Hi-C map of 1 Mb region, generated by the CCLE model (using cohesin ChIP-seq data from Ref. [55]), and the experimental Hi-C map [55] of the same region. B, Magnified comparisons of experimental and simulated Hi-C from panel A for three representative sub-regions, each 150 kb in size, located at 0.39–0.54 Mb, 0.83–0.98 Mb, and 1.04–1.19 Mb, from top to bottom. C Magnified comparisons of two experimental Hi-C from panel D. D Comparison between two experimental Hi-C maps [55] from two different regions. All Hi-C maps show interactions up to a genomic separation of 120 kb. E Contact probability ratio maps between the simulated and experimental Hi-C shown in panel A. Each pixel represents the ratio of contact probabilities of the corresponding pixels of the simulated and experimental Hi-C shown in panel A. F Contact probability ratio maps between the two experimental Hi-C maps shown in panel D. Pixels with darker shades indicate a poorer agreement than pixels with lighter shades. All maps are displayed in log-scale. The simulated Hi-C map is generated by the CCLE model using the best-fit parameters given in Table 2. The optimization process is discussed in Additional File 1: Methods
Overall, we have simulated Hi-C maps for a total of five 1.2-Mb-sized genomic regions (only the middle 1 Mb regions are displayed to avoid any possible distortions from the simulation boundaries), one from each chromosome arm that exceeds 1.2 Mb in length (the left arm of chromosome 3 was omitted, since it is 1.1 Mb long). These five regions together span more than one-third of the entire fission yeast genome. In every case, we achieve good agreement between the experimental and simulated Hi-C maps (Fig. 1 and Additional File 1: Figs. S1–4).
Figures S5–9 (Additional File 1) also present the comparisons between the CCLE-predicted Hi-C maps and the newer Micro-C data for S. pombe from Hsieh et al. [56], both binned at a higher resolution of 2 kb. Since there are a reduced number of counts in each genomic pixel at 2 kb-resolution, the experimental contact map tends to be relatively noisy, which explains the relatively high MPR and low PCC values (given in the figure captions) compared to the Hi-C comparisons displayed at 10 kb-resolution. Nevertheless, the comparisons at 2 kb-resolution reveal clear left-right symmetry as well, indicating that CCLE also accurately describes the experimental contact maps at this higher resolution. Also included in Figures S5–9 (Additional File 1) are comparisons between the original Hi-C maps of Mizuguchi et al. and the newer maps of Hsieh et al., binned to 10 kb-resolution. Inspection of these comparisons does not reveal any major new features in the interphase fission yeast Hi-C map on the 10–100 kb scale that were not already apparent from the Hi-C maps of Mizuguchi et al.
An additional, commonly applied way to compare Hi-C experiments and simulations is to examine the mean contact probability, P(s), for loci with genomic separation, s. Figure 2 displays five P(s)-versus-s curves, each corresponding to one of the five regions simulated. Experimental and simulated mean contact probabilities are shown as the open circles and solid lines, respectively. Evidently, the experimental mean contact probabilities are very similar for different genomic regions, except for large separation contacts in the 1.2–2.4 Mb region of Chr 3, whose probability exceeds that of the other regions by about 15%. For every region, the simulated mean contact probability agrees well with its experimental counterpart, throughout the range of genomic separations studied (20 kb–500 kb), bolstering a posteriori the simple, analytic approach to polymer self-contacts, described in the “Methods” section.
CCLE reproduces P(s) curves of experimental Hi-C contact maps across the entire S. pombe genome. Chromatin mean contact probabilities, P(s), as a function of genomic separation, s, are plotted in circles for five 1.2 Mb genomic regions, using experimental Hi-C contact map data from ref. [55]. The corresponding P(s) curves of CCLE-simulated Hi-C contact maps are plotted as lines. Different colors represent different regions: yellow, 0.5–1.7 Mb of Chr 1; purple, 4.2–5.4 Mb of Chr 1; red, 0.2–1.4 Mb of Chr 2; green, 1.8–3.0 Mb of Chr 2; cyan, 1.2–2.4 Mb of Chr 3. The vertical gray dashed line indicates the maximum genomic separation displayed in the Hi-C comparison maps and ratio maps shown in Fig. 1
To achieve the good agreement between simulation and experiment, evident in Figs. 1A, B and 2, we chose to quantify and minimize discrepancies between experimental and simulated Hi-C maps, similarly to Refs. [27, 54], using an objective function, that can be interpreted as the mean pairwise ratio (MPR) of experimental and simulated pixels, whichever is larger, averaged over all pixels:
where \(E_n\) and \(S_n\) are the experimental and simulated contact probabilities, respectively, for pixel n, and N is the total number of pixels in the map of Fig. 1A. Note that the MPR value of the two compared maps is exactly the mean of all pixel values of their ratio map. A value of MPR closer to unity indicates better agreement. The MPR between the simulated and experimental Hi-C maps of Fig. 1A is 1.0911 (Table 1), while for the experimental Hi-C maps from two different regions (Fig. 1D), it is 1.1965 (Additional File 1: Table S1), showing a deviation from unity more than twice as large.
In addition to the MPR, we also calculated the P(s)-scaled Pearson correlation coefficient (PCC) for pairs of Hi-C maps (Table 1 and Additional File 1: Table S1). The PCC for the simulated and experimental Hi-C maps of Fig. 1A is 0.6728, indicating a high correlation between the two maps. In contrast, the PCC for the experimental Hi-C maps shown in Fig. 1D, corresponding to non-overlapping genomic regions, is −0.0443 (Additional File 1: Table S1), indicating no correlation, as expected for the comparison between two different regions. For all five regions studied, the MPR scores and PCCs correspond to distinctly smaller differences overall and significantly greater correlation, respectively, between experimental and simulated Hi-C maps (Table 1) than between different experimental Hi-C maps (Additional File 1: Table S1), demonstrating that the CCLE model is able to predict and reproduce TAD patterns in S. pombe, based solely on cohesin ChIP-seq data.
To obtain the best agreement between simulation and experiment, we optimized the four model parameters, namely, LEF density (\(\rho\)), LEF processivity (L), cohesive cohesin density (\(\rho _c\)), and chromatin persistence length, by minimizing Eq. 5, as described in more detail in Additional File 1: Methods. For each simulated region, the model parameters were optimized independently to their best-fit values, which are presented in Table 2. The optimized values of the LEF density and the LEF processivity are similar and have small errors, suggesting that the best-fit values of these parameters are robust and that cohesin-driven loop extrusion processes are essentially uniform across the S. pombe genome, at least at a resolution of 10 kb. The best-fit LEF density of 0.033 kb\(^{-1}\) corresponds to a cellular copy number of about 400 loop-extruding cohesin complexes per cell. If we also take the best-fit “cohesive cohesin density” parameter at face value, then there are approximately an additional 400 cohesive cohesins per cell, that is, 800 cohesins in total, which may be compared to the copy numbers from Ref. [57] of the constituents of the cohesin core complex, namely Psc3, Psm1, Psm3, and Rad21, of 723, 664, 1280, 173, respectively (mean 710). The best-fit values of the chromatin persistence length are also similar for different genomic regions, reflecting the similar behavior of the mean contact probability, P(s), versus genomic separation, s, for different genomic regions (Fig. 2). The best-fit values of the chromatin persistence length, which lie in the 75–95 nm range, may be compared to the value of 70 nm, reported in Ref. [58]. The relatively larger errors in the best-fit values of persistence length reflect the fact that our polymer model is only sensitive to the persistence length for large genomic separations (Additional File 1: Methods, Eq. S36 and Fig. S10). The best-fit values of the cohesive cohesin density show a greater variation for different genomic regions than the other parameters, hinting that the level of sister chromatid cohesion may vary across the genome. Alternatively, however, the apparent variation in this parameter could reflect different Psc3 ChIP-seq background levels or shifts for different regions (the “Methods” section).
In addition to cohesin, we also tested whether condensin, which is also a member of the SMC complex family shown to extrude DNA loops in vitro [59], could be another LEF that determines chromatin spatial configurations in S. pombe. However, when we carry out CCLE simulations using the ChIP-seq signal of interphase S. pombe condensin [60], the resultant best-fit simulated Hi-C map shows poor agreement with experiment (Additional File 1: Fig. S11 and Table 1), reinforcing that cohesin predominantly determines TAD-scale chromatin organization in interphase S. pombe.
CCLE self-consistently reproduces cohesin ChIP-seq data
In addition to predicting chromatin organization as measured by Hi-C maps and mean contact probability curves, loop extrusion simulations simultaneously yield position-dependent LEF occupancy probabilities. For a typical 200-kb-sized region of S. pombe’s Chr 2, Fig. 3A compares the simulated time- and population-averaged probability that a chromatin lattice site is occupied by a LEF (red curve) to the corresponding experimental ChIP-seq data for Psc3 [55] (blue curve), converted to occupancy probability, as described in the “Methods” section. Evidently, the simulated LEF occupancy probability matches the experimental Psc3 occupancy probability well. Indeed, as shown in Fig. 3B, the cross-correlation of experimental and simulated occupancy probability is nearly 0.7. In both cases, a number of peaks are apparent, extending above background to about twice, or less than twice, background. In the context of CCLE, this relatively weak contrast in occupancy probability gives rise to a corresponding relative lack of contrast in S. pombe’s interphase Hi-C pattern.
CCLE reproduces experimental cohesin occupancy landscape in S. pombe. A Comparison between the experimental cohesin occupancy probability landscape [55] (blue) and the simulated LEF occupancy probability landscape by CCLE (red) in the 450–650-kb region of Chr 2 of interphase S. pombe. The occupancy probability curves are normalized by the corresponding optimized LEF density of 0.033 kb\(^{-1}\). B Cross-correlation between the experimental cohesin occupancy probability landscape and the simulated LEF occupancy probability landscape by CCLE, as a function of relative genomic shift
While unsurprising, given that we set the position-dependent loop extrusion rates using the Psc3 ChIP-seq data, the good agreement between experimental and simulated occupancy probabilities implies that our simulations are self-consistent and that the assumptions, leading to Eqs. 3 and 4, are valid for the parameters that are determined to best-describe the Hi-C contact map. The extent to which our best-fit simulations satisfy the assumptions, underlying CCLE, can be further assessed by examining the simulated distributions of the following dimensionless quantities, all of which should be small when CCLE is applicable:
which specifies the fractional imbalance in the numbers of left- and right-moving LEF anchors at each lattice site;
which is the ratio of the net current of right moving LEF anchors at each site, that violates current conservation, as a result of binding or unbinding, to the mean current of right-moving anchors, that satisfies current conservation; and
which is the corresponding conserved-current violation ratio for left-moving LEF anchors. As shown in Additional File 1: Fig. S12, all three of these quantities are distributed around zero with only small excursions (standard deviations \(\sim 0.1\)), consistent with the assumptions, leading to Eqs. 3 and 4. While loop extrusion in interphase S. pombe seems to well satisfy the assumptions underlying CCLE, this may not always be the case.
Loop configurations in interphase S. pombe
The good agreement between our simulated Hi-C maps and experimental Hi-C maps suggests that the corresponding simulated loop configurations are realistic of loop configurations in live S. pombe. Figure 4A shows three representative simulated loop configurations for a 1.2-Mb region of Chr 2, corresponding to the best-fit parameters. In this figure, as in Ref. [61], each loop is represented as a semicircle connecting the genomic locations of the two LEF anchor points. Because the model does not permit LEF anchors to pass each other, correspondingly semicircles never cross, although they frequently contact each other and nest, as is apparent for the configurations in Fig. 4A. The distributions of loop sizes for all five regions simulated in S. pombe are presented in Fig. 4B. Evidently, the loop size distributions are similar for all five regions with an overall mean and standard deviation of 22.1 kb and 19.5 kb, respectively. The mean loop size may be compared to the number of base pairs within the chromatin persistence length, estimated to comprise 3.5 kb by taking the chromatin linear density to be 50 bp/nm [58]. Thus, typical loops contain several (\(\sim 6\)) persistence lengths of the chromatin polymer. Figure 4C shows the distributions of chromatin backbone segment lengths, i.e., the distributions of lengths of chromatin segments, that lie outside of loops. Again, these distributions are similar for all regions simulated with an overall mean and standard deviation of 28.9 kb and 26.8 kb, respectively, again corresponding to several (\(\sim 8\)) persistence lengths between loops. Since a LEF’s anchors bring the chromosomal loci, bound by the LEF anchors, into spatial proximity, loops lead to a significant linear compaction of the chromatin polymer. Figure 4D shows the distributions of chromatin compaction across an ensemble of loop configurations, defined as the fraction of the chromatin contour length contained within the backbone. These distributions too are similar for all five regions simulated, with overall mean and standard deviation of 0.4161 and 0.0663, implying that the chromatin polymer’s contour length in fission yeast is effectively 2.5-times shorter with loops than without.
Loop configurations and properties. A Snapshots of three representative simulated loop configurations of interphase S. pombe’s 200–1400 kb region of Chr 2. In each case, the chromatin backbone is represented as a straight line, while loops are represented as semicircles connecting loop anchors, following Ref. [61]. Because LEF anchors block each other, loops can nest but they cannot cross. B, C, and D Distributions of loop size, backbone segment length, and chromatin compaction ratio (as measured by the fraction of the chromatin contour length within the backbone), respectively, for the five genomic regions of interphase S. pombe, listed in Table 2: yellow, Chr 1: 0.5–1.7 Mb; purple, Chr 1: 4.2–5.4 Mb; red, Chr 2: 0.2–1.4 Mb; green, Chr 2: 1.8–3.0 Mb; cyan, Chr 3: 1.2–2.4 Mb
Diffusion capture model does not reproduce experimental interphase S. pombe Hi-C maps
Because Hi-C and ChIP-seq both characterize chromatin configuration at a single instant of time, and do not provide any direct time-scale information, an alternative mechanism that somehow generates the same instantaneous loop distributions and loop correlations as loop extrusion would lead to the same Hi-C map as does loop extrusion. This observation has motivated consideration of diffusion capture models in which loops occur and persist as a result of loop capture via binding events [11, 14, 15, 17, 20, 22,23,24, 62, 63]. Such a scenario is well recognized for a number of isolated enhancer-gene loci pairs in, for example, Drosophila melanogaster [64]. A major obstacle to the application of such models across the genome is that there is no physical basis for diffusion capture models to give rise to the approximately exponential loop size distributions, which emerge naturally from the loop extrusion model and well describe Hi-C maps (Fig. 4). Instead, the loops in diffusion capture models can be expected to realize an equilibrium, power-law distribution of loop sizes (Additional File 1: Fig. S15D), corresponding to the return probability of a random walk. To investigate quantitatively how well a physically sensible diffusion capture model can describe Hi-C maps, while remaining consistent with cohesin ChIP-seq, we considered a diffusion capture model in which the probability that a loop connects sites i and j is proportional to
where \(P_i\) is the probability that site i is occupied by cohesin, which we determine from cohesin ChIP-seq data without distinguishing between cohesive and loop-extruding cohesins, and the factor \(|i-j|^{-\frac{3}{2}}\) corresponds to the probability that a unconstrained random walk returns to its starting point after \(i-j\) steps (self-avoidance would change the exponent, but not the power-law behavior).
By using the experimental interphase S. pombe cohesin (Psc3) ChIP-seq data to determine \(P_i\), we have carried out Monte Carlo simulations of this model with only two possible parameters: LEF density and chromatin persistence length, both derived from the CCLE-best-fit values for S. pombe. These simulations yield an ensemble of loop configurations, in turn allowing us to calculate the corresponding Hi-C maps in the same way that we calculate Hi-C maps from the ensemble of loop configurations generated by CCLE simulations. Figure S15A (Additional File 1) compares a portion of the Hi-C map, simulated on the basis of the diffusion capture model, to the corresponding experimental Hi-C. Also illustrated in Figure S15B (Additional File 1) is the ratio map between the diffusion-capture-simulated Hi-C and experiment, showing more darker pixels than the ratio map between the CCLE-simulated Hi-C and the experiment (Fig. 1E). Evidently, the diffusion capture model gives rise to a Hi-C map that provides a much poorer description of the experimental Hi-C map (with an MPR of 1.1754 and a PCC of 0.2675), than does CCLE, largely failing to reproduce the inhomogeneous pattern of squares corresponding to TADs and to match the measured P(s) (Additional File 1: Fig. S15C). This comparison suggests that loop extrusion-based models for chromatin organization should be much preferred over diffusion capture models.
CCLE describes TADs and loop configurations in meiotic S. cerevisiae
CCLE simulations accurately reproduce experimental Hi-C maps of meiotic S. cerevisiae
To further examine the ability of CCLE to describe TAD-scale chromatin organization, we next sought to describe the Hi-C map of meiotic S. cerevisiae from Ref. [65], using the ChIP-seq data of the meiotic cohesin core subunit, Rec8, from Ref. [66]. In contrast to the semi-dilute polymer solution envisioned to describe chromatin in interphase, in meiosis, the chromatin polymer is significantly compacted and is conspicuously organized about the chromosomal axis. Therefore, meiotic chromatin represents a very different polymer state than interphase chromatin, in which to test CCLE. To empirically account for increased polymer volume exclusion as a result of this more compacted polymer state, we scale the P(s) of the simulated Hi-C by a Gaussian scaling factor with standard deviation, \(\sigma\), given in Table 2.
The right-hand side of Fig. 5A depicts a 500-kb portion of the experimental Hi-C map of meiotic S. cerevisiae’s Chr 13 from 290 kb to 790 kb, using a logarithmic false-color intensity scale. In comparison, the left-hand side of Fig. 5A presents the corresponding best-fit simulated Hi-C map. Both maps are shown at 2-kb resolution (the original simulated Hi-C map was at 500-bp resolution and binned to 2 kb to match the experimental resolution). Figure 5B magnifies the experiment-simulation comparison for three representative 80-kb sub-regions. Both Fig. 5A and B reveal similar patterns of high-probability contacts for the experimental and simulated Hi-C maps, manifested in a high-degree of left-right symmetry. However, in contrast to the patch-like TAD patterns featured in the experimental and simulated Hi-C maps of interphase S. pombe, which consist of overlapping squares with more-or-less evenly distributed enhanced contact probability, the TADs of meiotic S. cerevisiae are dominated by discrete lines of high contact probability and their intersection points. Nevertheless, Fig. 5A and B make it clear that in spite of their strikingly different appearances to the TAD patterns of interphase S. pombe, the grid-like patterns of TADs in meiotic S. cerevisiae are also well reproduced by CCLE model. In addition, the three quantities, given by Eqs. 6, 7, and 8, are distributed around zero with relatively small fluctuations (Additional File 1: Fig. S13), indicating that CCLE model is self-consistent in this case also.
Conserved-current loop extrusion (CCLE) model recapitulates TAD-scale chromatin organization in meiotic S. cerevisiae using cohesin ChIP-seq data. A Comparison between the simulated Hi-C map of 290–790 kb region of Chr 13, generated by the CCLE model (using meiotic Rec8 ChIP-seq data from Ref. [66]), and the experimental Hi-C map [65] of the same region. Both Hi-C maps show interactions up to a genomic separation of 64 kb. B Magnified experiment-simulation Hi-C comparisons for three representative sub-regions of 80 kb in size: 412–492 kb, 546–626 kb, and 672–752 kb, from top to bottom. C Contact probability ratio map between the experimental and simulated Hi-C maps in panel A. D Normalized experimental meiotic cohesin (Rec8) occupancy probability (blue) and simulated LEF occupancy probability landscape (red). The occupancy probability curves are plotted for 440–640 kb of Chr 13 and are normalized by the corresponding optimized LEF density of 0.058 kb\(^{-1}\). E Cross-correlation between the experimental meiotic cohesin (Rec8) occupancy landscape and the simulated LEF occupancy probability landscape, as a function of relative genomic shift. F Chromatin mean contact probability, P(s), plotted as a function of genomic separation, s, for the experimental (blue circles) and simulated (red line) Hi-C, with the latter scaled by the Gaussian scaling factor as described in the “Methods” section. The vertical gray dashed line indicates the maximum genomic separation displayed in the Hi-C comparison map in panels A and B. G Snapshots of three representative simulated meiotic loop configurations in the 240–840-kb region of Chr 13. In each case, the chromatin backbone is represented as a straight line, while loops are represented as semicircles connecting loop anchors, following Ref. [61]. Because LEF anchors block each other, loops can nest but they cannot cross, although they frequently come into contact. Simulation results shown in this figure are generated by CCLE using the best-fit parameters given in Table 2. The optimization process is discussed in Additional File 1: Methods
The very different appearances of the Hi-C maps of interphase S. pombe and meiotic S. cerevisiae directly follow from the much greater contrast of meiotic S. cerevisiae’s cohesin ChIP-seq profile, which consists of a dense pattern of strong, narrow peaks, which extend above the background to reach 4 or 5 times background (Fig. 5D). Also different is the best-fit value of the LEF density, which is about twice as large in meiotic S. cerevisiae as in interphase S. pombe (Table 2). However, the best-fit value for the LEF processivity of meiotic S. cerevisiae is similar to that of interphase S. pombe, suggesting loop-extruding cohesins possess similar properties in both species. Values for the LEF density (0.058 kb\(^{-1}\)) and processivity (38.4 kb) may be compared to the best-fit values given in Ref. [54], of 0.03–0.04 kb\(^{-1}\) and 64–76 kb, respectively. The best-fit value of the cohesive cohesin density is noticeably smaller for meiotic S. cerevisiae than for interphase S. pombe. It is also significantly smaller than the best-fit density of loop-extruding cohesins, suggesting that the preponderance of cellular cohesins are involved in loop extrusion in meiotic S. cerevisiae. The best-fit value of persistence length of meiotic chromatin in S. cerevisiae is about twice that of interphase chromatin in S. pombe, potentially indicating stiffer chromatin due to a more compact chromatin state.
In spite of the agreement between simulation and experiment, evident in Fig. 5A and B, the experiment-simulation comparison for meiotic S. cerevisiae shows a higher MPR and lower PCC than for interphase S. pombe (Table 1), indicating a poorer fit. However, we ascribe this poorer fit, at least in part, to the larger experimental errors of the former dataset. These larger experimental errors are apparent when we calculate the MPR and the PCC for the two duplicate Hi-C datasets available in Ref. [65], which take values of 1.5714 and 0.8180, respectively. The MPR score reveals greater discrepancies between the two nominally identical meiotic S. cerevisiae datasets than between the experiment and the best-fit simulation. Furthermore, the PCC score for the comparison between two nominally identical datasets is not significantly higher than that for the experiment-simulation comparison.
CCLE self-consistently reproduces Rec8 ChIP-seq data in meiotic S. cerevisiae
For a typical 200-kb-sized region of Chr 13 of S. cerevisiae, Fig. 5D compares the simulated time- and population-averaged probability that a chromatin lattice site is occupied by a LEF (red curve) to the corresponding experimental ChIP-seq data for Rec8 [66] (blue curve), converted to occupancy probability, as described in the “Methods” section. Clearly, the simulated LEF occupancy probability matches the experimental Rec8 occupancy probability well, with a cross-correlation value of almost 0.8 (Fig. 5E).
Loop configurations in meiotic S. cerevisiae
Figure 5G shows three representative simulated loop configurations for the 240–840 kb region of Chr 13, corresponding to the best-fit parameters. In comparison to the loop configurations in interphase S. pombe, the loops in meiotic S. cerevisiae appear more regularly spaced, corresponding to the more regularly distributed peaks of meiotic S. cerevisiae’s cohesin ChIP-seq data. Figures S16A and B (Additional File 1) present the distributions of loop sizes and backbone segment lengths, respectively, for the same region in S. cerevisiae. The mean and standard deviation of these quantities are 19.82 and 16.06 kb (mean) and 15.97 and 14.67 kb (SD), respectively. Figure S16C (Additional File 1) shows the distributions of chromatin relative compaction, whose mean and standard deviation are 0.2129 and 0.0702, i.e., the chromatin polymer’s contour length in meiotic S. cerevisiae is effectively 5 times shorter with loops than without, twice as compact as the interphase S. pombe’s chromatin. Motivated by the experimental observation that the inactivation of “cohesin release factor,” Wpl1, gives rise to larger loops [41, 42, 67, 68],we performed CCLE simulations of wild-type and Wpl1-depleted meiotic S. cerevisiae using the corresponding meiotic ChIP-seq data of Smc3 [69]. Indeed, we observe enhanced loop sizes in Wpl1-depleted cells with a mean loop size that is two and a half times larger than in wild-type cells (Additional File 1: Fig. S17).
CCLE describes TADs and loop configurations in mitotic S. cerevisiae
Next, we applied CCLE to the Hi-C map of mitotic S. cerevisiae using the mitotic ChIP-seq data for the cohesin core protein, Mcd1, from Ref. [70]. The right hand side of Fig. 6A displays the Hi-C map of the 250–350-kb region of mitotic S. cerevisiae’s chromosome 10 at 500-bp resolution, reproducing the Hi-C data shown in Fig. 2A of Ref. [71], up to a genomic separation of 45 kb. All else being equal, as a result of the 500-bp-resolution, there are a reduced number of counts in each genomic pixel, compared to Hi-C maps displayed at lower resolutions. It follows that this experimental contact map is relatively noisy compared to both interphase S. pombe and meiotic S. cerevisiae. Nevertheless, standing above a field of relatively weak contacts, it is apparent that mitotic S. cerevisiae’s Hi-C map is primarily characterized by the presence of a number of prominent, isolated points of high-probability contacts―often called “puncta.” Unsurprisingly, mitotic S. cerevisiae’s Hi-C map is very different from that of interphase S. pombe (Fig. 1). However, it also appears distinct from the Hi-C map of meiotic S. cerevisiae (Fig. 5), in spite of the fact that the cohesin ChIP-seq of mitotic S. cerevisiae shows the same peak locations as the cohesin ChIP-seq of meiotic S. cerevisiae, as illustrated in Figure S18A (Additional File 1). Importantly, however, the cohesin ChIP-seq peaks are higher and narrower in the mitotic case than in the meiotic case, while the ChIP-seq signal between the peaks is suppressed in the mitotic case relative to the meiotic case. The lack of cohesin background signal in the mitotic case likely suggests that the mitotic cohesive cohesin density is very low.
Conserved-current loop extrusion (CCLE) model recapitulates TAD-scale chromatin organization in mitotic S. cerevisiae using cohesin ChIP-seq data. A Comparison between the Hi-C map of 250–350 kb of Chr 10, generated by the CCLE model (using mitotic Mcd1 ChIP-seq data from Ref. [70]), and the experimental Hi-C map [71] of the same region. Both Hi-C maps show interactions up to a genomic separation of 45 kb. B Contact probability ratio map between the experimental and simulated Hi-C in panel A. C Normalized experimental mitotic cohesin (Mcd1) occupancy probability (blue) and simulated LEF occupancy probability landscape (red). The occupancy probability curves are normalized by the corresponding optimized LEF density of 0.033 kb\(^{-1}\). D Cross-correlation between the experimental mitotic cohesin (Mcd1) occupancy landscape and the simulated LEF occupancy probability landscape, as a function of relative genomic shift. E Chromatin mean contact probability, P(s), plotted as a function of genomic separation, s, for the experimental (blue circles) and simulated (red line) Hi-C , scaled by the Gaussian correction factor as described in the “Methods” section. The vertical gray dashed line indicates the maximum genomic separation displayed in the Hi-C comparison map in panel A. F Snapshots of three representative simulated mitotic loop configurations in the 100–700 kb region of Chr 10. In each case, the chromatin backbone is represented as a straight line, while loops are represented as semicircles connecting loop anchors, following Ref. [61]. Because LEF anchors block each other, loops can nest but they cannot cross, although they frequently come into contact. Simulation results shown in this figure are generated by CCLE using the best-fit parameters given in Table 2. The optimization process is discussed in Additional File 1: Methods
In comparison with the experimental Hi-C map on the right-hand side Fig. 6A, the left-hand side of Fig. 6A shows the corresponding best-fit simulated Hi-C map generated by the CCLE model. The CCLE-simulated Hi-C map shows puncta that well match the experimental puncta, demonstrating that CCLE is also able to well-describe chromatin looping in mitotic S. cerevisiae. In spite of the agreement between simulation and experiment, the experiment-simulation comparison for mitotic S. cerevisiae reveals a higher MPR and lower PCC compared to both interphase S. pombe and meiotic S. cerevisiae, which may indicate a poorer fit or noisier experimental data (Table 1). Given the noisier nature of the Hi-C data at this higher resolution, we ascribe the lower MPR and PCC scores to the latter case. Figure 6B shows the contact probability ratio map between the experimental and simulated Hi-C maps from which the noise present in these data is further apparent. It is important to note that the poorer MPR and PCC scores could also arise from the kilobase-scale features that originate from nucleosomes in high-resolution Hi-C maps, as CCLE does not describe nucleosome-level organization.
Interestingly, except the Gaussian scaling factors, the best-fit values of all other model parameters for mitotic S. cerevisiae (Table 2) are quite different from those for meiotic S. cerevisiae. Specifically, the LEF density is about half of the value observed in meiotic S. cerevisiae, the LEF processivity for mitotic S. cerevisiae is only about 20% of the value observed in meiotic S. cerevisiae, and the persistence length is only about 40%. It is important to realize that, since the persistence length and the Gaussian scaling factor primarily determine the long-range genomic contact, altering one of these parameters is likely to affect the best-fit value of the other. To reflect the lack of background signal in the experimental mitotic cohesin ChIP-seq data, we set the value of cohesive cohesin density to zero (Table 2). The circumstances of fewer and less extended cohesin loops in the mitotic case, compared to the meiotic case, are reflected in the sparser and smaller loops in Fig. 6F, compared to Fig. 5G. Thus, according to the CCLE model, the different appearances of the Hi-C maps of meiotic and mitotic S. cerevisiae originate from the differences in their ChIP-seq profiles, as well as variations in cohesin density and processivity between the two cases.
Figure 6C displays the normalized experimental mitotic cohesin occupancy probability derived from Mcd1 ChIP-seq [70] (blue line), versus LEF occupancy probability from the CCLE simulation (red line). Both landscapes appear very similar, with strong peaks at the same genomic locations (although the experimental peaks are higher), and they exhibit a very strong cross-correlation (Fig. 6D). As shown in Fig. 6E, the experimental chromatin mean contact probability, P(s), as a function of genomic separation, s, is also well reproduced by the CCLE model.
However, in comparison to the previous two examples, the three quantities, given by Eqs. 6, 7, and 8, have broader distributions and larger standard deviations around the mean values, which are nevertheless close to zero (Additional File 1: Fig. S14). Interestingly, lattices with a greater imbalance of left- and right-moving LEFs are found on both sides of cohesin peaks. Indeed, Figure S14C (Additional File 1) presents the cross-correlation between the cohesin ChIP-seq and the imbalance landscape, where the trough-peak pattern around zero suggests that there are more right-moving LEF anchors approaching the left sides of cohesin peaks and more left-moving LEF anchors approaching the right sides of cohesin peaks. Furthermore, Figures S14F and I (Additional File 1), which present the cross-correlations between cohesin ChIP-seq and current violation landscapes, show significant troughs at zero, suggesting that LEF unbinding events outnumber the binding events at cohesin peaks. We ascribe these greater violations of CCLE’s assumptions at the locations of cohesin peaks in part to the low processivity of mitotic cohesin in S. cerevisiae, compared to that of meiotic S. cerevisiae and interphase S. pombe, and in part to the low CCLE loop extrusion rate at the cohesin peaks. While CCLE assumptions are violated at these sites, the model (1) does allow for these approximations to be modeled, (2) provides internal metrics to check for violations, and (3) still recovers chromatin contact distribution correctly (based on Hi-C comparison). In the future, we plan to develop an improved version of CCLE that will self-consistently account for binding and unbinding, as well as imbalance of left- and right-moving LEFs, as follows: (1) from the previous best-fit simulation results, evaluate the empirical binding/unbinding rates and left/right-moving LEF imbalance for each lattice position, (2) update and solve the exact master equations, Eqs. 1 and 2, for the position-dependent loop extrusion rates using the empirical binding/unbinding rates and left/right-moving LEF proportions, (3) use the new loop extrusion rates to run CCLE simulations and obtain a new set of best-fit simulation results, (4) iterate until the simulation parameters converge.
Discussion
In vertebrates, CTCF defines the locations of most TAD boundaries. It is interesting to ask what might play that role in interphase S. pombe as well as in meiotic and mitotic S. cerevisiae. A number of papers have suggested that convergent gene pairs are correlated with cohesin ChIP-seq in both S. pombe [72, 73] and S. cerevisiae [68, 73,74,75,76,77]. Because CCLE ties TADs to cohesin ChIP-seq, a strong correlation between cohesin ChIP-seq and convergent gene pairs would be an important clue to the mechanism of TAD formation in yeasts. To investigate this correlation, we introduce a convergent-gene variable that has a non-zero value between convergent genes (determined using gene annotations from Saccharomyces Genome Database [78]) and an integrated weight of unity for each convergent gene pair. Figure S18A (Additional File 1) shows the convergent gene variable, so-defined, alongside the corresponding cohesin ChIP-seq for meiotic and mitotic S. cerevisiae. It is apparent from this figure that a peak in the ChIP-seq data is accompanied by a non-zero value of the convergent-gene variable in about 80% of cases, suggesting that chromatin looping in meiotic and mitotic S. cerevisiae may indeed be tied to convergent genes. Conversely, about 50% of convergent genes match peaks in cohesin ChIP-seq. The cross-correlation between the convergent-gene variable and the ChIP-seq of meiotic and mitotic S. cerevisiae is quantified in Figs. S18B and C (Additional File 1). By contrast, in interphase S. pombe, cross-correlation between convergent gene locations (determined using gene annotations from PomBase [79]) and cohesin ChIP-seq in each of five considered regions is unobservably small (Additional File 1: Fig. S19), suggesting that convergent genes per se do not have a role in defining TAD boundaries in interphase S. pombe.
Although “bottom-up” models which incorporate explicit boundary elements do not exist for non-vertebrate eukaryotes, one may wonder how well such LEF models, if properly modified and applied, would perform in describing Hi-C maps with diverse features. To this end, we examined the performance of the model described in Ref. [54] in describing the Hi-C map of interphase S. cerevisiae. Reference [54] uses cohesin ChIP-seq peaks in meiotic S. cerevisiae to define the positions of loop extrusion barriers which either completely stall an encountering LEF anchor with a certain probability or let it pass. We apply this “explicit barrier” model to interphase S. pombe, using its cohesin ChIP-seq peaks to define the positions of loop extrusion barriers, and using Ref. [54]’s best-fit value of 0.05 for the pass-through probability. Although the applicability of a pass-through probability of 0.05, derived from meiotic S. cerevisiae, to interphase S. pombe is uncertain, in fact, simulations reveal that varying this quantity across the range from 0.005 to 0.5 causes only modest changes in the corresponding simulated Hi-C maps, as illustrated in Figs. S20E and F (Additional File 1). Figure S20A (Additional File 1) presents the simulated Hi-C map of the 0.3–1.3 kb region of Chr 2 of interphase S. pombe in comparison with the corresponding Hi-C data. It is evident that the explicit barrier model provides a poorer description of the Hi-C data of interphase S. pombe compared to the CCLE model, as indicated by the MPR and PCC scores of 1.6489 and 0.2267, respectively. While the explicit barrier model appears capable of accurately reproducing Hi-C data with punctate patterns, typically accompanied by strong peaks in the corresponding cohesin ChIP-seq, it seems less effective in cases such as in interphase S. pombe, where the Hi-C data lacks punctate patterns and sharp TAD boundaries, and the corresponding cohesin ChIP-seq shows low-contrast peaks. The success of the CCLE model in describing the Hi-C data of both S. pombe and S. cerevisiae, which exhibit very different features, suggests that the current paradigm of localized, well-defined boundary elements may not be the only approach to understanding loop extrusion. By contrast, CCLE allows for a concept of continuous distribution of position-dependent loop extrusion rates, arising from the aggregate effect of multiple interactions between loop extrusion complexes and chromatin. This paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers.
In our current CCLE implementation, cohesin binds to chromatin at random locations. However, the fission yeast protein, Mis4, has been previously identified, as a component of the cohesin loading complex [80,81,82], prompting us to envision a modification of the model to incorporate position-dependent cohesin binding with a binding rate proportional to the Mis4 ChIP-seq signal. However, for S. pombe, the effect of such a modification on the resultant loop configurations must necessarily be small, because cohesin distribution essentially defines the steady-state distribution of loop anchors but there is no correlation between the Mis4 [83] and Psc3 [55] ChIP-seq signals (Additional File 1: Fig. S21A). Therefore, the overall distribution of cohesin along the genome (given by the Psc3 ChIP-seq) is independent of where the cohesin was putatively loaded (specified by the Mis4 ChIP-seq). This observation suggests that, in S. pombe at least, the collective spreading of cohesins following association to chromatin is sufficiently large, so as to obscure their initial positions.
As noted above, the input for our CCLE simulations of chromatin organization in S. pombe was the ChIP-seq of Psc3, which is a component of the cohesin core complex [84]. Accordingly, Psc3 ChIP-seq represents how the cohesin core complex is distributed along the genome. In S. pombe, the other components of the cohesin core complex are Psm1, Psm3, and Rad21. Because these proteins are components of the cohesin core complex, we expect that the ChIP-seq of any of these proteins would closely match the ChIP-seq of Psc3 and would equally well serve as input for CCLE simulations of S. pombe genome organization. Figure S21C (Additional File 1) confirms significant correlations between the ChIP-seq of Psc3 [55] and Rad21 [60]. In light of this observation, we then reason that the CCLE approach offers the opportunity to investigate whether other proteins beyond the cohesin core are constitutive components of the loop extrusion complex during the extrusion process (as opposed to only loading or unloading). To elaborate, if the ChIP-seq of a non-cohesin-core protein is highly correlated with the ChIP-seq of a cohesin core protein, we can infer that the protein in question is associated with the cohesin core and therefore is a likely participant in loop-extruding cohesin, alongside the cohesin core. Conversely, if the ChIP-seq of a putative component of the loop-extruding cohesin complex is uncorrelated with the ChIP-seq of a cohesin core protein, then we can infer that the protein in question is unlikely to be a component of loop-extruding cohesin or at most is transiently associated with it.
For example, in S. pombe, the ChIP-seq of the cohesin regulatory protein, Pds5 [83], is correlated with the ChIP-seq of Psc3 [55] (Additional File 1: Fig. S21B) and with that of Rad21 [60] (Additional File 1: Fig. S21D), suggesting that Pds5 can be involved in loop-extruding cohesin in S. pombe, alongside the cohesin core proteins. Similar correlation between the ChIP-seq of Mcd1 [70] and Pds5 [85] is also found in S. cerevisiae (Additional File 1: Fig. S22B). Interestingly, this inference concerning the fission yeast and budding yeast cohesin subunit, Pds5, stands in contrast to the conclusion from a recent single-molecule study [43] concerning cohesin in vertebrates. Specifically, Reference [43] found that cohesin complexes containing Pds5, instead of Nipbl, are unable to extrude loops. Other studies also found that Pds5 restricts DNA loop extrusion by cohesin in both budding yeast [42, 68] and vertebrates [41]. In light of these findings, an attractive explanation for the observed correlations between the ChIP-seq of Rad21/Mcd1 and Pds5 in S. pombe and S. cerevisiae is that Pds5 could act like a boundary element that stalls cohesin complexes upon contact, thus achieving colocalization with cohesin.
Additionally, as noted above, in interphase S. pombe, the ChIP-seq signal of the cohesin loader, Mis4, is uncorrelated with the Psc3 ChIP-seq signal in non-centromeric regions (Additional File 1: Fig. S21A), suggesting that Mis4 is, at most, a very transient component of cohesin in S. pombe. Such a correlation between the ChIP-seq of Scc2 [86] (counterpart of S. pombe’s Mis4) and the cohesin core (Mcd1 [70]) is also lacking in mitotic S. cerevisiae (Additional File 1: Fig. S23). However, Reference [87] found that, in addition to its role as a cohesin loader, Scc2 drives expansion of DNA loops in vivo in mitotic S. cerevisiae. Assuming that ChIP-seq data correctly reflects cohesin distribution, the absence of correlation between the ChIP-seq of Scc2 and cohesin core in mitotic S. cerevisiae suggests that the activity of Scc2 in driving DNA loop expansion involves a mechanism other than persistent colocalization or co-translocation with the cohesin core. However, this observation does not rule out the possibility that Scc2 directly interacts with the cohesin core. Indeed, as a commonly identified cohesin loading complex, Scc2 could transiently colocalize with the cohesin core as loading takes place. In contrast to the lack of correlation between the ChIP-seq of Mis4/Scc2 and cohesin in interphase S. pombe and mitotic S. cerevisiae, there are significant correlations between the ChIP-seq of Nipbl [88] (counterpart of Mis4/Scc2) and the cohesin core protein, Smc1 [88], in humans (Additional File 1: Fig. S21G). Unsurprisingly, both References [43] and [44] found that Nipbl is an obligate component of the loop-extruding human cohesin complex, in addition to its role as a cohesin loader. Although CCLE has not yet been applied to vertebrates, from a CCLE perspective, the possibility that Nipbl may be required for the loop extrusion process in humans is bolstered by the significant correlations between the ChIP-seq of human Nipbl and the cohesin core (Additional File 1: Fig. S21G), consistent with Ref. [32]’s hypothesis that Nipbl is involved in loop-extruding cohesin in vertebrates. A recent theoretical model for the molecular mechanism of loop extrusion by cohesin hypothesizes that transient binding by Mis4/Scc2/Nipbl is essential for permitting directional reversals and therefore for two-sided loop extrusion [46]. Surprisingly, there are significant correlations between Mis4 and Pds5 in S. pombe (Additional File 1: Fig. S21E), indicating Pds5-Mis4 association, outside of the cohesin core complex; however, similar correlations are lacking between Scc2 and Pds5 in S. cerevisiae (Additional File 1: Fig. S22).
Beyond yeast, because the CCLE model is agnostic about the identity of any particular boundary element, in principle it extends the LEF model to organisms across the tree of life, including organisms that do not express CTCF. By contrast, prior LEF models have been overwhelmingly limited to vertebrates, which express CTCF and where CTCF is the principal boundary element. Two exceptions, in which the LEF model was applied to non-vertebrates, are Ref. [54], discussed above, and Ref. [89], which models the Hi-C map of the prokaryote, Bacillus subtilis, on the basis of condensin loop extrusion with gene-dependent barriers. In future work, it will be interesting to explore the applicability of CCLE model to other model organisms, that exhibit TADs, including, for example, Drosophila melanogaster, Arabidopsis thaliana, Oryza sativa, Caenorhabditis elegans, and Caulobacter crescentus, as well as to vertebrates.
Conclusions
By examining an approximate steady-state solution of master equations describing the motions of chromatin loop extruding complexes, we have been led to a new version of the loop extrusion factor model―the conserved-current loop extrusion (CCLE) model―that does not require the input of genomic positions of boundary elements and uses cohesin ChIP-seq data as the sole input. To demonstrate its utility, we applied the CCLE model to accurately reproduce the TAD-scale (\(\sim 10-100\) kb) Hi-C maps for each of interphase S. pombe and mitotic S. cerevisiae, for the first time, and of meiotic S. cerevisiae, in effect recapitulating the results of Ref. [54] but using a model with an improved physical basis. Importantly, the fact that the CCLE model can convert cohesin ChIP-seq data into an accurate Hi-C map highlights that essential aspects of the three-dimensional chromatin configuration are encoded in the one-dimensional cohesin distribution and strongly suggests that the loop configurations generated, as well as the best-fit values of the density and processivity of loop-extruding cohesins, are realistic. Not limited to cohesin, other SMC complexes, such as condensin [59, 90,91,92] and Smc5/6 [93], could play similar roles in other organisms or different stages of the cell cycle. The CCLE model is agnostic to how SMC distributions are established, providing greater flexibility to account for different LEF-chromatin interactions, such as loop extrusion barriers blocking LEFs [27, 94, 95] and arrays of factors slowing down LEF translocation [94,95,96]. Overall, our results provide compelling evidence that chromatin organization at the TAD scale in interphase S. pombe, as well as in meiotic and mitotic S. cerevisiae, is primarily the result of loop extrusion and that the cohesin complex is the dominant loop extrusion factor, marking the base of all, or the overwhelming majority, of TAD-scale loops in these systems.
Methods
Gillespie-type loop extrusion simulations
For interphase S. pombe (meiotic and mitotic S. cerevisiae), we represent 1.2 Mb (600 kb) regions of the chromatin polymer as an array of 1200 discrete lattice sites, each comprising 1000 bp (500 bp), and simulate chromatin loop configurations using a Gillespie-type algorithm applied to the LEFs, populating these lattice sites, similarly to Refs. [27, 29]. Because the simulations are of Gillespie-type, the time separations between successive events are exponentially distributed, while which event is realized occurs with a probability that is proportional to the rate of that event. LEFs are modeled as objects with two anchors, which initially bind to empty, adjacent chromatin lattice sites with a binding probability, that is uniformly distributed across the genome. Upon binding, a LEF’s anchors translocate stochastically and independently in opposite directions at rates specified by position-dependent loop extrusion rates determined from experimental cohesin ChIP-seq data. Only outward steps, that grow loops, are permitted. Translocation of a LEF anchor into a lattice site that is already occupied by another LEF’s anchor, is forbidden. LEFs dissociate at a constant rate, dissipating the corresponding loop. In our simulations, after unbinding, a LEF immediately rebinds to an empty pair of neighboring sites, maintaining a constant number of bound LEFs. Neither simulated contact probabilities nor simulated mean occupancy probabilities depend on the time scale of loop extrusion. Therefore, the results of our simulations for chromatin-chromatin contacts and LEF occupancies depend on the mean LEF processivity, which is the ratio of the mean LEF extrusion rate and the dissociation rate, and which therefore is the appropriate fitting parameter. In other words, the LEF dissociation rate (inverse of lifetime) can be arbitrary so long as the processivity remains unchanged by adjusting the extrusion rate accordingly. In practice, however, we set the LEF dissociation rate to \(5\times 10^{-4}\) time-unit\(^{-1}\) (equivalent to a lifetime of 2000 time-units), and the nominal LEF extrusion rate (aka \(\rho L/\tau\), see Additional File 1: Methods) can be determined from the given processivity.
ChIP-seq to occupancy probability
The S. pombe (S. cerevisiae) experimental cohesin ChIP-seq data is averaged over each 1000-bp-sized (500-bp-sized) lattice site to yield the ChIP-seq signal, \(C_n\), for each chromatin lattice site, n. To convert Psc3 ChIP-seq data at lattice site n, \(C_n\), to the occupancy probability, \(P_n\), we write
where \(\rho _{c}\) is the cohesive cohesin density (number per chromatin lattice site), N is the total number of lattice sites, and \(N_{LEF}\) is the number of LEFs simulated in the region represented by N lattice sites. The factor of 2 is because one LEF has two anchors and occupies two lattice sites. Although we call \(\rho _c\) “cohesive cohesin density,” it is important to note that any experimental backgrounds or shifts in the published ChIP-seq data are also subsumed into this parameter. The excellent experiment-simulation agreement justifies the assumption of a uniform distribution of cohesive cohesins a posteriori.
Modeling the chromatin polymer inside the nucleus
To model chromatin self-contacts, we treat chromatin inside the cell nucleus as a Gaussian polymer in spherical confinement. For a Gaussian polymer in the continuous limit, the probability density, \(p(r,\theta ,\phi ,n)\), that a genomic locus n along the polymer is positioned at coordinates (\(r,\theta ,\phi\)) follows the diffusion equation, \({\partial p}/{\partial n} = (l^2/6) \nabla ^2 p\), where l is the Kuhn length, which is twice the persistence length. Spherical confinement within a nuclear radius a can be enforced by finding solutions to the diffusion equation either with reflecting or absorbing boundary conditions at \(r=a\). As discussed in Refs. [97,98,99], reflecting boundary conditions correspond to an attractive surface-polymer interaction given by \(\epsilon = -k_BT\log {(6/5)}\), while absorbing boundary conditions correspond to zero polymer-surface interaction. Comparison of chromatin mean contact probability, P(s), using reflecting and absorbing boundary conditions is presented in Fig. S10 (Additional File 1). To crudely account for “Rabl configurations,” in which chromatin is known to be attached to the nuclear envelope [100, 101], we chose to use the solution for reflecting boundary conditions to calculate the self-contact probability between any two points along the chromatin polymer. Calculation of the corresponding self-contact probability for two locations with genomic separation, n, is presented in detail in Additional File 1: Methods. A similar method is also adopted in Ref. [102]. To then incorporate loops, we replace the actual genomic separation (n) between the two loci of interest by their effective genomic separation (\(n_{eff}\)), which is also discussed in detail in Additional File 1: Methods [103]. In brief, \(n_{eff}\) is the backbone length between the two locations of interest modified by the contribution of any loops that contain the two locations. Our approximate, analytic approach to polymer self-contacts may be justified a posteriori on the basis of the excellent data-simulation agreement apparent in Figs. 1 and 2.
Because of the increased compaction of meiotic and mitotic chromosomes, compared to interphase chromosomes, we can expect volume exclusion to play a more prominent role in meiotic and mitotic nuclei than in interphase nuclei. In order to account for volume exclusion in meiotic and mitotic S. cerevisiae, we introduce an additional empirical factor, \(e^{-s^2/{2\sigma ^2}}\), which reduces the probability of contacts with large genomic separations. This factor is characterized by an additional fitting parameter, namely \(\sigma\), which represents a genomic distance scale, beyond which chromatin-chromatin contacts are reduced because of volume exclusion. Inclusion of this factor leads to improved agreement between experimental and simulated P(s) curves at longer length scales (\(\gtrsim\) 50 kb) (Figs. 5F and 6E) but has little effect on simulated Hi-C maps at shorter length scales (\(\lesssim\) 50 kb).
Simulated Hi-C contact map generation
To generate simulated Hi-C contact maps, for each genomic region and each set of parameter values, we performed 200 independent loop extrusion simulations, each comprised of 100000 LEF events. In our simulations, loops achieve a dynamic steady state well within 15000 LEF events, as gauged by convergence of the radius of gyration of chromatin to its steady-state value [104] (Additional File 1: Fig. S24A). We also examined two loop size distributions from two different simulation periods: each distribution consists of 1000 data points, equally separated in time, one between LEF event 15000 and 35000, and the other between LEF event 80000 and 100000. The two distributions are within-errors identical (Additional File 1: Fig. S24B), suggesting that the loop extrusion steady state is well achieved within 15000 LEF events. To ensure that only steady-state loop configurations are included in our simulated Hi-C maps, we discard the first 15000 LEF events. First, to generate a simulated Hi-C contact map for a given loop configuration, we calculate the contact probability for each pair of lattice sites in the given genomic region, using their effective genomic distance, \(n_{eff}\), and the functional form of self-contact probability for a polymer inside a sphere (Additional File 1: Methods, Eq. S36). Then, for each of the 200 independent loop extrusion simulation, we create a time-averaged Hi-C map by averaging together 100 simulated Hi-C contact maps, corresponding to 100 simulated loop configurations that are evenly distributed in time-unit across the simulation time period after the initial 15000 events of non-steady-state period. Finally, we average together the time-averaged Hi-C maps from all 200 simulations to produce an ensemble-averaged Hi-C map. Because the CCLE simulations are performed using a finer resolution than the resolution of experimental Hi-C maps, our ensemble-averaged Hi-C contact maps are binned (block-averaged) to match the resolution of the corresponding experimental Hi-C. The simulated Hi-C maps of interphase S. pombe and meiotic S. cerevisiae are scaled so that the mean contact probability of each simulated Hi-C map along its second diagonal is equal to the mean contact probability along the second diagonal of the corresponding experimental Hi-C map, while the simulated Hi-C map of mitotic S. cerevisiae is scaled so that the mean contact probabilities of the simulated and the corresponding experimental Hi-C maps along the third diagonal are equal. In other words, we require \(P_{\text {sim}}(20~kb) = P_{\text {exp}}(20~kb)\) for interphase S. pombe, \(P_{\text {sim}}(4~kb) = P_{\text {exp}}(4~kb)\) for meiotic S. cerevisiae, and \(P_{\text {sim}}(1.5~kb) = P_{\text {exp}}(1.5~kb)\) for mitotic S. cerevisiae. The simulated Hi-C maps so-obtained are compared to experimental maps in Figs. 1, 5, 6 and Additional File 1: Figs. S1–4.
To assess the noise within our simulated Hi-C maps, we also calculated the MPR and the PCC between the averages of two sets of 200 independent, time-averaged Hi-C simulations, for each case of S. pombe, meiotic S. cerevisiae and mitotic S. cerevisiae, giving MPR values of 1.0149, 1.0179, and 1.0121, and PCC values of 0.9834, 0.9924, and 0.9988, respectively. Evidently, these values are close to unity, indicating that our final simulated maps in all presented cases accurately represent the CCLE model with little noise.
Data availability
Codes used for CCLE simulation and data analysis of ChIP-seq and Hi-C maps are freely and publicly available at https://github.com/bigpaul97/CCLE [108] and https://doiorg.publicaciones.saludcastillayleon.es/10.5281/zenodo.13901225 [109] with the MIT License. All studied experimental data are included and cited in the article. Additional File 2: Table S2 contains detailed information of all experimental datasets used.
References
Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, Sandstrom R, Bernstein B, Bender MA, Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny LA, Lander ES, Dekker J. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009;326(5950):289–93.
van Berkum NL, Lieberman-Aiden E, Williams L, Imakaev M, Gnirke A, Mirny LA, Dekker J, Lander ES. Hi-C: a method to study the three-dimensional architecture of genomes. J Visualized Exp JoVE. 2010;(39):e1869.
Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485(7398):376–80.
Dixon JR, Gorkin DU, Ren B. Chromatin domains: the unit of chromosome organization. Mol Cell. 2016;62(5):668–80.
Sexton T, Yaffe E, Kenigsberg E, Bantignies F, Leblanc B, Hoichman M, Parrinello H, Tanay A, Cavalli G. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell. 2012;148(3):458–72.
Mizuguchi T, Fudenberg G, Mehta S, Belton J-M, Taneja N, Folco HD, FitzGerald P, Dekker J, Mirny L, Barrowman J, Grewal SIS. Cohesin-dependent globules and heterochromatin shape 3D genome architecture in S. pombe. Nature. 2014;516(7531):432–5.
Dekker J. Two ways to fold the genome during the cell cycle: insights obtained with chromosome conformation capture. Epigenetics Chromatin. 2014;7(1):1–12.
Dekker J, Heard E. Structural and functional diversity of topologically associating domains. FEBS Lett. 2015;589(20):2877–84.
Pollard TD, Earnshaw WC, Lippincott-Schwartz J, Johnson G. Cell biology E-book. Philadelphia: Elsevier Health Sciences; 2016.
Jerkovic I, Cavalli G. Understanding 3D genome organization by multidisciplinary methods. Nat Rev Mol Cell Biol. 2021;22(8):511–28.
Barbieri M, Chotalia M, Fraser J, Lavitas L-M, Dostie J, Pombo A, Nicodemi M. Complexity of chromatin folding is captured by the strings and binders switch model. Proc Natl Acad Sci. 2012;109(40):16173–8.
Giorgetti L, Galupa R, Nora EP, Piolot T, Lam F, Dekker J, Tiana G, Heard E. Predictive polymer modeling reveals coupled fluctuations in chromosome conformation and transcription. Cell. 2014;157(4):950–63.
Barbieri M, Xie SQ, Torlai Triglia E, Chiariello AM, Bianco S, de Santiago I, Branco MR, Rueda D, Nicodemi M, Pombo A. Active and poised promoter states drive folding of the extended HoxB locus in mouse embryonic stem cells. Nat Struct Mol Biol. 2017;24(6):515–24.
Brackley CA, Brown JM, Waithe D, Babbs C, Davies J, Hughes JR, Buckle VJ, Marenduzzo D. Predicting the three-dimensional folding of cis-regulatory regions in mammalian genomes using bioinformatic data and polymer models. Genome Biol. 2016;17(1):59.
Brackley CA, Johnson J, Kelly S, Cook PR, Marenduzzo D. Simulated binding of transcription factors to active and inactive regions folds human chromosomes into loops, rosettes and topological domains. Nucleic Acids Res. 2016;44(8):3503–12.
Di Pierro M, Zhang B, Aiden EL, Wolynes PG, Onuchic JN. Transferable model for chromosome architecture. Proc Natl Acad Sci. 2016;113(43):12168–73.
Chiariello AM, Annunziatella C, Bianco S, Esposito A, Nicodemi M. Polymer physics of chromosome large-scale 3D organisation. Sci Rep. 2016;6(1):29775.
Di Pierro M, Cheng RR, Lieberman Aiden E, Wolynes PG, Onuchic JN. De novo prediction of human chromosome structures: epigenetic marking patterns encode genome architecture. Proc Natl Acad Sci. 2017;114(46):12126–31.
Brackley C, Johnson J, Michieletto D, Morozov A, Nicodemi M, Cook P, Marenduzzo D. Extrusion without a motor: a new take on the loop extrusion model of genome organization. Nucleus. 2018;9(1):95–103.
Bianco S, Lupiáñez DG, Chiariello AM, Annunziatella C, Kraft K, Schöpflin R, Wittler L, Andrey G, Vingron M, Pombo A, Mundlos S, Nicodemi M. Polymer physics predicts the effects of structural variants on chromatin architecture. Nat Genet. 2018;50(5):662–7.
Buckle A, Brackley CA, Boyle S, Marenduzzo D, Gilbert N. Polymer simulations of heteromorphic chromatin predict the 3D folding of complex genomic loci. Mol Cell. 2018;72(4):786–97.
Conte M, Fiorillo L, Bianco S, Chiariello AM, Esposito A, Nicodemi M. Polymer physics indicates chromatin folding variability across single-cells results from state degeneracy in phase separation. Nat Commun. 2020;11(1):3289.
Gerguri T, Fu X, Kakui Y, Khatri BS, Barrington C, Bates PA, Uhlmann F. Comparison of loop extrusion and diffusion capture as mitotic chromosome formation pathways in fission yeast. Nucleic Acids Res. 2021;49(3):1294–312.
Conte M, Irani E, Chiariello AM, Abraham A, Bianco S, Esposito A, Nicodemi M. Loop-extrusion and polymer phase-separation can co-exist at the single-molecule level to shape chromatin folding. Nat Commun. 2022;13(1):4070.
Alipour E, Marko JF. Self-organization of domain structures by DNA-loop-extruding enzymes. Nucleic Acids Res. 2012;40(22):11202–12.
Sanborn AL, Rao SSP, Huang S-C, Durand NC, Huntley MH, Jewett AI, Bochkov ID, Chinnappan D, Cutkosky A, Li J, Geeting KP, Gnirke A, Melnikov A, McKenna D, Stamenova EK, Lander ES, Aiden EL. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc Natl Acad Sci. 2015;112(47):E6456–65.
Fudenberg G, Imakaev M, Lu C, Goloborodko A, Abdennur N, Mirny LA. Formation of chromosomal domains by loop extrusion. Cell Rep. 2016;15(9):2038–49.
Goloborodko A, Imakaev MV, Marko JF, Mirny L. Compaction and segregation of sister chromatids via active loop extrusion. Elife. 2016;5:e14864.
Goloborodko A, Marko JF, Mirny LA. Chromosome compaction by active loop extrusion. Biophys J. 2016;110(10):2162–8.
Nuebler J, Fudenberg G, Imakaev M, Abdennur N, Mirny LA. Chromatin organization by an interplay of loop extrusion and compartmental segregation. Proc Natl Acad Sci. 2018;115(29):E6697–706.
Banigan EJ, Mirny LA. Loop extrusion: theory meets single-molecule experiments. Curr Opin Cell Biol. 2020;64:124–38.
Davidson IF, Peters J-M. Genome folding through loop extrusion by SMC complexes. Nat Rev Mol Cell Biol. 2021;22(7):445–64.
Oudelaar AM, Higgs DR. The relationship between genome structure and function. Nat Rev Genet. 2021;22(3):154–68.
Fudenberg G, Abdennur N, Imakaev M, Goloborodko A, Mirny LA. Emerging evidence of chromosome folding by loop extrusion. In: Cold Spring Harbor symposia on quantitative biology, vol. 82. Woodbury: Cold Spring Harbor Laboratory Press; 2017. p. 45–55.
Hsieh T-HS, Weiner A, Lajoie B, Dekker J, Friedman N, Rando OJ. Mapping nucleosome resolution chromosome folding in yeast by micro-C. Cell. 2015;162(1):108–19.
Hsieh T-HS, Fudenberg G, Goloborodko A, Rando OJ. Micro-C XL: assaying chromosome conformation from the nucleosome to the entire genome. Nat Methods. 2016;13(12):1009–11.
Ohno M, Ando T, Priest DG, Kumar V, Yoshida Y, Taniguchi Y. Sub-nucleosomal genome structure reveals distinct nucleosome folding motifs. Cell. 2019;176(3):520–34.
Wiese O, Marenduzzo D, Brackley CA. Nucleosome positions alone can be used to predict domains in yeast chromosomes. Proc Natl Acad Sci. 2019;116(35):17307–15.
Oberbeckmann E, Quililan K, Cramer P, Oudelaar AM. In vitro reconstitution of chromatin domains shows a role for nucleosome positioning in 3D genome organization. Nat Genet. 2024;56(3):483–92.
Rao SS, Huang S-C, Glenn St Hilaire B, Engreitz JM, Perez EM, Kieffer-Kwon K-R, Sanborn AL, Johnstone SE, Bascom GD, Bochkov ID, Huang X, Shamim MS, Shin J, Turner D, Ye Z, Omer AD, Robinson JT, Schlick T, Bernstein BE, Casellas R, Lander ES, Aiden EL. Cohesin loss eliminates all loop domains. Cell. 2017;171(2):305–20.
Wutz G, Várnai C, Nagasaka K, Cisneros DA, Stocsits RR, Tang W, Schoenfelder S, Jessberger G, Muhar M, Hossain MJ, Walther N, Koch B, Kueblbeck M, Ellenberg J, Zuber J, Fraser P, Peters J-M. Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 2017;36(24):3573–99.
Costantino L, Hsieh T-HS, Lamothe R, Darzacq X, Koshland D. Cohesin residency determines chromatin loop patterns. Elife. 2020;9:e59889.
Davidson IF, Bauer B, Goetz D, Tang W, Wutz G, Peters J-M. DNA loop extrusion by human cohesin. Science. 2019;366(6471):1338–45.
Kim Y, Shi Z, Zhang H, Finkelstein IJ, Yu H. Human cohesin compacts DNA by loop extrusion. Science. 2019;366(6471):1345–9.
Golfier S, Quail T, Kimura H, Brugués J. Cohesin and condensin extrude DNA loops in a cell cycle-dependent manner. Elife. 2020;9:e53885.
Higashi TL, Pobegalov G, Tang M, Molodtsov MI, Uhlmann F. A Brownian ratchet model for DNA loop extrusion by the cohesin complex. Elife. 2021;10:e67530.
Nora EP, Goloborodko A, Valton A-L, Gibcus JH, Uebersohn A, Abdennur N, Dekker J, Mirny LA, Bruneau BG. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell. 2017;169(5):930–44.
Kentepozidou E, Aitken SJ, Feig C, Stefflova K, Ibarra-Soria X, Odom DT, Roller M, Flicek P. Clustered CTCF binding is an evolutionary mechanism to maintain topologically associating domains. Genome Biol. 2020;21:1–19.
Ong C-T, Corces VG. CTCF: an architectural protein bridging genome topology and function. Nat Rev Genet. 2014;15(4):234–46.
Wang C, Liu C, Roqueiro D, Grimm D, Schwab R, Becker C, Lanz C, Weigel D. Genome-wide analysis of local chromatin packing in Arabidopsis thaliana. Genome Res. 2015;25(2):246–56.
Dong Q, Li N, Li X, Yuan Z, Xie D, Wang X, Li J, Yu Y, Wang J, Ding B, Zhang Z, Li C, Bian Y, Zhang A, Wu Y, Liu B, Gong L. Genome-wide Hi-C analysis reveals extensive hierarchical chromatin interactions in rice. Plant J. 2018;94(6):1141–56.
Heger P, Marin B, Schierenberg E. Loss of the insulator protein CTCF during nematode evolution. BMC Mol Biol. 2009;10:1–14.
Kaushal A, Mohana G, Dorier J, Özdemir I, Omer A, Cousin P, Semenova A, Taschner M, Dergai O, Marzetta F, Iseli C, Eliaz Y, Weisz D, Shamim MS, Guex N, Lieberman Aiden E, Gambetta MC. CTCF loss has limited effects on global genome architecture in Drosophila despite critical regulatory functions. Nat Commun. 2021;12(1):1011.
Schalbetter SA, Fudenberg G, Baxter J, Pollard KS, Neale MJ. Principles of meiotic chromosome assembly revealed in S. cerevisiae. Nat Commun. 2019;10(1):4795.
Mizuguchi T, Fudenberg G, Mehta S, Belton J-M, Taneja N, Folco HD, FitzGerald P, Dekker J, Mirny L, Barrowman J, Grewal SIS. Cohesin-dependent globules and heterochromatin shape 3D genome architecture in S. pombe. Datasets. Gene Expr Omnibus. 2014. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56849. Accessed 7 Aug 2024.
Hsieh T-HS, Fudenberg G, Goloborodko A, Rando OJ. Micro-C XL: assaying chromosome conformation from the nucleosome to the entire genome. Datasets. Gene Expr Omnibus. 2016. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE85220. Accessed 7 Aug 2024.
Marguerat S, Schmidt A, Codlin S, Chen W, Aebersold R, Bähler J. Quantitative analysis of fission yeast transcriptomes and proteomes in proliferating and quiescent cells. Cell. 2012;151(3):671–83.
Arbona J-M, Herbert S, Fabre E, Zimmer C. Inferring the physical properties of yeast chromatin through Bayesian analysis of whole nucleus simulations. Genome Biol. 2017;18:1–15.
Ganji M, Shaltiel IA, Bisht S, Kim E, Kalichava A, Haering CH, Dekker C. Real-time imaging of DNA loop extrusion by condensin. Science. 2018;360(6384):102–5.
Nakazawa N, Sajiki K, Xu X, Villar-Briones A, Arakawa O, Yanagida M. RNA pol II transcript abundance controls condensin accumulation at mitotically up-regulated and heat-shock-inducible genes in fission yeast. Datasets. Gene Expr Omnibus. 2015. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE65956. Accessed 7 Aug 2024.
Bundschuh R, Hwa T. Statistical mechanics of secondary structures formed by random RNA sequences. Phys Rev E. 2002;65(3):031903.
Cheng TM, Heeger S, Chaleil RA, Matthews N, Stewart A, Wright J, Lim C, Bates PA, Uhlmann F. A simple biophysical model emulates budding yeast chromosome condensation. Elife. 2015;4:e05565.
Tang M, Pobegalov G, Tanizawa H, Chen ZA, Rappsilber J, Molodtsov M, Noma K-I, Uhlmann F. Establishment of dsDNA-dsDNA interactions by the condensin complex. Mol Cell. 2023;83(21):3787–800.
Levo M, Raimundo J, Bing XY, Sisco Z, Batut PJ, Ryabichko S, Gregor T, Levine MS. Transcriptional coupling of distant regulatory genes in living embryos. Nature. 2022;605:754–60.
Schalbetter SA, Fudenberg G, Baxter J, Pollard KS, Neale MJ. Principles of meiotic chromosome assembly revealed in S. cerevisiae. Datasets. Gene Expr Omnibus. 2019. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE127940. Accessed 7 Aug 2024.
Ito M, Kugou K, Fawcett JA, Mura S, Ikeda S, Innan H, Ohta K. Meiotic recombination cold spots in chromosomal cohesion sites. Datasets. Gene Expr Omnibus. 2013. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52863. Accessed 7 Aug 2024.
Haarhuis JH, van der Weide RH, Blomen VA, Yánẽz-Cuna JO, Amendola M, van Ruiten MS, Krijger PH, Teunissen H, Medema RH, van Steensel B, Brummelkamp TR, de Wit E, Rowland BD. The cohesin release factor WAPL restricts chromatin loop extension. Cell. 2017;169(4):693–707.
Dauban L, Montagne R, Thierry A, Lazar-Stefanita L, Bastié N, Gadal O, Cournac A, Koszul R, Beckouët F. Regulation of cohesin-mediated chromosome folding by Eco1 and other partners. Mol Cell. 2020;77(6):1279–93.
Barton RE, Massari LF, Robertson D, Marston AL. Eco1-dependent cohesin acetylation anchors chromatin loops and cohesion to define functional meiotic chromosome domains. Datasets. Gene Expr Omnibus. 2021. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE185021. Accessed 7 Aug 2024.
Costantino L, Hsieh T-HS, Lamothe R, Darzacq X, Koshland D. Cohesin residency determines chromatin loop patterns. Datasets. Gene Expr Omnibus. 2020. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE151416. Accessed 7 Aug 2024.
Costantino L, Hsieh T-HS, Lamothe R, Darzacq X, Koshland D. Cohesin residency determines chromatin loop patterns. Datasets. Gene Expr Omnibus. 2020. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE151553. Accessed 7 Aug 2024.
Gullerova M, Proudfoot NJ. Cohesin complex promotes transcriptional termination between convergent genes in S. pombe. Cell. 2008;132(6):983–95.
Schmidt CK, Brookes N, Uhlmann F. Conserved features of cohesin binding along fission yeast chromosomes. Genome Biol. 2009;10(5):1–16.
Lengronne A, Katou Y, Mori S, Yokobayashi S, Kelly GP, Itoh T, Watanabe Y, Shirahige K, Uhlmann F. Cohesin relocation from sites of chromosomal loading to places of convergent transcription. Nature. 2004;430(6999):573–8.
Glynn EF, Megee PC, Yu H-G, Mistrot C, Unal E, Koshland DE, DeRisi JL, Gerton JL. Genome-wide mapping of the cohesin complex in the yeast Saccharomyces cerevisiae. PLoS Biol. 2004;2(9):e259.
Ocampo-Hafalla M, Munõz S, Samora CP, Uhlmann F. Evidence for cohesin sliding along budding yeast chromosomes. Open Biol. 2016;6(6):150178.
Jeppsson K, Sakata T, Nakato R, Milanova S, Shirahige K, Björkegren C. Cohesin-dependent chromosome loop extrusion is limited by transcription and stalled replication forks. Sci Adv. 2022;8(23):eabn7063.
Engel SR, Wong ED, Nash RS, Aleksander S, Alexander M, Douglass E, Karra K, Miyasato SR, Simison M, Skrzypek MS, Weng S, Cherry JM. New data and collaborations at the Saccharomyces Genome Database: updated reference genome, alleles, and the Alliance of Genome Resources. Genetics. 2022;220(4):iyab224.
Rutherford KM, Lera-Ramírez M, Wood V. PomBase: a Global Core Biodata Resource-growth, collaboration, and sustainability. Genetics. 2024;227(1):iyae007.
Ciosk R, Shirayama M, Shevchenko A, Tanaka T, Toth A, Shevchenko A, Nasmyth K. Cohesin’s binding to chromosomes depends on a separate complex consisting of Scc2 and Scc4 proteins. Mol Cell. 2000;5(2):243–54.
Takahashi TS, Yiu P, Chou MF, Gygi S, Walter JC. Recruitment of Xenopus Scc2 and cohesin to chromatin requires the pre-replication complex. Nat Cell Biol. 2004;6(10):991–6.
Makrantoni V, Marston A. Cohesin and chromosome segregation. Curr Biol. 2018;28:R688–93.
Schmidt CK, Brookes N, Uhlmann F. Conserved features of cohesin binding along fission yeast chromosomes. Datasets. Gene Expr Omnibus. 2008. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE13517. Accessed 7 Aug 2024.
Peters J-M, Tedeschi A, Schmitz J. The cohesin complex and its roles in chromosome biology. Genes Dev. 2008;22(22):3089–114.
Chapard C, Jones R, van Oepen T, Scheinost JC, Nasmyth K. Sister DNA entrapment between juxtaposed SMC heads and kleisin of the cohesin complex. Datasets. Gene Expr Omnibus. 2018. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120138. Accessed 7 Aug 2024.
Mattingly M, Seidel C, Munöz S, Hao Y, Zhang Y, Wen Z, Florens L, Uhlmann F, Gerton JL. Mediator recruits the cohesin loader Scc2 to RNA Pol II-transcribed genes and promotes sister chromatid cohesion. Datasets. Gene Expr Omnibus. 2022. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE201193. Accessed 7 Aug 2024.
Bastié N, Chapard C, Dauban L, Gadal O, Beckouet F, Koszul R. Smc3 acetylation, Pds5 and Scc2 control the translocase activity that establishes cohesin-dependent chromatin loops. Nat Struct Mol Biol. 2022;29(6):575–85.
Garcia P, Fernandez-Hernandez R, Cuadrado A, Coca I, Gomez A, Maqueda M, Latorre-Pellicer A, Puisac B, Ramos FJ, Sandoval J, Esteller M, Mosquera JL, Rodriguez J, Pié J, Losada A, Queralt E. Disruption of NIPBL/Scc2 in Cornelia de Lange Syndrome provokes cohesin genome-wide redistribution with an impact in the transcriptome. Datasets. Gene Expr Omnibus. 2020. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE145966. Accessed 7 Aug 2024.
Brandão HB, Paul P, van den Berg AA, Mirny LA. RNA polymerases as moving barriers to condensin loop extrusion. Proc Nat Acad Sci. 2019;116:20489–99.
Kakui Y, Rabinowitz A, Barry DJ, Uhlmann F. Condensin-mediated remodeling of the mitotic chromatin landscape in fission yeast. Nat Genet. 2017;49(10):1553–7.
Paul MR, Markowitz TE, Hochwagen A, Ercan S. Condensin depletion causes genome decompaction without altering the level of global gene expression in Saccharomyces cerevisiae. Genetics. 2018;210(1):331–44.
Kakui Y, Barrington C, Barry DJ, Gerguri T, Fu X, Bates PA, Khatri BS, Uhlmann F. Fission yeast condensin contributes to interphase chromatin organization and prevents transcription-coupled DNA damage. Genome Biol. 2020;21(1):1–25.
Pradhan B, Kanno T, Umeda Igarashi M, Loke MS, Baaske MD, Wong JSK, Jeppsson K, Björkegren C, Kim E. The Smc5/6 complex is a DNA loop-extruding motor. Nature. 2023;616(7958):843–8.
Busslinger GA, Stocsits RR, Van Der Lelij P, Axelsson E, Tedeschi A, Galjart N, Peters J-M. Cohesin is positioned in mammalian genomes by transcription, CTCF and Wapl. Nature. 2017;544(7651):503–7.
Banigan EJ, Tang W, van den Berg AA, Stocsits RR, Wutz G, Brandão HB, Busslinger GA, Peters J-M, Mirny LA. Transcription shapes 3D chromatin organization by interacting with loop extrusion. Proc Natl Acad Sci. 2023;120(11):e2210480120.
Davidson IF, Goetz D, Zaczek MP, Molodtsov MI, Huis in’t Veld PJ, Weissmann F, Litos G, Cisneros DA, Ocampo-Hafalla M, Ladurner R, et al. Rapid movement and transcriptional re-localization of human cohesin on DNA. EMBO J. 2016;35(24):2671–85.
DiMarzio EA. Proper accounting of conformations of a polymer near a surface. J Chem Phys. 1965;42(6):2101–6.
Casassa EF. Equilibrium distribution of flexible polymer chains between a macroscopic solution phase and small voids. J Polym Sci B Polym Lett. 1967;5(9):773–8.
Allegra G, Colombo E. Polymer chain between attractive walls: a second-order transition. J Chem Phys. 1993;98(9):7398–404.
Mizuguchi T, Barrowman J, Grewal SI. Chromosome domain architecture and dynamic organization of the fission yeast genome. FEBS Lett. 2015;589(20):2975–86.
Schreiner SM, Koo PK, Zhao Y, Mochrie SG, King MC. The tethering of chromatin to the nuclear envelope supports nuclear mechanics. Nat Commun. 2015;6(1):7159.
Banigan EJ, van den Berg AA, Brandão HB, Marko JF, Mirny LA. Chromosome organization by one-sided and two-sided loop extrusion. Elife. 2020;9:e53558.
Bailey MLP, Surovtsev I, Williams JF, Yan H, Yuan T, Li K, Duseau K, Mochrie SGJ, King MC. Loops and the activity of loop extrusion factors constrain chromatin dynamics. Mol Biol Cell. 2023;34(8):ar78.
Yuan T, Yan H, Bailey MLP, Williams JF, Surovtsev I, King MC, Mochrie SG. Effect of loops on the mean-square displacement of Rouse-model chromatin. Phys Rev E. 2024;109(4):044502.
Review Commons Report 1. Early Evidence Base. 2024. https://doiorg.publicaciones.saludcastillayleon.es/10.15252/rc.2024971669.
Review Commons Report 2. Early Evidence Base. 2024. https://doiorg.publicaciones.saludcastillayleon.es/10.15252/rc.2024860940.
Review Commons Authors’ Response. Early Evidence Base. 2024. https://doiorg.publicaciones.saludcastillayleon.es/10.15252/rc.2024133506.
Yuan T, Yan H. CCLE. GitHub. 2024. https://github.com/bigpaul97/CCLE. Accessed 7 Aug 2024.
Yuan T, Yan H. Conserved-current loop extrusion (CCLE). Zenodo. 2024. https://doiorg.publicaciones.saludcastillayleon.es/10.5281/zenodo.13901225.
Review history
This article was first peer-reviewed at Review Commons and reviewer reports and point-by-point response are available online [105,106,107]. The rest of the review history containing the authors’ responses and additional reviewer comments are available as Additional File 3.
Peer review information
Veronique van den Berghe was the primary editor of this article at Genome Biology and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Funding
This research was supported by NSF EFRI CEE Award EFMA-1830904, NSF PoLS 2412859, and Yale’s Integrated Graduate Program in Physical and Engineering Biology (PEB).
Author information
Authors and Affiliations
Contributions
T.Y. designed the research, developed theory, organized data, performed data analysis, and wrote the simulation code and the paper; H.Y. designed the research and wrote the simulation code and the paper; K.C.L. designed the research and wrote the paper; I.S. designed the research and wrote the paper; M.C.K. designed the research and wrote the paper; S.G.J.M. designed the research, developed theory, and wrote the paper.
Corresponding authors
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yuan, T., Yan, H., Li, K.C. et al. Cohesin distribution alone predicts chromatin organization in yeast via conserved-current loop extrusion. Genome Biol 25, 293 (2024). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13059-024-03432-2
Received:
Accepted:
Published:
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13059-024-03432-2