- Research
- Open access
Rewiring of SINE-MIR enhancer topology and Esrrb modulation in expanded and naive pluripotency
Genome Biology volume 26, Article number: 107 (2025)
Abstract
Background
The interplay between 3D genomic structure and transposable elements (TEs) in regulating cell state-specific gene expression programs is largely unknown. Here, we explore the utilization of TE-derived enhancers in naïve and expanded pluripotent states by integrative analysis of genome-wide Hi-C-defined enhancer interactions, H3K27ac HiChIP profiling, and the CRISPR-guided TE proteomics landscape.
Results
We find that short interspersed nuclear elements (SINEs) are the TEs most involved in active chromatin and 3D genome architecture. In particular, mammalian-wide interspersed repeat (MIR), a SINE family member, is more strongly associated with genomic interactions in the naïve state than in the expanded state. In the naïve pluripotent state, MIR enhancers are co-opted by ESRRB for the naïve-specific gene expression program. This interaction between ESRRB and MIR enhancers is crucial for the formation of loops that build a network of enhancers and super-enhancers regulating pluripotency genes. We demonstrate that loss of an ESRRB-bound MIR enhancer impairs self-renewal. We also find that MIR is co-bound by a structural protein complex, ESRRB-YY1, in the naïve pluripotent state.
Conclusions
Altogether, our study highlights the topological regulation of MIR by ESRRB in the naïve potency state.
Background
The spatial organization of the genome has been suggested to contribute an additional layer of epigenetic control to gene regulation and cell fate decisions [1, 2]. The importance of this mode of regulation in early embryonic cells is well established [3, 4, 5]. The development of highly sensitive Hi-C methods allows the 3D genome organization of early embryos to be probed. During early mouse embryonic development, 3D genome structuring into topologically associated domains (TADs) and loops occurs gradually [6, 7]. Inspection at the allelic level showed that this structuring was parental-specific and associated with the repressive H3K27me3 histone mark prior to the 64-cell stage [8, 9]. Formation of such structures is affected by DNA replication [6] and loop extrusion by cohesin [10]. Despite advances in highly sensitive and low-input methodologies, experimental perturbations are typically still hindered by the low cell number of early embryonic stages, and in vitro cell models are therefore useful.
Mouse embryonic stem cells (ESCs) are derived from the inner cell mass of the blastocyst [11] and have long served as an in vitro model of the early embryo. In mouse ESCs, a spectrum of pluripotency states exists along the naïve-primed axis [12]. The conventional serum and leukemia inhibitory factor (LIF) culture condition maintains mouse ESCs in a naïve state, while the serum-free condition with 2 small-molecule kinase inhibitors (2i) results in a ground pluripotency state [13]; cells in both states are capable of embryonic but not extraembryonic contribution. Expanded potential stem cells (EPSC), on the other hand, maintain extraembryonic potential in vitro [14].
While the gene regulatory network of the naïve state has been well studied [15, 16, 17], the nonlinear control of gene programs through enhancer–promoter interactions is still being worked out. Some studies have compared the 3D genome between the naïve and ground states. For example, one study found that polycomb-associated promoter interactions were depleted in ground state mESCs but were re-established when the cells were transitioned to the conventional naïve state [18]. Another study identified TEAD2 as a regulator of ground state mESC genes that acts by mediating enhancer-promoter looping [19]. Comparison between ESCs and totipotent-like 2-cell-like cells (2CLC) revealed a more relaxed chromatin conformation in the latter state, and disruption of the chromatin organization factors CTCF and cohesin promotes 2CLC emergence [20]. These findings demonstrate that global genome conformation and loop-mediated interactions play a crucial role in governing transitions into and out of specific cell states.
Furthermore, the role of transposable elements (TEs) in mediating genome regulation in 3D space is only beginning to emerge [21, 22]. TEs are mobile genetic sequences with rich potential to rewire gene regulatory systems [23, 24]. In mESCs, TEs are subject to complex epigenetic regulation [25, 26]. In recent years, several studies have reported the co-option of TEs as functional enhancers [27, 28, 29, 30, 31] in ESCs, indicating that TE-enhancers play an important role during early development. Enhancer modulation by the combinatorial action of transcription factors (TFs) allows cell-specific gene expression programs, thus conferring cell identity [32, 33, 34]. To the best of our knowledge, de novo discovery of TE regulators at the genomic level has only been reported for LTR7/HERV-H [35]. Other studies have primarily focused on regulators of TE transcripts, such as LINE1 RNA-binding proteins [36, 37]. Consequently, the interplay between TEs and TFs in shaping 3D genome conformation remains largely unknown [38].
In this study, we dissect the role of TE-associated enhancers in the spatially organized genome of distinct embryonic potency states. We uncovered a high level of interactions involving mammalian-wide interspersed repeat (MIR)-enhancers in the naïve pluripotent state compared with the expanded state. MIR is an ancient TE family [39] that belongs to the short interspersed nuclear element (SINE) class of retrotransposons. Several reports have suggested that MIRs have present-day roles [40, 41, 42, 43, 44, 45]. The MIR sequence is highly conserved [46], and its transcription is cell-type specific [47, 48]. Given the cell-type specificity and the persistence of MIR over a lengthy evolutionary timeline, MIR may be a critical regulator of cell identity and could be the foundation of a mammalian-specific regulatory network [49].
We found that the enhancer activity of a subset of MIRs that regulate ESC-specific genes was under the control of estrogen-related receptor beta (Esrrb), a TF critical for pluripotency and self-renewal [50, 51, 52, 53]. Esrrb was previously implicated in mouse early development and reprogramming [54]. It is also a key gene in regulating exit from naïve pluripotency and differentiation [55, 56, 57, 58, 59]. Experimentally, we demonstrated that loss of an ESRRB-bound MIR resulted in downregulation of its target gene and impaired self-renewal. We further demonstrated that YY1, a ubiquitous structural protein [60], partners with ESRRB on MIR to coordinate long-range regulation of target genes. We propose that mobilization of MIR-enhancers by ESRRB-YY1 accounts for the molecular mechanism governing chromatin folding, leading to a naïve-specific gene program.
Results
Distinct cell states exhibit specific 3D genome conformation
The role of 3D genome conformation, especially at TEs, in the expanded stem cell state has been less explored. Therefore, we aimed to investigate which structural features differ between the cell states by performing Hi-C and H3K27ac HiChIP in the naïve and expanded states of mouse embryonic stem cells (Fig. 1A). The EPSC state was derived from ESCs through maintenance in EPSC culture media conditions [14, 61] and characterized by its marker gene expression, namely the upregulation of H19 and downregulation of Axin2 (Fig. S1A). We also confirmed that AXIN1 protein, whose overexpression in ESCs can drive EPSC-like features [14], is up to fourfold upregulated in our EPSCs (Fig. S1B, Fig. S1C). We then subjected the EPSCs to a blastoid formation protocol and verified that they can form blastoids (Fig. S1D), signifying their embryonic and extraembryonic competence. Immunofluorescence staining of the EPSC-derived blastoids showed CDX2 expression in the outer, trophectoderm-like layer of the blastoid and OCT4 expression in the inner cell mass (Fig. S1E).
Fig. 1 Differential 3D genome landscape profile of ESCs versus EPSCs. A Schematic illustration of the in vitro models used for studying the 3D genome of distinct pluripotent states: expanded pluripotent stem cells (EPSC) and naïve embryonic stem cells (ESC). TE-centric genome analysis was performed using 2 strategies: first, combined analysis of genome-wide Hi-C with enhancer-mark H3K27ac ChIP-seq, and second, enhancer-centric H3K27ac HiChIP. The genome looping surrounding TE and accessible regions defined by ATAC-seq were considered. B Differential compartments identified from pairwise comparison between ESC and EPSC. The colors represent six different types of compartment transitions. Within compartment A, we distinguish between high-strength to low-strength (dark red, n = 108) and low-strength to high-strength (light red, n = 110) transitions. There are 44 transitions from compartment A-to-B, and 923 transitions from compartment B-to-A. Within compartment B, we distinguish between high-strength to low-strength (light green, n = 26) and low-strength to high-strength (dark green, n = 87) transitions. C Expression fold change of genes undergoing compartment switch. D Number of differential boundaries identified through pairwise comparisons using the TADCompare package. The differential boundaries were classified into 5 categories. E Scatterplot of EPSC up- (red dots) and downregulated (blue dots) genes located within differential TADs. Several examples of EPSC-upregulated genes are labelled. The number of upregulated genes located within each category of differential TAD is shown in the bar chart. F Hi-C heatmap showing TAD reorganization around the Tfap2a gene locus. G Pile-up analyses of significantly identified chromatin loops using Hi-C data of ESC and EPSC. The number of chromatin loops for each potency state is labelled. H Number of differentially expressed genes enriched in the anchors of specific or consistent chromatin loops. I H3K27ac binding signal and Hi-C interactions around the EPSC-specific gene H19. J Boxplot showing the decrease of contacts at the ESC-loops in EPSC. The signal was quantified using H3K27ac HiChIP. K H3K27ac binding signal and HiChIP interactions around the EPSC-downregulated gene Axin2
After confirming the EPSC state, we first performed Hi-C and compared the EPSC with the ESC. We determined that the quality of our EPSC and ESC Hi-C data was similar in terms of unmapped reads, PCR duplicates, self-ligation, intra-contacts, and inter-contacts (Fig. S1F). Principal component analysis (PCA) of the Hi-C samples showed that the 3D genome conformation of EPSC clustered closer to the 8-cell stage, while that of ESC clustered closer to the ICM (Fig. S1G). We next analyzed the changes in the genome folding of EPSC in comparison to the ESC state. It has been proposed that the genome is compartmentalized into regions comprising more active (A compartment) or inactive (B compartment) genes. Here, we observed a major B-to-A compartment switch in ESC versus EPSC (Fig. 1B), in accordance with the report that cells of higher developmental potency are associated with a more relaxed conformation [20]. The expression of genes within the compartments undergoing a B-to-A switch showed an upregulation trend, while A-to-B compartment genes showed a slight downregulation (Fig. 1C). This suggested that compartment switching occurs in a cell transitioning into a different state and that this change, especially the B-to-A switch, predicts the gene expression change. GO-term enrichment analysis of the B-to-A gene set revealed an enrichment of genes related to bile acid and lipid metabolism, for example Carboxylesterase (Ces) and ATP-binding cassette (ABC) family members (Fig. S1H). Cyp2r1, a vitamin D hydroxylase, was also found in the top-enriched GO-term. Interestingly, a study revealed that vitamin D treatment together with inhibition of the histone methyltransferase MLL1 enhances the extraembryonic differentiation capacity of ESCs [62]. Additionally, activation of a nuclear receptor involved in lipid metabolism in ESCs can enhance their potency, enabling blastoid formation [63]. Therefore, our compartment analysis supports the idea that these metabolic genes may become activated in EPSCs and contribute to their expanded potential.
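The six transition classes described for Fig. 1B can be illustrated with a minimal sketch. It assumes that each genomic bin carries a signed compartment score (for example, the first principal component of the Hi-C correlation matrix), with positive values marking compartment A and non-positive values marking compartment B. The sign convention, the function names, and the `min_delta` threshold are illustrative assumptions, not the parameters used in this study.

```python
def classify_transition(score_before, score_after, min_delta=0.2):
    """Label one genomic bin by how its compartment score changes.

    Assumptions (hypothetical, for illustration only):
    - positive score = compartment A, non-positive score = compartment B
    - |score| is used as a proxy for compartment strength
    - min_delta is an arbitrary threshold for calling a strength change
    """
    before_is_a = score_before > 0
    after_is_a = score_after > 0

    # A sign flip is a compartment switch.
    if before_is_a and not after_is_a:
        return "A-to-B"
    if not before_is_a and after_is_a:
        return "B-to-A"

    # Same compartment in both states: compare strength.
    delta = abs(score_after) - abs(score_before)
    compartment = "A" if before_is_a else "B"
    if delta >= min_delta:
        return f"{compartment} low-to-high strength"
    if delta <= -min_delta:
        return f"{compartment} high-to-low strength"
    return "unchanged"


def count_transitions(scores_before, scores_after, min_delta=0.2):
    """Tally transition labels across paired per-bin scores."""
    counts = {}
    for before, after in zip(scores_before, scores_after):
        label = classify_transition(before, after, min_delta)
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Calling `count_transitions` on per-bin scores from the two states gives a tally comparable in form to the transition counts reported in the figure legend, although the actual numbers depend on how compartment strength was defined in the original analysis.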
We then examined the topologically associated domains (TADs) and found that although the two states shared many common TADs, state-specific TADs were also formed (Fig. S1I). Changes in TAD boundaries can be classified into 5 types (complex, merge, shifted, split, and strength change). Complex, split, and merge boundary changes are considered major changes, as they are the most disruptive, while shifted and strength changes are considered minor changes [64]. Analysis of differential TAD boundaries between the EPSC and ESC states showed that, among the 5 types of boundary changes, complex TAD changes occurred most often (Fig. 1D). To determine whether these differential TADs may be consequential for state identity, we identified the differentially expressed genes between ESC and EPSC and found that a significant number of them were located within TADs that underwent complex or split changes (Fig. 1E). This suggested that EPSC-specific genes could be under the control of TAD-level gene regulation. One such example may be the Tfap2a gene, which was constrained within a TAD in EPSC but not in ESC (Fig. 1F).
Next, we inspected looping differences between ESC and EPSC. Since loop numbers depend on sequencing depth, we first ensured that the sequencing depth was comparable across all libraries (Table S1). At the loop level, we detected a slightly higher interaction strength in ESC versus EPSC (Fig. 1G) and found that the loops displayed state specificity (Fig. S1J). The number of differentially regulated genes was almost twofold higher in the state-specific loops than in the consistent loops, suggesting that the state-specific loops regulate the expression of more state-specific genes (Fig. 1H). Indeed, we observed more complex looping surrounding H19, the marker gene for EPSC (Fig. 1I). To further investigate the enhancer-specific landscape at loop resolution, we performed H3K27ac HiChIP. We found slightly higher H3K27ac HiChIP looping in the ESC state (Fig. 1J). Interestingly, the enhancer looping surrounding Axin2, a downregulated gene in EPSC, was diminished in EPSC (Fig. 1K).
Taken together, our 3D genome profiling highlighted the distinct genome conformation, at the compartment, TAD, and loop level, in the naïve versus expanded state.
SINE-enhancers were involved in cell state identity
We next asked whether transposable elements are involved in the genome conformation of each potency state. First, we profiled the TE families that are enriched in the higher-order genome structures (compartments, TADs, and loops), accessible regions, and regions with the H3K27ac enhancer histone mark (Fig. S2A) and found that SINEs (B1/Alu, B2, B4, MIR) were largely accessible, displayed the enhancer mark, and were heavily involved in chromatin loop formation. This was in contrast to LINE1 (L1), which was poorly accessible and enriched in the B compartment, in agreement with a published report [65]. Since cell identity is predominantly governed by enhancer activity, we further constrained the analysis to enrichment at enhancer loop anchors. To this end, our first strategy was to jointly analyze Hi-C and H3K27ac ChIP-seq data to define enhancer loop anchors (Fig. 1A). We found that SINE families and several ERV families were up to fourfold enriched in the enhancer loop anchors of the naïve (ESC) pluripotent state (Fig. 2A). SINE B1/Alu and MIR were the TE families with the highest enhancer loop involvement in ESC. Both TE families were decorated with the H3K27ac enhancer histone mark, whereas MIR was more accessible than B1/Alu in ESC (Fig. 2B). These ESC-MIR-enhancer loop anchors were less accessible and carried less of the H3K27ac mark in the EPSC (Fig. 2C). We noted that the number of H3K27ac-positive MIRs was lower in EPSC than in ESC (417 versus 1053). Furthermore, interactions at ESC-MIR-enhancer loop anchors were less frequent in EPSC, indicating diminished importance of these loops in the expanded potency state (Fig. 2D).
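The first strategy, in which enhancer loop anchors are Hi-C loop anchors that carry an H3K27ac ChIP-seq peak, amounts to an interval-overlap test. The sketch below shows the idea under simple assumptions: intervals are half-open `(chrom, start, end)` tuples, any overlap of at least 1 bp counts, and all function names and coordinates are hypothetical. It is a quadratic-time toy, whereas a real analysis would normally rely on a dedicated interval tool.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def enhancer_loop_anchors(loop_anchors, h3k27ac_peaks):
    """Keep the loop anchors that overlap at least one H3K27ac peak."""
    return [anchor for anchor in loop_anchors
            if any(overlaps(anchor, peak) for peak in h3k27ac_peaks)]


def te_fraction(anchors, te_intervals):
    """Fraction of anchors overlapping at least one copy of a TE family."""
    if not anchors:
        return 0.0
    hits = sum(1 for anchor in anchors
               if any(overlaps(anchor, te) for te in te_intervals))
    return hits / len(anchors)


# Toy coordinates, made up for illustration only.
anchors = [("chr1", 1000, 6000), ("chr1", 20000, 25000), ("chr2", 500, 5500)]
peaks = [("chr1", 5500, 5900), ("chr2", 9000, 9400)]
mir_copies = [("chr1", 1200, 1400)]

enhancer_anchors = enhancer_loop_anchors(anchors, peaks)
print(enhancer_anchors)
print(te_fraction(enhancer_anchors, mir_copies))
```

Comparing `te_fraction` for the observed anchors against the same fraction for a background set is one simple way to express a fold enrichment; the background model used in the study is described in its Methods and is not reproduced here.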
Fig. 2 Comparative analysis of TE-derived 3D genome configuration in the in vitro model of early embryonic cells revealed higher SINE-looping in the naïve state. A TE family enrichment in enhancer loop anchors of ESC and EPSC. Enhancer loop anchors were defined by the presence of H3K27ac ChIP-seq peaks at Hi-C loop anchors. Color represents fold enrichment (see “Methods”). *, FDR < 0.05; **, FDR < 0.01; ***, FDR < 0.001. B Metaplot of H3K27ac and ATAC-seq signal over Alu- and MIR-enhancer loop anchors in ESC. The numbers of Alu-derived and MIR-derived enhancer loop anchors were 1254 and 175, respectively. C Comparison of ATAC-seq and H3K27ac ChIP-seq signal on MIR-enhancer loop anchors in ESC and EPSC. D Pileup Hi-C maps at ESC MIR-enhancer loops showed reduced interactions in the EPSC state. E TE-enhancer enrichment within state-specific loop domains. Only significantly enriched TE families with p-value < 0.01, fold change (FC) of enrichment > 1, and number of involved specific loops > 100 are shown. Colour of the bar indicates the enrichment significance. Chi-squared test was used to determine the p-value (see “Methods”). F Enriched GO-terms for genes involved in ESC-specific MIR-loops
Aside from point-to-point contact, communication between enhancers and other regions may occur when they are confined within a domain. To test this notion, we identified TE families with significant enrichment in state-specific Hi-C loop domains (Fig. 2E, Table S2). The result revealed that more TE families, including MIR, were involved in the ESC-specific loop domains. Interestingly, genes near the ESC-MIR loops were involved in the regulation of developmental processes (Fig. 2F).
To confirm our findings, we employed H3K27ac HiChIP as a second approach (Fig. 1A). By employing this enhancer-targeted strategy, we aimed to avoid overlooking enhancer-promoter interactions because of insufficient Hi-C resolution. In agreement with the first analysis, we observed SINE family (Alu/B1, B2, MIR, B4) enrichment in the H3K27ac HiChIP loop anchors (Fig. S2B). Among these SINEs, MIR was the most enriched in the H3K27ac-mediated enhancer loops in ESC compared to EPSC (Fig. S2C). Consistent with our previous finding, the ESC-MIR-enhancer loop anchors, this time defined by H3K27ac HiChIP, were less accessible and carried less of the H3K27ac mark in the EPSC (Fig. S2D).
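The enrichment of a TE family in state-specific loop domains was assessed with a chi-squared test. As a rough sketch of how such a test works, the code below computes a Pearson chi-squared statistic and p-value for a 2x2 table. The table layout (loop domains with or without a TE-enhancer of the family, in the state-specific set versus a comparison set), the absence of a continuity correction, the function names, and the counts are all assumptions made for illustration; the contingency table actually used is defined in the study's Methods.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (1 degree of freedom, no continuity
    correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-squared with 1 d.o.f.:
    # P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def fold_enrichment(a, b, c, d):
    """Hit fraction in the first row divided by that in the second row.

    Assumes the second row contains at least one hit (c > 0).
    """
    return (a / (a + b)) / (c / (c + d))


# Made-up counts: 30 of 100 state-specific domains contain the TE-enhancer,
# versus 10 of 100 comparison domains.
stat, p = chi_squared_2x2(30, 70, 10, 90)
print(stat, p, fold_enrichment(30, 70, 10, 90))
```

Filtering on the p-value, the fold change, and the number of loops involved, as described in the figure legend, would then be applied per TE family.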
Together, our data suggested that SINE-enhancers are topologically wired differently and have a more prominent involvement in the naïve pluripotent state compared to the expanded state.
Esrrb is a candidate regulator of MIR-enhancers
Next, we sought to identify the TFs that can architecturally regulate TE-enhancers in ESC, with a focus on MIR. We did so by analyzing the TF motifs enriched in enhancers and enhancer-loop anchors, with and without TEs, while considering TF gene expression. Incorporating Hi-C loop anchor information helped us to predict whether a TF with an enriched motif was involved in shaping genome topology. At enhancers and TE-enhancers, the motifs of several members of the Klf and nuclear receptor families were enriched (Fig. S3A). For enhancer-loop anchors involving MIR, Esrrb appeared as the top-enriched motif, followed by the closely related nuclear receptor Nr5a2 (Fig. 3A). This motif enrichment was higher than in enhancers containing other TE families such as B2, ERVK, and ERV1, with the exception of B1/Alu (Fig. 3B). Moreover, motif density analysis on MIR-enhancer sequences showed that the Esrrb motif is positioned at the center, specifically at 86–97 bp of the MIR consensus sequence (Fig. 3C), which falls within the highly conserved 65 bp core-SINE [66] region (Fig. S3B). This indicated that the Esrrb motif has been conserved through evolutionary selection, implying a biologically functional role. We confirmed this by analyzing the motifs enriched in the H3K27ac HiChIP loop anchors involving MIR. As expected, the nuclear receptor family appeared as the top GO term (Fig. S3C), with the Esrrb motif ranked second after Zfp768, whose human ortholog ZNF768 was reportedly recruited to MIR for cell-type-specific gene expression [40] (Fig. 3D).
Fig. 3 Motif enrichment analysis and MIR-wide proteome profiling uncovered Esrrb as a candidate MIR regulator. A TF motif enrichment on MIR-enhancer loop anchors. Size and colour indicate the enrichment significance and expression level in ESC. B The significance of Esrrb motif enrichment on MIR-enhancers compared to other TE-enhancers and poised enhancers. Significance is expressed as −log10(adjusted p-value). The p-value was computed using a binomial test. Enhancers were initially defined by ATAC-seq and histone mark ChIP-seq signals. C Histogram of Esrrb motif density on MIR-enhancers from the center to the ± 1 kb region. D Motif enrichment comparison between ESC and EPSC for MIRs located in the H3K27ac HiChIP loop anchors. The p-value was computed using a binomial test. E Schematic of dCas9-mediated protein capture using MIR-targeting sgRNAs. Proteomic analysis was performed to identify MIR-associated proteins. F Scatterplot of average abundance ratio for significantly identified MIR-associated proteins using MIR-targeting sgRNA (abundance ratio > 1.5). Selected proteins are labelled. G Gene Ontology terms of significantly identified MIR-associated proteins. The significance is represented as −log10(p-value). H Protein–protein interaction network of ESC-specific MIR-associated proteins compared to RA. Selected proteins are labelled. Edges represent the interaction information obtained from the STRING database. I Overlap of MIR-enriched proteins with the gene motifs enriched in MIRs involved in enhancer loop anchors
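The motif enrichment p-values in this figure were computed with a binomial test. A minimal sketch of a one-sided (upper-tail) binomial test is given below: given `n` target sequences of which `k` contain the motif, and a background motif frequency `p`, it returns the probability of observing at least `k` hits by chance. The one-sided form, the function name, and the example numbers are assumptions for illustration and are not taken from the study's pipeline.

```python
import math


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    total = 0.0
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log(1 - p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


# Hypothetical example: 40 of 175 target sequences contain the motif,
# against an assumed background frequency of 10%.
p_value = binomial_upper_tail(40, 175, 0.10)
print(p_value)
```

When many motifs are tested, the resulting p-values would additionally need multiple-testing adjustment, consistent with the adjusted p-values shown in panel B.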
Next, we utilized CAPTURE2.0 [67], a CRISPR-based affinity purification method, to profile the MIR-specific proteomics landscape in ESC (Fig. 3E). First, we generated a clonal line with transgene integration and constitutive expression of dCas9 with biotin-tag receptor, as confirmed by gDNA-PCR, IF staining, and western blot (Fig. S3D, Fig. S3E, Fig. S3F). As a verification of cell line functionality, we were able to detect dCas9 binding at the targeted Oct4 genomic region following the addition of Oct4 promoter (sgOct4-PP) and enhancer (sgOct4-PE)-specific sgRNAs (Fig. S3G). Afterward, we established dCas9-biotin-positive cell lines expressing broad targeting MIR sgRNA (MIR sg22 and MIR sg23) and control (sgGal4) (Fig. S3H). MIR sg22 and MIR sg23 were designed based on the MIR consensus sequence (see “Methods”). We mapped the genome-wide binding sites and distribution of the dCas9-sgMIR complex using Cas9 ChIP-seq (Fig. S3I, Fig. S3J). ChIP-seq confirmed that the sgRNAs successfully targeted and enriched for binding on MIR sites (Fig. S3K).
Since we aimed to find MIR-enriched factors that support the ESC state, we employed a retinoic acid (RA)-based differentiation assay and contrasted the MIR-proteome landscape of ESC with that of RA-induced differentiated cells (Fig. S3L). RA is widely used to induce neural lineage differentiation from pluripotent stem cells [68]. In our experiment, addition of RA for 48 h decreased Oct4 gene expression and increased the expression of the neural lineage markers Pax6 and Gbx2 (Fig. S3M). We identified a total of 358 MIR-enriched proteins that are ESC-specific (Fig. S3N). Known pluripotency factors (e.g., ESRRB, NR0B1, KLF5, SOX2, and OCT4) were enriched in the ESC-specific group, while epigenetic regulators (e.g., ARID1A, MBD3, ACTL6A, YY1) and RNA polymerase subunits (POLR2A, POLR2B) were found in both (Fig. S3O). We identified proteins that were consistently enriched with both MIR-targeting sgRNAs (Fig. 3F, Table S3) and found that these proteins were involved in mechanisms associated with chromatin organization and pluripotency (Fig. 3G). We validated that some of these factors were indeed enriched on ESRRB-bound MIRs based on publicly available ChIP-seq data (Fig. S3P). The protein–protein interaction network showed a cluster of core pluripotency factors (POU5F1, SOX2, ESRRB, NR0B1) enriched on MIR in the ESC state, but not in the differentiated state (Fig. 3H). Therefore, we uncovered a myriad of proteins with potential regulatory action on MIR and consequential effects on accentuating the ESC-specific state. Among these, Esrrb appeared in the overlap of MIR-enhancer enriched motifs and MIR-enriched proteins (Fig. 3I). This puts Esrrb forward as a prime candidate MIR regulator in ESC.
ESRRB displayed lesser MIR occupancy in the expanded state
Next, we aimed to explore whether ESRRB regulates MIR and whether it does so differentially in the naïve versus expanded states. We first confirmed that Esrrb is expressed at similar mRNA and protein levels in the 2 cell states (Fig. 4A, B). With regard to its binding profile, ESRRB ChIP-seq revealed that 65.9% of ESRRB peaks were in TE regions in ESC, but this number fell to 54.1% in EPSC (Fig. 4C). In EPSC, ESRRB occupancy was reduced on SINEs (B1/Alu, MIR, B2) but not on ERV1 and ERVK (Fig. 4D). While the ESRRB binding probability on MIR was high in both states, the frequency was much reduced in EPSC (Fig. 4E). On MIRs with active enhancer signatures, ESRRB binding was enriched in ESC but not in EPSC (Fig. 4F). Phenotypically, loss of Esrrb in ESC resulted in loss of alkaline phosphatase signal and a differentiated morphology, while no noticeable difference was observed in EPSC (Fig. 4G).
Fig. 4 ESRRB displayed stronger MIR occupancy in ESC. A, B Measurement of Esrrb expression at the mRNA and protein level, respectively. An unpaired t-test was used to test for significance. C Percentage of ESRRB binding on TE and non-TE regions based on ChIP-seq. D Proportion of ESRRB peaks that fell into selected TE family regions. E Scatterplot showing the enrichment and number of observed copies for TE families in ESRRB peaks. TEs are colored by TE class. F Average profile of ESRRB ChIP-seq binding signal on MIR-derived enhancers. The signal in a ± 3-kb window flanking the MIR center is shown. G Phenotypic consequence of Esrrb loss, depicted by changes in colony morphology and loss of alkaline phosphatase signal
Altogether, these observations suggest a diminished significance of Esrrb in the expanded state compared to the naïve state. Considering the initial finding that MIR displayed reduced accessibility, enhancer signature, and loop involvement in EPSC versus ESC, it is possible that gain of ESRRB occupancy on MIR in the ESC state is contributing to ESC maintenance. Therefore, we focused our next effort on investigating ESRRB’s involvement in regulating MIR-enhancers in ESC state.
Loss of ESRRB abrogates enhancer and super-enhancer looping in ESC
To demonstrate that ESRRB plays a crucial role in maintaining ESC-specific identity via the formation of enhancer loops, we next performed H3K27ac HiChIP following Esrrb knockdown (Fig. 5A). We observed a phenotypic change in which cells adopted a differentiated morphology (Fig. S4A), and confirmed the successful depletion of Esrrb at both the transcript and protein level (Fig. S4B, Fig. S4C). H3K27ac HiChIP analysis of the Esrrb knockdown cells showed a drastic reduction of overall H3K27ac loops (Fig. S4D, Table S4), of H3K27ac loops that overlapped with TE-enhancer loops (Fig. S4E), and of H3K27ac loops at ESRRB-bound MIRs (Fig. 5B). Notably, the loss of H3K27ac loops around MIR was significant compared to random regions (Fig. S4F). Genes that interact with ESRRB-bound MIRs and lost H3K27ac loops were associated with pluripotency mechanisms (Fig. 5C); examples include Klf5, Trim25, and Esrrb (Fig. 5D). Through RNA-seq, we confirmed that these genes with lost H3K27ac loops were downregulated (Fig. S4G, Fig. S4H), indicating that the ESRRB-mediated looping has regulatory functions.
Fig. 5 Loss of Esrrb abrogates MIR enhancers. A Schematic illustration of the experimental system and strategy used for studying ESRRB-mediated and enhancer-related loops. H3K27ac HiChIP maps were produced to characterize the interactome changes following Esrrb knockdown using an shRNA system (Esrrb sh1 or Esrrb sh2). Empty vector (shCon) was used as control. The effect of H3K27ac loop loss on ESRRB-bound MIR interacting genes was examined. B Number of H3K27ac-mediated chromatin loops at ESRRB-bound MIRs before and after Esrrb knockdown. C GO-term enrichment of genes near ESRRB-bound MIRs with H3K27ac loop loss. D WashU browser visualization of H3K27ac loop loss around ESRRB-bound MIRs that interact with Klf5, Trim25, or Esrrb after Esrrb knockdown. H3K27ac HiChIP signal and loops are shown in the tracks. E Distribution of H3K27ac-mediated chromatin loops at super-enhancers (SEs) in control and Esrrb knockdown conditions. The boxplot summarizes the distribution of the number of loops and displays the median value for each group. F GSEA showing the expression change of SE-associated genes upon Esrrb knockdown in ESCs. For the x-axis, genes were ranked by expression fold change
Since the expression of genes dictating cell identity is under major control of super-enhancers (SEs) [33], we extended our analysis to SEs. We first identified 315 and 207 SEs in ESC and EPSC, respectively (Fig. S4I). In ESC SEs, we observed a major loss of H3 K27ac loops after Esrrb knockdown (Fig. 5E). Furthermore, we analyzed the expression of SE-associated genes after Esrrb knockdown. ESC-SE genes were significantly downregulated, while EPSC-SE genes were less affected (Fig. 5F). This evidence lends further support to the idea that ESRRB may regulate naïve ESC-specific cell identity and assume lesser regulatory importance in EPSC.
CRISPR inhibition of MIR
To determine whether MIR enhancer activity is important for controlling gene expression, we employed the CRISPRi targeting system which recruits repressor proteins and promote repressive histone and DNA methylation [69]. We first demonstrated the efficacy of this system by targeting Oct4 promoter region, resulting in a fourfold downregulation of Oct4 expression (Fig. S5A). To investigate the broad impact of MIR inactivation, we used the previously designed MIR sg22 and MIR sg23. Inactivation of MIRs using these broad targeting sgRNAs reduced ESRRB binding at the ESRRB-bound MIR sites (Fig. S5B). Although MIR inactivation did not result in apparent morphological change (Fig. S5C), genes under regulation of ESRRB-bound MIR were significantly downregulated with the addition of the broad targeting CRISPRi MIR sgRNAs (Fig. S5D). Among the affected genes were those involved in differentiation and embryonic development (Fig. S5E).
Next, we select an ESRRB-bound MIR locus located ~ 50 kb away from Klf5 gene, which we termed as Klf5-MIR, and inactivated it using 2 sgRNAs (Klf5-MIR sg1 and Klf5-MIR sg2) (Fig. 6A). We selected this MIR, because it displayed differential looping between ESC and EPSC (Fig. S5F) and underwent H3 K27ac HiChIP loop remodelling upon Esrrb loss (Fig. 5D). We confirmed by ChIP-qPCR that repressive histone mark H3 K9me3 was increased at Klf5-MIR after targeting this site with CRISPRi machinery (Fig. S5G). As a result of inactivation of the Klf5-MIR, the expression of Klf5 was drastically reduced (Fig. 6B). We also observed a reduction of Nanog expression (Fig. 6C). We further confirmed that ESRRB binding was reduced upon CRISPRi of Klf5-MIR (Fig. 6D). Altogether, these data demonstrated that the Klf5-MIR is a functional MIR enhancer in ESC under the control of ESRRB.
Inactivation or knockout of a MIR enhancer affects target gene expression and cell proliferation. A Schematic of CRISPR-based inactivation of Klf5-MIR. dCas9 fused with KRAB and MeCP2 effectors were guided to Klf5-MIR by 2 sgRNAs (sg1 and sg2). Two primer pairs (EK ChIP1 and EK ChIP2), annealing near the sgRNA-targeted site, were designed for ChIP-qPCR assay. B, C Expression of Klf5 and Nanog after CRISPRi of Klf5-MIR. D ESRRB binding on Klf5-MIR region after CRISPRi of Klf5-MIR, measured by ChIP-qPCR. As comparison, ESRRB-bound region outside Klf5-MIR (Positive control) and non-ESRRB-bound region (Negative control) were shown. E Schematic of CRISPR-based knockout of Klf5-MIR. Three sgRNAs (2 flanking and 1 targets internally) were designed to knockout Klf5-MIR. F Expression of Klf5 and Nanog in Klf5-MIR homozygous and heterozygous KO ESCs. G Cell proliferation of Klf5-MIR homozygous and heterozygous KO ESCs. H Expression of Klf5 and Nanog in Klf5-MIR homozygous and heterozygous KO EPSCs. I Cell proliferation of Klf5-MIR homozygous and heterozygous KO EPSCs
To further investigate whether Klf5-MIR loss has long-term consequences for ESC potency, we deleted the Klf5-MIR genomic region using CRISPR-Cas9. A total of 3 sgRNAs were used to target Klf5-MIR, with 2 of them flanking the Klf5-MIR and 1 targeting it internally (Fig. 6E). We confirmed the genotype of one homozygous and one heterozygous knockout line by PCR (Fig. S5H). RT-qPCR of these clones showed reduction of Klf5 and Nanog gene expression, with the homozygous clone showing a more drastic reduction than the heterozygous KO clone (Fig. 6F). This is in line with our CRISPRi result, indicating that Klf5-MIR is a determinant of its target gene expression. The homozygous KO clone was morphologically more homogeneous and grew in smaller and more compact colonies compared to the WT ESC, while the heterozygous KO clone had an intermediate, mixed phenotype (Fig. S5I). The KO clones proliferated noticeably slower than WT. Quantification of cell proliferation by MTS assay showed that the homozygous KO clone grew at half the speed of wildtype, while the heterozygous KO clone displayed an intermediate growth rate (Fig. 6G).
Next, we derived EPSC from these KO lines, to observe whether the loss of Klf5-MIR affects EPSC (Fig. S5I). In contrast to their ESC counterpart, the loss of Klf5-MIR did not negatively affect the Klf5 expression and less severely affected Nanog expression in EPSC (Fig. 6H). Additionally, no difference in proliferation rate was observed in the EPSC state (Fig. 6I). These data confirmed that the Klf5-MIR does not have the same regulatory effect in the EPSC state. As such, Klf5-MIR served as an example of a MIR with diminished regulatory importance in the expanded state.
ESRRB-YY1 partnership regulates MIR
YY1 has been reported to be a general enhancer-promoter structuring protein [60]. Based on the public YY1 ChIA-PET data [60], a major portion of YY1-mediated loops involved TE regions (Fig. 7A). Furthermore, YY1 binding displayed enrichment on MIR with active enhancer signature and on ESRRB-bound MIR (Fig. 7B, C). Therefore, we sought to explore whether YY1 partners with ESRRB in the regulation of MIR in the ESC state. As with Esrrb, Yy1 loss in ESC leads to differentiation and loss of alkaline phosphatase staining intensity (Fig. S6A). Based on co-immunoprecipitation, we determined that ESRRB is a direct interacting partner of YY1 (Fig. 7D). Sequential ChIP also confirmed that the two factors co-bound at the genomic loci examined, including many MIR sites (Fig. S6B). The genes regulated by ESRRB-YY1 co-bound MIRs were enriched for mechanisms associated with pluripotency (Fig. 7E). Upon ESRRB depletion, we observed reduced YY1 enrichment on the co-binding sites with ESRRB (Fig. S6C, Table S5). On MIR sites bound by YY1 alone or in conjunction with ESRRB, the absence of ESRRB resulted in a reduction of YY1 binding (Fig. 7F, G, Table S5). We measured Yy1 expression by qPCR and determined that Yy1 is not deregulated in the Esrrb shRNA knockdown condition (Fig. S6D). Similarly, the YY1 protein level was not changed (Fig. S6E, F). This indicated that the reduction of YY1 binding on MIR is not due to reduced YY1 availability. We also saw that ESRRB and YY1 binding were positively correlated at the co-bound MIR loci (Fig. 7H). Moreover, there were 524 common downregulated genes between the Esrrb and Yy1 knockdown conditions (Fig. S6G). As expected, the gene set regulated by the ESRRB-YY1 co-bound MIRs was significantly downregulated upon the loss of either factor (Fig. 7I).
On the other hand, the control genes, defined as genes located around 100–200 kb from the ESRRB-YY1 bound MIRs, did not show significant enrichment among the differentially expressed genes (Fig. 7I). These results indicate that the ESRRB-YY1 complex may play a role in pluripotency gene regulation at a subset of MIRs.
ESRRB co-binds structural protein YY1 at MIR. A Percentage of YY1-mediated loops involving TE and non-TE based on YY1 ChIA-PET data. B YY1 binding enrichment on MIRs with enhancer marks in ESC. YY1 ChIA-PET and ChIP-seq data were obtained from a public dataset [60]. C YY1 binding signal on ESRRB-bound MIRs. D Native YY1-ESRRB and ESRRB-YY1 co-immunoprecipitation followed by Western blot. E Enriched gene ontology terms for genes regulated by ESRRB-YY1 co-bound MIRs. F, G Heatmap and metaplot showing ChIP-seq signals at YY1-bound MIRs and ESRRB-YY1 co-bound MIRs, respectively. H Correlation of ESRRB and YY1 binding at co-bound MIR sites. I Venn diagram of genes regulated by ESRRB-YY1 co-bound MIRs after Esrrb or Yy1 knockdown in ESCs. These MIR-regulated genes were defined by the presence of H3K27ac loops connecting a co-bound MIR and a gene region (TSS to gene end). Control genes were defined as genes located around 100–200 kb from the ESRRB-YY1 co-bound MIRs. J Graphical summary of the findings in this study. Our TE-centric 3D genome analysis uncovered the increasing complexity of SINE-TE, especially the SINE-MIR family, in ESC. Enriched Esrrb motif and binding on MIR suggest the co-option of MIR by ESRRB in forming enhancer loops, to a certain extent in partnership with YY1. Loss of Esrrb abrogates this loop formation and decreases YY1 occupancy on MIR
We further investigated whether the ESRRB-YY1 regulation through MIR could be repressive by examining genes upregulated upon Esrrb or Yy1 knockdown. A total of 90 and 92 ESRRB-YY1 MIR looping genes were upregulated upon Esrrb and Yy1 knockdown, respectively (Fig. S6H). When considering genes upregulated in both Esrrb and Yy1 knockdown conditions, the overlap with ESRRB-YY1 MIR looping genes was reduced to 21 (Fig. S6I). In contrast, the downregulated gene set showed 45 genes overlapping with ESRRB-YY1 MIR looping genes, with higher statistical significance (Fig. S6I). These results suggest that the action of ESRRB-YY1 on MIR is more likely activating than repressive.
To determine whether YY1 is the only architectural protein involved in ESRRB-mediated enhancer looping, we examined the enrichment of binding peaks of several structural proteins and transcription factors at ESRRB-mediated H3K27ac loop anchors (Fig. S6J). The structural proteins CTCF and RAD21 were weakly enriched, suggesting that they are unlikely to play a role in ESRRB-mediated enhancer looping. In contrast, components of the Mediator complex of RNA Pol II transcription (MED1, MED12) were enriched. The presence of these key components of enhancer-promoter (E-P) interactions [70, 71] further supports the idea that E-P interactions occur at ESRRB-mediated H3K27ac loop anchors.
In summary, our study highlighted the increased interaction between SINE-TEs in the ESC state. One example is SINE-MIR, which displayed enhanced accessibility and enhancer signal. This results in a more readily available Esrrb motif, which upon binding establishes long-range interactions, with or without YY1, and activates naïve pluripotency genes. Indeed, the knockdown of Esrrb resulted in loss of these interactions. This regulatory action by Esrrb is more pronounced in ESC than in EPSC (Fig. 7J).
Discussion
How genome conformation is reprogrammed to give rise to distinct potency states remains understudied. Our comparative analysis of the 3D genome structure of mouse ESC in naïve and expanded states revealed that chromatin rearrangement at the loop, TAD, and compartment levels was state-specific. Another study comparing genome-wide interactions in 2-cell-like cells (2CLC) and ESC reported differential loop interactions and TAD insulation [20]. A promoter-centric study also reported reorganization of promoter interactions during ESC conversion from serum to 2i media [18]. Our study supports the notion that genome conformation rearrangement is crucial in the transition between different potency states. In recent years, considerable effort has been devoted to capturing mouse ESCs at defined potency states in vitro, understanding their differentiation propensity, and identifying their in vivo counterparts [72, 73, 74, 75, 76]. Based on our work, we see the relevance of characterizing the 3D genome topology of these cell states. Leveraging the rich presence of TF binding motifs on TEs, one can first computationally explore the significance of these motifs in topological function and cell fate determination [77].
Our TE-centric analysis suggested that TE-enhancers displayed state specificity. Compared to the expanded state, TE-enhancer involvement was more prominent in the naïve state. Specifically, we uncovered SINE MIRs as functional ESC enhancers under direct control of ESRRB. Esrrb regulation of enhancers and SEs was reported to be substantial in naïve pluripotency [58, 78], but its significance for TE-enhancer regulation remained unknown. Previous analysis of Esrrb motif and binding showed a higher enrichment on SINEs, as compared to the other core pluripotency factors [30]. We propose that ESRRB is a TF with exceptional TE co-option capacity. From our study, ESRRB utilization of MIR gain-of-enhancer activity could be the mechanism that underpins the stronger Esrrb regulatory function in the naïve state relative to the expanded potential state.
The stage-specific enhancer control of Esrrb was also demonstrated in the 2C embryo versus the 2iESC stage [79]. Our in vitro data suggest a diminished significance of Esrrb regulation in the EPSC state, which we consider to be an 8-cell-like state. A recent in vivo study found that Esrrb knockout embryos did not display a detrimental effect at the 8-cell stage and were able to progress beyond it [80]. On the other hand, Esrrb is crucial for ESC self-renewal and pluripotency [50]. Taken together, these findings indicate that Esrrb initially plays a less essential or redundant role, but gradually gains more control across the higher to lower pluripotency spectrum.
In this study, we demonstrated that perturbation of a single MIR (Klf5-MIR) under ESRRB control is sufficient to affect the self-renewal property of ESC, but not EPSC. Klf5-MIR inhibition or KO effectively results in a drastic reduction of Klf5 gene expression. This showed the importance of a single MIR enhancer in determining gene expression. Klf5 is a known pro-proliferative factor, directly inducing the expression of cyclin genes [81]. The diminished proliferation capacity of Klf5-MIR KO clones can be explained by this.
Here, we also demonstrated that ESRRB is a crucial loop-mediating protein, loss of which impairs enhancer-associated loops and ESC-specific gene expression. ESRRB association with RNAPII and Mediator [82] complex primes it to act as a bridge for E-P interaction and its depletion reduces E-P looping [83]. Both RNAPII and Mediator have been reported to be crucial for E-P interaction [70, 71]. Moreover, ChIP-MS of E-P histone marks identified ESRRB as one of the candidate structuring factors [60]. All these suggest that ESRRB holds a pivotal role in establishing the 3D genomic landscape in mouse ESC, potentially in conjunction with TEs. We propose that ESRRB may act as a universal adaptor enabling other proteins to exert structural function. As an adaptor protein that connects a large network of TE-associated enhancers and SEs, ESRRB is uniquely gifted to regulate genome through the transcriptional condensate mode [84, 85, 86]. Such a possibility will warrant intensive further study.
In our study, we established an ESRRB-YY1 partnership in regulating MIR in ESC. YY1 is a ubiquitously expressed protein and has been proposed to be a general E-P structuring protein, comparable to CTCF in the case of boundary formation [60]. Given that the YY1 motif is not enriched on MIR, it is possible that in the MIR context, ESRRB acts as an adaptor for YY1. This supports the view that a general structural protein such as YY1 can be utilized in a versatile manner through partnership with different proteins at different cell states, depending on which TF is expressed and which TE is accessible in a certain cell state. We speculate that in other potency states, other TFs act as adaptors for YY1, and that this mix-and-match mechanism confers cell state specificity. Akin to SETDB1 [87], we propose YY1 as a topological accessory protein for ESRRB that can co-regulate ESC maintenance genes.
While this study provides insights into MIR regulation by ESRRB in ESCs, there are several limitations that must be considered. First, due to the extensive sequence divergence among MIR elements, our broad-targeting MIR sgRNAs designed based on MIR consensus sequence were only capable of targeting a minimal fraction (4.75%) of all MIR sequences. As a result, the effect of MIR inhibition may be obscured if the sgRNAs do not target the key MIRs with regulatory role in ESC. Additionally, some MIRs may perform other regulatory function, such as an insulator [44]. Targeting the non-enhancer MIRs with CRISPRi machinery may not yield any result. Therefore, a more targeted sgRNA design approach, exemplified by Klf5-MIR sgRNAs, offers a better approach in unraveling the function of individual MIRs. Secondly, the knockdown of a transcription factor will lead to changes in gene expression, which may or may not be caused by looping function. Moreover, Esrrb is one of the important pluripotency genes, the loss of which will cause change of cell state. To disentangle whether ESRRB-mediated looping is a direct contributor to changes in gene expression, the use of an acute degradation system is more beneficial.
Conclusions
Taken together, our study highlights the role of ESRRB-regulated MIR in the naïve potency state. Esrrb is a known reprogramming factor [88], but the contribution of MIR to the reprogramming process is still unknown. How multilayered epigenetic regulation of MIR may affect reprogramming and lead to cell fate transition [89, 90, 91, 92, 93] warrants further investigation. Furthermore, it would be interesting to study the role of the ESRRB-MIR interaction beyond the pluripotency spectrum. Esrrb knockout mice are known to display lethality due to failure of placental development [94]. Given that MIR is a TE shared by all mammalian species, it is natural to explore whether the co-option of MIR by ESRRB contributed to the evolution of placental mammals. Cross-species comparative analysis of TF regulons [95, 96, 97] could help elucidate this evolutionary relationship.
Methods
Cell culture
E14 mouse embryonic stem cells (ESCs) were cultured in LIF-serum medium comprising DMEM high glucose (Hyclone) supplemented with 15% ES cell FBS (Gibco), 2 mM L-glutamine (Gibco), 1X Pen-Strep (Gibco), 100 μM MEM non-essential amino acids (Gibco), 100 μM β-mercaptoethanol (Gibco), and 1000 U/mL leukemia inhibitory factor (LIF; ESGRO, Millipore). EPSCs were derived from ESCs by culturing in the EPSC medium developed by [14], comprising DMEM/F12 (Invitrogen), 20% KnockOut Serum Replacement (Invitrogen), 1 × GPS, 1 × non-essential amino acids, 0.1 mM β-ME, 10³ U ml⁻¹ hLIF, 3 μM CHIR99021 (Tocris), 1 μM PD0325901 (Tocris), 4 μM JNK Inhibitor VIII (Tocris), 10 μM SB203580 (Tocris), 0.3 μM A-419259 (Santa Cruz), and 5 μM XAV939 (Sigma). For the RA-differentiation experiment, 1 μM retinoic acid was added to the culture media for 48 h. All cultures were maintained at 37 °C with 5% CO2. All cells were maintained on plates pre-coated with 0.1% gelatine (porcine). Mycoplasma testing was routinely performed using a PCR detection kit (GeneCopoeia).
Hi-C
Hi-C was performed according to the Arima HiC+ kit protocol (document part number: A160430 v00) with some modifications. Briefly, cells were crosslinked with 2% formaldehyde in PBS at room temperature for 10 min, quenched for 5 min and washed once with ice-cold PBS. Next, Hi-C reactions were performed using 15 µg of DNA from crosslinked cells according to the Arima HiC+ kit protocol. After the Hi-C reactions, the chromatin was transferred into a Covaris microTUBE AFA Fiber Pre-Slit Snap-Cap 6 × 16 mm tube and sheared using a Covaris S2 instrument with intensity setting 3 (equivalent to 105 W), 5% duty cycle, and 200 cycles per burst for 7 min. Then, 10% of the sheared chromatin was aliquoted and decrosslinked for library preparation using the Accel-NGS 2S Plus DNA Library Kit (Swift Biosciences, 21024 and 26148) according to the Arima HiC+ kit protocol (document part number: A160169). The resulting library was size selected using SPRIselect beads (Beckman Coulter, B23318) to remove fragments larger than 1000 bp. The final library size was then profiled using the Bioanalyzer High Sensitivity DNA Assay (Agilent Technologies, 5067–4646). The final libraries were sequenced in paired-end mode with 150 bp read length on the Illumina HiSeq4000 platform to a sequencing depth of 900 million reads.
Hi-C data processing
The Hi-C reads were aligned to the mm9 genome using BWA within the runHiC pipeline (https://pypi.org/project/runHiC/) with default settings and the restriction enzyme sequences provided by Arima Genomics. Valid pairs were retained and the output .cool file was used for downstream analysis. The .cool file was then used as input to HiCPeaks (https://github.com/XiaoTaoWang/HiCPeaks), which implements the HiCCUPS algorithm, for loop detection at 5 kb resolution [98]. The Hi-C matrices were generated at 10 kb resolution with an axis range of ± 100 kb. Differential compartments were called by dcHiC at 100 kb resolution (https://github.com/ay-lab/dcHiC). The genes enriched in the differential compartments were defined by the overlap between the gene regions and the differential compartment regions using bedtools. The Juicer "Arrowhead" function was used to identify TADs at a resolution of 10 kb. Chromatin loops were visualized in the WashU genome browser. Pile-up analysis of the loops was performed by coolpup.py using the .cool file (https://github.com/open2c/coolpuppy) [99].
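To make the pile-up idea concrete, the following is a minimal sketch, not the coolpuppy implementation: it averages square windows of a contact matrix centred on loop pixels. The function name `pileup`, the `flank` parameter, and the toy matrix are hypothetical and chosen only for illustration; real analyses also normalize against expected contact frequencies, which is omitted here.

```python
def pileup(matrix, loops, flank=1):
    """Average (2*flank+1) x (2*flank+1) windows centred on each loop pixel.

    matrix: square list-of-lists contact matrix (bins x bins)
    loops:  list of (row_bin, col_bin) indices of loop pixels
    Windows that would run off the matrix edge are skipped.
    """
    n = len(matrix)
    size = 2 * flank + 1
    total = [[0.0] * size for _ in range(size)]
    used = 0
    for i, j in loops:
        if i - flank < 0 or j - flank < 0 or i + flank >= n or j + flank >= n:
            continue  # window does not fit inside the matrix
        for di in range(size):
            for dj in range(size):
                total[di][dj] += matrix[i - flank + di][j - flank + dj]
        used += 1
    if used == 0:
        raise ValueError("no loop window fits inside the matrix")
    return [[value / used for value in row] for row in total]


# Toy 6 x 6 matrix: background of 1 with two stronger "loop" pixels (hypothetical data)
toy = [[1.0] * 6 for _ in range(6)]
toy[1][4] = 5.0
toy[2][3] = 5.0
avg = pileup(toy, [(1, 4), (2, 3)], flank=1)
```

In this toy case the centre of the averaged window retains the loop signal, while neighbouring pixels are diluted towards the background.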
ATAC-seq
ATAC-seq was performed following the protocol described in [100] using the Illumina Tagment DNA TDE1 Enzyme and Buffer kit (Cat No.: 20034197) and the Nextera DNA library preparation kit (Cat No.: FC-121-1030). In summary, 50,000 cells were lysed with 50 µl lysis buffer (10 mM Tris-Cl (pH 7.4), 10 mM NaCl, 3 mM MgCl2, 0.1% (v/v) IGEPAL CA-630) and spun down immediately at 500 g at 4 °C for 10 min. The supernatant was removed and the cell pellet was resuspended in 50 µl transposition reaction mix (25 µl TD, 2.5 µl TDE1, 22.5 µl nuclease-free water) for 30 min at 37 °C. Qiagen MinElute PCR purification kits were used to purify the 50 µl of transposed DNA. Following the Nextera DNA library preparation kit protocol, purified DNA was prepared into sequencing libraries.
ATAC-seq data processing
Adapters were trimmed using Trim Galore. The reads were then mapped to the mouse mm9 genome using STAR with the following parameters: --alignIntronMax 1 --alignEndsType EndToEnd. Peak calling was performed by MACS2 with the following parameters: --nolambda --nomodel -q 0.01 --keep-dup 1. Deeptools (v3.3.0) was used to generate the bigwig files. Metaplots and heatmaps were generated using computeMatrix followed by plotProfile.
Motifs analysis
Motif enrichment analysis was performed using Homer [101] with the function findMotifsGenome.pl. The p-value of known motifs was used for the visualization. To check whether the MIR consensus sequences contain the ESRRB motif, FIMO (Find Individual Motif Occurrences) from the MEME Suite was run on the MIR consensus sequences downloaded from the Dfam database to scan for the ESRRB motif and calculate the p-value.
HiChIP
HiChIP was performed according to the Arima HiC+ kit protocol (document part number: A160430 v00) with some modifications. Briefly, cells were crosslinked with 2% formaldehyde in PBS at room temperature for 10 min, quenched for 5 min and washed once with ice-cold PBS. Next, Hi-C reactions were performed as described in the Hi-C section above. Following shearing, the chromatin was precleared using Dynabeads™ Protein A (Thermo Fisher Scientific, 10001D) for 1 h at 4 °C. Subsequently, 3 µg of H3K27ac antibody (ab4729, Abcam) was added to the precleared chromatin and incubated overnight at 4 °C. Following the overnight incubation, BSA-blocked Dynabeads™ Protein A were added to the antibody-bound chromatin and incubated for 4 h at 4 °C, washed, eluted, and decrosslinked according to the Arima HiC+ kit protocol. Eluted DNA concentration was measured using the Qubit dsDNA HS (High Sensitivity) Assay Kit (Invitrogen). The HiChIP library was prepared in the same way as the Hi-C library, described above. The final libraries were sequenced in paired-end mode with 150 bp read length on the Illumina HiSeq4000 platform to a sequencing depth of 300 million reads.
HiChIP data processing
HiChIP data were processed using the Arima HiChIP pipeline, which uses BWA for alignment. FitHiChIP was used for loop calling. OCT4 and cohesin HiChIP data [102] were aligned to the mm9 genome using HiC-Pro [103]. Duplicate reads were discarded, and the remaining reads were assigned to MboI restriction fragments. Valid interactions were obtained and used for subsequent analysis. Juicer tools [104] were used to convert valid interaction pairs into a .hic file. Significant loops were identified using HiCCUPS [98] from the Juicer package, and the contact matrix was corrected using the Knight-Ruiz (KR) normalization method [105]. The HiChIP matrices were generated at 10 kb resolution, while HiChIP loop calling used 5 kb resolution.
Chromatin immunoprecipitation (ChIP)
ChIP was performed according to the protocol previously described [15]. Cells were trypsinized, harvested, and the cell number was estimated. Crosslinking was performed using 1% formaldehyde for 10 min at room temperature followed by quenching with 0.25 M glycine. Cross-linked pellets were washed twice with ice-cold 1X PBS (with 0.1% Triton X-100) and then subjected to lysis with a lysis buffer (10 mM Tris–Cl (pH 8), 100 mM NaCl, 10 mM EDTA, 0.25% Triton X-100 and protease inhibitor cocktail (Roche)). After centrifugation, pellets were resuspended in 1% SDS lysis buffer (50 mM HEPES–KOH (pH 7.5), 150 mM NaCl, 1% SDS, 2 mM EDTA, 1% Triton X-100, 0.1% NaDOC, and protease inhibitor cocktail). Following complete resuspension, the samples were nutated at 4 °C for 15 min and then pelleted by high-speed centrifugation. This was followed by two washes of all the lysed samples with 0.1% SDS lysis buffer (50 mM HEPES–KOH (pH 7.5), 150 mM NaCl, 0.1% SDS, 2 mM EDTA, 1% Triton X-100, 0.1% NaDOC and protease inhibitor cocktail). For every 10 million cells, 14 cycles of sonication were carried out using the Bioruptor (Diagenode) with 30-s pulses and 60-s halts in each cycle. Cell debris was separated from the sheared chromatin by centrifugation at 15,000 rpm at 4 °C for 30 min. Pre-clearing of the chromatin was carried out with 100 µL of Protein G Dynabeads (Life Technologies) for 2 h at 4 °C. Simultaneously, 100 µL of Protein G Dynabeads were also bound to 5 µg (per 10 million cells) of the antibodies. Antibodies used were ESRRB antibody (PPH6705, Perseus Proteomics), H3K27ac antibody (ab4729, Abcam), H3K4me3 antibody (ab8580, Abcam), Cas9 antibody (C15310258, Diagenode), and YY1 antibody (PA5-29171, ThermoFisher). After a small amount was set aside as input, the remaining pre-cleared chromatin was incubated with the antibody-bound beads at 4 °C overnight.
The beads were then washed three times with 0.1% SDS lysis buffer, once with 0.1% SDS lysis buffer/0.35 M NaCl, once with 10 mM Tris–Cl (pH 8.0), 1 mM EDTA, 0.5% NP40, 0.25 LiCl, 0.5% NaDOC, and once with TE buffer (pH 8.0). The immunoprecipitated chromatin was eluted from the beads by heating the beads resuspended in 50 mM Tris–HCl (pH 7.5), 10 mM EDTA, 1% SDS, for 1 h at 68 °C while shaking at 1400 rpm. Crosslinks were reversed by incubating the eluted samples and inputs at 42 °C for 2 h and 67 °C for 6 h in the presence of Pronase (Sigma) and TE buffer, after which the DNA was purified using the QIAGEN PCR Purification kit (for inputs) and the QIAGEN MinElute PCR Purification kit (for samples) as per the manufacturer’s instructions. Quantitative PCR was performed on the purified ChIP-DNA samples using the QuantStudio 5 Real-Time PCR System (Thermo Fisher) with KAPA SYBR Green Master Mix (Roche). The data presented were normalized to the inputs as well as to the negative control primers, thus representing an overall fold enrichment. The negative control primers were designed in regions lacking factor binding (gene desert regions). The primer sequences are provided in Table S6.
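The normalization described above can be illustrated with a short sketch. It assumes the common Ct-based calculation in which each region's IP signal is first expressed relative to its own input and the target value is then divided by the negative-control value; the exact calculation used in this study is not spelled out in the text, so the function name `fold_enrichment`, the assumption of roughly 100% primer efficiency, and the Ct values below are all hypothetical.

```python
def fold_enrichment(ct_ip_target, ct_input_target, ct_ip_neg, ct_input_neg):
    """Fold enrichment of a target region over a negative-control region.

    Each region's IP signal is expressed relative to its own input as
    2 ** (Ct_input - Ct_IP); the target value is then divided by the
    negative-control value. Assumes ~100% primer efficiency. Any constant
    input dilution factor cancels out in the ratio.
    """
    target_over_input = 2.0 ** (ct_input_target - ct_ip_target)
    neg_over_input = 2.0 ** (ct_input_neg - ct_ip_neg)
    return target_over_input / neg_over_input


# Hypothetical Ct values for illustration only
example = fold_enrichment(
    ct_ip_target=25.0, ct_input_target=24.0,
    ct_ip_neg=28.0, ct_input_neg=24.0,
)
```

With these made-up values the target region is recovered 8-fold more efficiently than the negative-control region.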
Following qPCR validation of the samples, libraries were prepared using the Accel-NGS 2S Plus DNA Library Kit (Swift Biosciences, 21024 and 26148) as per the manufacturer’s instructions. The quality and concentration of each sample were assessed using Agilent High Sensitivity chips on the Agilent 2100 Bioanalyzer. High-throughput sequencing of the samples was performed on an Illumina platform.
ChIP-seq data processing
FastQC (v0.11.4) was used to check the sequence quality. Reads were mapped to the mm9 genome using STAR (v2.7.1) [106] with the parameters "--alignIntronMax 1 --alignEndsType EndToEnd". Peaks were called using MACS2 [107] with the parameters "--nomodel -q 0.05 --keep-dup 1". Bigwig files and average profiles were produced by Deeptools [108]. ROSE (https://bitbucket.org/young_computation/rose) was used to call super-enhancers based on H3K27ac ChIP-seq data as described in [33, 109]. Briefly, we ran ROSE with a stitching distance of 12,500 bp and a TSS exclusion zone size of 2500 bp. H3K27ac enhancer peaks within 12.5 kb of each other were stitched together and ranked according to H3K27ac enrichment. To distinguish super-enhancers from typical enhancers, we used the point at which a line with slope 1 is tangent to the ranked H3K27ac enrichment curve.
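The slope-1 tangent criterion can be sketched as follows. This is a simplified illustration rather than ROSE's own code: it assumes that ranks and signals are both rescaled to the range 0 to 1 so that a slope of 1 is meaningful, and it takes the tangent point to be where the scaled signal minus the scaled rank is smallest. The function name `split_super_enhancers` and the toy signal values are hypothetical.

```python
def split_super_enhancers(signals):
    """Return (cutoff_signal, super_enhancer_signals).

    Stitched enhancers are ranked by signal (ascending). Rank and signal are
    rescaled to [0, 1]; a line y = x + b with slope 1 touches the ranked curve
    from below where y - x is minimal. Enhancers with signal above that
    tangent point are labelled super-enhancers in this sketch.
    """
    ranked = sorted(signals)
    n = len(ranked)
    if n < 2 or ranked[-1] == ranked[0]:
        raise ValueError("need at least two enhancers with differing signal")
    lowest, highest = ranked[0], ranked[-1]
    xs = [i / (n - 1) for i in range(n)]
    ys = [(s - lowest) / (highest - lowest) for s in ranked]
    tangent_idx = min(range(n), key=lambda i: ys[i] - xs[i])
    cutoff = ranked[tangent_idx]
    supers = [s for s in ranked if s > cutoff]
    return cutoff, supers


# Hypothetical stitched-enhancer signals with a sharp upturn at the top ranks
cutoff, supers = split_super_enhancers([1, 1, 2, 2, 3, 3, 4, 5, 40, 100])
```

In this toy ranking the curve bends sharply after the eighth enhancer, so only the two highest-signal regions fall above the tangent point.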
Enrichment of TF ChIP-seq binding peaks in TEs
The extent to which ESRRB binding peaks are derived from TEs was evaluated using enrichment calculations [28]. To identify TE-derived TF binding peaks, we required that a TF binding peak overlap a TE by at least one base pair. The mouse (mm9) TE annotation was downloaded from the UCSC Genome Browser. The intersection was calculated using the intersectBed tool (with default parameters) from the BEDTools package [110]. The formula from [28] was used to define the enrichment of TF binding peaks in TEs. A threshold of 2 was used to identify TE families enriched for TF binding peaks.
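The one-base-pair overlap criterion can be illustrated with a small pure-Python sketch. It is not a substitute for intersectBed and does not reproduce the enrichment formula of [28]; it assumes BED-style 0-based, half-open intervals, and the function name `peaks_overlapping_te` and the toy coordinates are hypothetical.

```python
def peaks_overlapping_te(peaks, tes, min_overlap=1):
    """Return the peaks that overlap any TE by at least min_overlap bp.

    peaks, tes: lists of (chrom, start, end) in 0-based, half-open
    coordinates, as in BED files. Each peak is reported at most once.
    """
    te_by_chrom = {}
    for chrom, start, end in tes:
        te_by_chrom.setdefault(chrom, []).append((start, end))

    hits = []
    for chrom, p_start, p_end in peaks:
        for t_start, t_end in te_by_chrom.get(chrom, []):
            overlap = min(p_end, t_end) - max(p_start, t_start)
            if overlap >= min_overlap:
                hits.append((chrom, p_start, p_end))
                break  # one overlapping TE is enough
    return hits


# Hypothetical coordinates for illustration
toy_peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
toy_tes = [("chr1", 199, 250), ("chr1", 400, 500), ("chr3", 100, 200)]
te_derived = peaks_overlapping_te(toy_peaks, toy_tes)
```

The first peak shares exactly one base with a TE and is kept, whereas the second peak merely abuts a TE (zero shared bases) and is not counted. The nested loop is quadratic and is meant for clarity, not genome-scale use.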
TE family enrichment analysis
TEs annotated by RepeatMasker from the classes SINE, LINE, LTR, and DNA were used for enrichment analysis. TE families with fewer than 1000 elements in the genome, satellites, and simple repeats were excluded from the analysis. Only the major TE families (Alu, MIR, B2, B4, L1, L2, ERV1, ERVK, ERVL, MaLR, MER1_type, MER2_type, and ID) were used for the final result visualization. We compared the accessible regions, enhancers, compartments, TADs, and chromatin loop anchors with different TE families to investigate whether these genomic features were enriched for or depleted of any specific types of transposable elements. We analyzed the TE enrichment in open regions or enhancers using the following equation:
To examine the TE enrichment in Hi-C loop anchors, the following equation was used. TE enrichment fold change in TADs and compartments was calculated using formulas similar to that for loop anchors.
Most enhancers were located within TADs and Hi-C chromatin loops. To identify TEs associated with genome architecture-related enhancers, we defined loop domains as loop-spanning regions in which loop anchors intersected TAD boundaries or loops were located within TADs. The TE enrichment in state-specific Hi-C loop domains was analyzed by first identifying the differential loops between a pair of cell states, then identifying the enhancers within those loop domains. This formula was used to calculate the fold change in TE enrichment:
Significance was assessed using the chi-squared test, and the resulting p-value was used for the visualization. Only significantly enriched TE families with p-value < 0.01, fold change (FC) of enrichment > 1, and number of involved specific loops > 100 are shown.
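As a worked illustration of a chi-squared test in this setting, the sketch below applies the Pearson chi-squared statistic (without continuity correction) to a 2 x 2 table. The table layout is an assumption made for illustration: rows are state-specific versus other loop anchors, and columns are anchors overlapping versus not overlapping a given TE family. The function name `chi2_2x2` and the counts are hypothetical; the exact contingency table used in this study is not described in the text.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 of 100 state-specific anchors overlap the TE family,
# versus 10 of 100 other anchors
chi2, p_value = chi2_2x2(30, 70, 10, 90)
fold_change = (30 / 100) / (10 / 100)
```

Under these made-up counts the family would pass both the p-value < 0.01 and FC > 1 filters; the additional requirement on the number of involved loops is not modelled here.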
RNA-seq
Total RNA was extracted for each of the transfected cells using Trizol reagent (Ambion). DNA contamination for the samples was minimized by using the QIAGEN RNeasy Kit. The RNA samples were processed using a TruSeq Stranded mRNA Library Prep Kit (RS- 122–2101, Illumina). This kit was used for mRNA selection, fragmentation, cDNA synthesis, and library preparation. The library quality was analyzed on an Agilent Bioanalyzer using the Agilent DNA 1000 kit. High-throughput sequencing was then performed on a HiSeq4000 instrument.
RNA-seq data processing
For RNA-seq analysis, reads were mapped to the mm9 genome using STAR [106] with the default parameters. The GTF annotation file of genes (mm9) was downloaded from the UCSC genome browser (https://genome.ucsc.edu/cgi-bin/hgTables). Cufflinks (v2.2.1) [111] was used for gene assembly and gene quantification to generate the fragments per kilobase per million mapped reads (FPKM) table. Transcript counts were computed by HTSeq (v0.12.4). DESeq2 [112] or edgeR [113] was used to perform differential gene analysis. The p-value cutoff was set at 0.05, and the fold change cutoff was set at 1.5. The Python package gseapy [114] was used for gene set enrichment analysis (GSEA) [115] with parameters: "gseapy prerank -f pdf --max-size 100000 -r a.rnk -g b.gmt -o result". Metascape [116] (http://metascape.org) was used for Gene Ontology analysis.
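The cutoffs above can be applied as in the following sketch. It assumes that the fold change threshold is applied symmetrically (at least 1.5-fold up or down) to log2 fold changes as reported by tools such as DESeq2 or edgeR; whether raw or adjusted p-values were used is not stated in the text, so the sketch simply takes whichever value is supplied. The function name `classify_genes` and the gene labels are hypothetical.

```python
import math


def classify_genes(results, p_cutoff=0.05, fc_cutoff=1.5):
    """Split genes into up- and downregulated sets.

    results: dict mapping gene name -> (log2_fold_change, p_value)
    A gene is called if p_value < p_cutoff and its fold change is at least
    fc_cutoff in either direction.
    """
    log2_cutoff = math.log2(fc_cutoff)
    up, down = set(), set()
    for gene, (log2_fc, p_value) in results.items():
        if p_value >= p_cutoff:
            continue
        if log2_fc >= log2_cutoff:
            up.add(gene)
        elif log2_fc <= -log2_cutoff:
            down.add(gene)
    return up, down


# Hypothetical results table for illustration
toy_results = {
    "geneA": (1.2, 0.001),   # up, significant
    "geneB": (-0.9, 0.01),   # down, significant
    "geneC": (0.3, 0.001),   # significant but below the fold change cutoff
    "geneD": (-2.0, 0.2),    # large change but not significant
}
up_genes, down_genes = classify_genes(toy_results)
```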
Interaction networks
Protein–protein interactions were derived from STRING database [117], and network was visualized using Cytoscape [118].
shRNA design
Dharmacon’s siDESIGN center (http://dharmacon.gelifesciences.com/design-center/) was used for the design of the shRNA sequences. shRNA sequences were ordered as DNA oligos from IDT and cloned into the pSUPER.puro plasmid.
Transfection of shRNA
Cells were cultured at 37 °C for 12–14 h prior to transfection, and then transfected with 2.5 μg of the plasmid constructs using 5 μL of Lipofectamine 3000 (Thermofisher), as per the manufacturer’s protocol. Puromycin was added to the cell media to a final concentration of 1 μg/mL on the next day. The cells were cultured with puromycin-containing medium for 72 h before harvest.
Quantitative PCR
For samples to be quantified by qPCR, RNA was converted to cDNA using the 5X iScript Reverse Transcriptase mix (BioRad). RT-qPCR was performed according to the KAPA SYBR Green Master Mix (Roche) user’s protocol and run on the QuantStudio 5 Real-Time PCR System (Thermo Fisher). Actin-beta primers were used as the control to normalize gene expression. The primer sequences are provided in Table S6.
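Normalization to a reference gene is commonly done with the comparative Ct (2^-ΔΔCt) calculation, sketched below. The text does not state which calculation was used, so this is an assumption for illustration; it also assumes roughly 100% amplification efficiency. The function name `relative_expression` and the Ct values are hypothetical.

```python
def relative_expression(ct_gene_sample, ct_ref_sample, ct_gene_control, ct_ref_control):
    """Relative expression of a gene in a sample versus a control condition.

    Each gene Ct is first normalized to the reference gene (delta Ct), and the
    sample is then compared with the control (delta-delta Ct).
    """
    delta_sample = ct_gene_sample - ct_ref_sample
    delta_control = ct_gene_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target gene amplifies 2 cycles later in the
# sample than in the control, with an unchanged reference gene
rel = relative_expression(
    ct_gene_sample=24.0, ct_ref_sample=18.0,
    ct_gene_control=22.0, ct_ref_control=18.0,
)
```

Here a shift of two cycles corresponds to expression at one quarter of the control level.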
sgRNA design
For sgRNAs targeting MIR in a genome-wide manner, consensus sequences of all MIR subfamilies (MIR, MIRb, MIRc, and MIR3) were obtained from Dfam.org and imported into the CRISPR sgRNA designer CRISPOR. We used Bowtie to identify the MIR positions targeted by each sgRNA with a maximum of 3 mismatches. Then, we selected the top sgRNAs that could target the most MIR sequences across all MIR subfamilies (sgRNA22 and sgRNA23). For each sgRNA, we only took into account the MIR sequences that ended with NGG. We used Bowtie to map the sgRNAs to the mouse genome (mm9) to identify the regions potentially targeted by each sgRNA. We checked for off-target sequences and ensured that the selected sgRNAs do not target evolutionarily close transposon classes using Cas-OFFinder (http://www.rgenome.net/cas-offinder/), with a threshold of 2 mismatches. After applying these selection criteria, the predicted numbers of MIRs that can be targeted by sgRNA22 and sgRNA23 are 1871 and 1051, respectively. The number of shared targets between the 2 sgRNAs is 694.
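The target-counting logic described above (a limited number of mismatches plus an NGG PAM) can be illustrated with a naive scan. This is a sketch only, not a replacement for Bowtie or Cas-OFFinder: it checks the forward strand only, assumes the PAM lies immediately 3' of the protospacer, and uses a hypothetical function name `find_targets` and made-up sequences.

```python
def find_targets(guide, sequence, max_mismatches=3):
    """Return (position, mismatches) for forward-strand sites where the guide
    matches with at most max_mismatches and is followed by an NGG PAM.
    """
    guide = guide.upper()
    sequence = sequence.upper()
    g = len(guide)
    hits = []
    # A site needs the protospacer (g bases) plus a 3-base PAM
    for pos in range(len(sequence) - g - 3 + 1):
        pam = sequence[pos + g:pos + g + 3]
        if pam[1:] != "GG":
            continue  # PAM must be NGG
        site = sequence[pos:pos + g]
        mismatches = sum(1 for a, b in zip(guide, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


# Hypothetical 20-nt guide and toy sequence: one site with a single mismatch
# and an NGG PAM, and one perfect match that lacks a PAM
toy_guide = "ACGTACGTACGTACGTACGT"
toy_sequence = "TT" + "ACGTACGTACGTACGTACGA" + "TGG" + "AAAA" + toy_guide + "TAA"
sites = find_targets(toy_guide, toy_sequence)
```

Only the PAM-adjacent site is reported; the perfect match without NGG is ignored, mirroring the requirement that counted MIR sequences end with NGG.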
For sgRNAs targeting the Klf5-MIR site, the Klf5-MIR sequence was obtained from the UCSC Genome Browser and imported into CRISPOR. The outputs were checked for their off-target scores. Overhang sequences were added to the sgRNA sequences, which were ordered as DNA oligos from IDT and cloned into pCas9-Guide-Puro-CRISPRi-Vector for CRISPRi or into pSLQ1651-sgRNA(F + E)-sgGal4 for CAPTURE2.0-MS. The sgRNA sequences are provided in Table S6.
CAPTURE2.0 cell line generation
To enable locus-specific protein capture, we utilized a previously described CRISPR-Cas9-based system (CAPTURE2.0) [67]. Briefly, CAPTURE2_pLVX-EF1a-dCas9-CBio-IRES-zsGreen1 was packaged into lentivirus. Mouse E14 ESCs were transduced with the virus and the top 2% of zsGreen1-expressing cells were sorted by FACS. The sorted cells were cultured as single colonies and expanded as clonal lines. Each line was screened for dCas9 transcript expression by RT-PCR and for dCas9 protein expression by immunofluorescence and Western blot. Custom sgRNAs were cloned into pSLQ1651-sgRNA(F + E)-sgGal4 as described and packaged into lentivirus, which was then transduced into a selected clonal line (clone 1B), and the cells were sorted for the top 2% of mCherry expression. The resulting culture was maintained and expanded.
CAPTURE2.0-MS
The procedure was adapted from a previous publication [119]. Briefly, cells were crosslinked with 1% formaldehyde for 10 min at room temperature, quenched with 250 mM glycine for 5 min, and washed twice with ice-cold PBS. As described previously, 50 million cells in a 2-ml tube were lysed with 2 ml cell lysis buffer (25 mM Tris–HCl, 85 mM KCl, 0.1% Triton X-100, pH 7.4, freshly added 1 mM DTT, 1 mM phenylmethylsulfonyl fluoride (PMSF), and 1:200 protease inhibitor cocktail), rotated at 4 °C for 30 min, followed by centrifugation at 2500 g at 4 °C for 5 min. The cell pellet was then resuspended in 500 µl cell lysis buffer and treated with 0.5 µg RNase A at 37 °C for 30 min with rotation, followed by centrifugation at 2500 g at 4 °C for 5 min. The nuclei were then lysed four times by resuspension in 400 µl nuclear lysis buffer (50 mM Tris–HCl, 10 mM EDTA, 4% SDS, pH 7.4, freshly added 1 mM DTT, 1 mM PMSF, and 1:200 protease inhibitor cocktail) for 10 min at room temperature, mixed with 1.2 ml of 8 M urea buffer, and centrifuged at 16,100 g for 25 min at room temperature. To wash out urea, the nuclei were then washed twice by resuspension in 1.6 ml of nuclear lysis buffer, followed by centrifugation at 16,100 g for 25 min at room temperature. To wash out SDS, the nuclei were then washed twice by resuspension in 1.6 ml of cell lysis buffer, followed by centrifugation at 16,100 g for 25 min at room temperature. The pellet was then resuspended in 300 µl of IP binding buffer without NaCl (20 mM Tris–HCl, 1 mM EDTA, 0.1% NP-40, 10% glycerol, pH 7.5) and sonicated to a size range of 300–1000 bp, with enrichment at ~500 bp, using a Bioruptor Plus (30 s on, 90 s off, 4 cycles, low power mode).
Sheared chromatin was then centrifuged at 16,100 g at 4 °C for 25 min. NaCl was added to the collected supernatant to a final concentration of 150 mM. Dynabeads™ MyOne™ Streptavidin C1 (Invitrogen, 65001; 100 µl per sample) were washed twice, 5 min each, in IP binding buffer (20 mM Tris–HCl, 1 mM EDTA, 150 mM NaCl, 0.1% NP-40, 10% glycerol, pH 7.4). Washed beads were resuspended in 50 µl IP binding buffer per sample and added to the soluble chromatin. The reaction volume was topped up to 800 µl and incubated overnight at 4 °C with rotation. After the incubation, the beads were magnetically separated from the supernatant and washed 5 times by resuspension in IP binding buffer, 5 min each at 4 °C with rotation. To wash out NP-40, the beads were washed 3 times by resuspension in IP binding buffer without NP-40 (20 mM Tris–HCl, 1 mM EDTA, 150 mM NaCl, 10% glycerol, pH 7.4), 3 min each at 4 °C with rotation. The samples were eluted from the magnetic beads by heating at 95 °C for 30 min each in a 1:1 mixture of 2 × decrosslinking buffer (0.1% Tween-20 in 20 mM citrate buffer, pH 6) and 2 × elution buffer (10% SDS in 100 mM triethylammonium bicarbonate buffer, pH 8.5).
Eluted proteins were then digested on S-Trap micro columns (ProtiFi), dried in a vacuum centrifuge, redissolved in 100 mM TEAB, and labelled with TMTsixplex reagents (Thermo Fisher Scientific) according to the manufacturers’ instructions. Labelled samples were pooled and analyzed by LC–MS/MS as previously described [120] with minor changes. Briefly, peptides were separated and analyzed using an ACQUITY UPLC Peptide BEH C18 column (SKU: 186007483) in a nanoACQUITY UPLC system (Waters) coupled to an Orbitrap Eclipse Tribrid MS (Thermo Fisher Scientific). Mass spectra were subsequently analyzed with the Sequest HT search engine against the UniProt Mus musculus reference proteome (UP000000589) using Proteome Discoverer v 2.4 (Thermo Fisher Scientific) with static modifications of TMT6plex (K and peptide N-term) and carbamidomethylation (C), dynamic modification of oxidation (M), and no normalization applied to labelled peptide abundances.
Immunofluorescence staining
Samples were fixed with 4% PFA in PBS for 20 min at room temperature, washed, and permeabilized with 0.2% Triton X-100 in PBS for 15 min. Samples were then blocked with blocking buffer (PBS containing 3% BSA and 0.1% Tween 20) at room temperature for 2 h or overnight at 4 °C. Primary antibodies diluted in blocking buffer were added to the samples and incubated overnight at 4 °C. Samples were washed three times with PBS containing 0.1% Tween 20, followed by incubation with fluorescence-conjugated secondary antibodies diluted in blocking buffer for 2 h at room temperature. Samples were washed three times with PBS containing 0.1% Tween 20. Nuclei were counterstained with DAPI at 1 µg/mL or Hoechst 33342 (Thermo Fisher). The primary antibodies and dilutions used were as follows: Mouse anti-CDX2 (1:100; BioGenex, MU392A-5UC), Goat anti-GATA6 (1:200; R&D Systems, AF1700), Rabbit anti-OCT4 (1:100; Abcam, ab19857), Rabbit anti-Cas9 (5 µl, Diagenode, C15310258). The secondary antibodies and dilutions used were as follows: Alexa Fluor 488 Donkey anti-Mouse IgG (H + L) (1:1000; Invitrogen, A32766), Alexa Fluor 647 Donkey anti-Rabbit IgG (H + L) (1:1000; Invitrogen, A32795), Alexa Fluor 555 Donkey anti-Rabbit IgG (H + L) (1:1000; Invitrogen, A31572), Alexa Fluor 488 Donkey anti-Goat IgG (H + L) (1:1000; Invitrogen, A11055). A confocal microscope was used to capture the images.
Western blot
Cells were lysed using lysis buffer (containing protease inhibitor cocktail and PMSF) and the lysate was cleared by centrifugation at 12,000 g for 20 min. Where required, the concentrations of the protein samples were estimated by Bradford assay. The proteins in the lysate were denatured by boiling for 10 min in 2X Laemmli buffer. The protein samples were separated on an SDS–PAGE gel and transferred onto a nitrocellulose membrane (BioRad). The membrane was blocked with 5% milk at room temperature for 1 h, followed by incubation with primary antibodies overnight at 4 °C. The secondary horseradish peroxidase (HRP)-conjugated anti-mouse IgG, HRP-conjugated anti-rabbit IgG or HRP-conjugated anti-goat IgG (1:10,000) antibodies were then added to the membrane at room temperature for 1 h. For signal detection, we used the SuperSignal West Dura Extended Duration Substrate (Thermo Scientific) and imaged the blot using an iBright (Invitrogen) or ChemiDoc (BioRad) imager.
siRNA transfection
Cells were seeded 1 day prior to transfection. siGENOME Mouse SMARTpool siRNA (Dharmacon) was transfected into the cells at 500 nM final concentration using DharmaFECT1 according to the reagent’s protocol.
Blastoid formation assay
Blastoid formation was based on an established protocol [121] and was performed on 6-well AggreWell 400 plates (STEMCELL Technologies, 34425). Following the manufacturer’s procedures, AggreWells were prepared using Anti-Adherence Rinsing Solution (STEMCELL Technologies, 07010). EPSC colonies were dissociated into single cells by incubation with Accutase. Approximately 35,000 cells (5 cells per microwell for 7000 microwells) were seeded. Blastoid seeding medium was used to culture the cells for the first 24 h (day 0 to day 1). Blastoid seeding medium is composed of 25% TSC basal medium, 25% N2B27 basal medium (Takara Bio, Y40002), 50% KSOM (Millipore, MR-020P-5D), 12.5 ng/mL rhFGF4 (R&D Systems, 235-F4-025), 0.5 µg/mL heparin, 3 μM CHIR99021 and 2 μM Y-27632 (Reagents Direct, 53-B80-50). The cells were then incubated at 37 °C with 5% CO2. Blastoid seeding medium was replaced with complete blastoid medium 24 h after cell seeding. Complete blastoid medium is composed of 25% TSC basal medium, 25% N2B27 basal medium, 50% KSOM, 12.5 ng/mL rhFGF4, 0.5 µg/mL heparin, 3 μM CHIR99021, 5 ng/mL BMP4 (Proteintech, HZ-1040) and 0.5 μM A83-01 (Axon Medchem, 1421). On day 5, blastoids were manually picked with a Cook Flexipet pipette under a stereomicroscope for analysis.
Sequential chromatin immunoprecipitation (Sequential ChIP)
Protein A and Protein G Dynabeads (Invitrogen) were combined in an equimolar ratio at 50 μl per sample. The beads were washed three times with PBS containing 0.1% Triton X-100 and then resuspended in 600 μl of Pre-Adsorption buffer (equal amounts of ChIP lysis buffer and ChIP dilution buffer (1% Triton X-100, 2 mM EDTA, 20 mM Tris–HCl, 150 mM NaCl, 1X Protease inhibitor cocktail)). BSA was added to the resuspended beads at a final concentration of 200 μg/ml to block non-specific binding, followed by overnight incubation at 4 °C. The beads were subsequently washed with ChIP dilution buffer. The first antibody was added to 100 μl of the bead mixture, which was incubated at room temperature for 3 h. Sonicated E14 DNA (see Chromatin Immunoprecipitation method) was pre-cleared with 100 μl of the beads for 3 h at 4 °C. After pre-clearing, 50 μl of the chromatin was set aside as input while the rest was added to the antibody-bound beads for overnight incubation at 4 °C. For the first elution, the chromatin-antibody-bound beads were washed three times with 0.1% SDS lysis buffer, once with 0.1% SDS lysis buffer/0.35 M NaCl, once with 10 mM Tris–Cl (pH 8), 1 mM EDTA, 0.5% NP40, 0.25 M LiCl, 0.5% NaDOC, and once with TE buffer (pH 8.0). The beads were collected by centrifugation at 800 g for 1 min and were resuspended in 75 μl of elution buffer (50 mM Tris–HCl (pH 7.5), 10 mM EDTA, 1% SDS) at 37 °C for 30 min. The eluted DNA was separated from the beads by brief centrifugation at 1000 g for 2 min. Fifteen microliters of the eluted DNA was used for validation of the first ChIP, whereas the remaining 60 μl was diluted 20 times with ChIP dilution buffer to 1200 μl total volume. The second antibody was incubated with 100 μl of freshly pre-adsorbed Protein A and G Dynabeads at room temperature for 3 h. To maintain a 1:19 Input: IP ratio, 63 μl of the first eluted sample was separated. The remaining 12-μl sample was incubated with the antibody-bound beads overnight at 4 °C.
DNA elution for the second ChIP follows the same procedure as the first ChIP, which was then incubated at 68 °C for 60 min while shaking at 1400 rpm. Subsequently, the samples were de-crosslinked, purified, and used for ChIP qPCR. The primer sequences are provided in Table S6.
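The 20-fold dilution of the first eluate matters because it lowers the SDS carried over from the elution buffer before the second immunoprecipitation. The following minimal sketch works through that arithmetic. It assumes the ChIP dilution buffer has the composition listed above (2 mM EDTA and no SDS), and the helper name `mix` is made up for the example.

```python
def mix(vol_a, conc_a, vol_b, conc_b):
    """Final concentration after combining two solutions (same units throughout)."""
    return (vol_a * conc_a + vol_b * conc_b) / (vol_a + vol_b)


eluate_ul = 60.0                      # first-ChIP eluate carried forward
total_ul = 1200.0                     # volume after the 20-fold dilution
diluent_ul = total_ul - eluate_ul     # ChIP dilution buffer added

# Elution buffer: 1% SDS, 10 mM EDTA; dilution buffer assumed: 0% SDS, 2 mM EDTA
sds_percent = mix(eluate_ul, 1.0, diluent_ul, 0.0)
edta_mM = mix(eluate_ul, 10.0, diluent_ul, 2.0)

print(round(sds_percent, 3), round(edta_mM, 2))
```

Under these assumptions the SDS concentration drops from 1% to 0.05%, while EDTA settles near the 2 mM of the dilution buffer.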
CRISPR inactivation (CRISPRi)
The MIR sequence was obtained from the UCSC Genome Browser and imported into CRISPOR. The outputs were checked for their off-target scores. Overhang sequences were added to the sgRNA sequences, which were ordered as DNA oligos from IDT and cloned into pCas9-Guide-Puro-CRISPRi-Vector. The CRISPRi vector was obtained from OriGene (GE100083) and used according to the manufacturer’s protocol, with minor modifications. One thousand nanograms of vector plasmid was digested with 16 U BamHI-HF (NEB) and 8 U BsmBI-v2 (NEB) in NEBuffer r3.1 (NEB) for 3 h at 37 °C. The digested plasmid was then treated with calf intestinal alkaline phosphatase (NEB) for 30 min at 37 °C and subsequently purified with a QIAquick PCR purification kit (QIAGEN). Forward and reverse sgRNA oligos at 10 µM final concentration were annealed in T4 ligation buffer (NEB) with 5 U T4 PNK (NEB). The annealing reaction was first incubated for 30 min at 37 °C, then heated to 95 °C and gradually cooled down to 25 °C at a rate of 5 °C/min. The sgRNA duplex was diluted 500 times and ligated into the digested plasmid with 400 U QuickLigase (NEB) for 2.5 h at 25 °C. The ligated plasmids were transformed into Stbl3 competent E. coli. The insert was verified by Sanger sequencing. Plasmids were then transfected into the cells using 7.5 µl Lipofectamine 3000 (Invitrogen) per 5 µg DNA. The CRISPRi vector with scramble sgRNA (GE100084) was used as a control.
CRISPR knockout
A total of 3 sgRNAs were designed, targeting the start, the end, and the interior of Klf5-MIR (chr14:99,739,099–99,739,310, mm9 genome assembly). These sgRNAs were cloned into PX459 (pSpCas9(BB)-2A-Puro (PX459) V2.0; Addgene plasmid #62988) following the published protocol. The three sgRNAs were simultaneously transfected into ESCs. Twenty-four hours after transfection, puromycin selection was applied for 1 day. Afterward, a portion of the cells was seeded sparsely into a 10-cm dish, while the remaining cells were used for genomic DNA extraction using the DNeasy Blood and Tissue kit (Qiagen). The genomic DNA was analyzed with the T7 Endonuclease I (NEB) assay, and colonies that displayed high efficiency in the T7E1 assay were manually picked and cultured in 96-well plates. Genomic DNA was harvested from the cells according to the published protocol (https://mcmanuslab.ucsf.edu/protocol/dna-isolation-es-cells-96-well-plate). For genotyping, PCR was performed using a primer pair flanking Klf5-MIR that produced a 420-bp amplicon from wildtype DNA (forward sequence 5′-GGTTAGATGGCATTGACTTT-3′ and reverse sequence 5′-CCTCTGTAAAGTTGCAATCC-3′). PCR products with the largest deletion were further analyzed by cloning using a TOPO blunt-end cloning kit (Invitrogen) and Sanger sequencing.
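To help interpret genotyping bands, the sketch below estimates the amplicon size expected if the entire Klf5-MIR element were removed. It assumes that the coordinates above are inclusive and that the deletion excises exactly the annotated element. Real Cas9-induced deletions depend on the cut sites and repair outcome, so this is a rough guide and not a predicted result.

```python
# Values taken from the text (mm9, chr14)
MIR_START, MIR_END = 99_739_099, 99_739_310
WT_AMPLICON_BP = 420

# Element length, assuming inclusive coordinates
mir_len = MIR_END - MIR_START + 1

# Amplicon expected if exactly the annotated element is deleted (idealized)
full_deletion_amplicon = WT_AMPLICON_BP - mir_len


def deletion_size(observed_bp):
    """Deleted length implied by an observed PCR band size."""
    return WT_AMPLICON_BP - observed_bp


print(mir_len, full_deletion_amplicon)
```

A band well below the 420-bp wildtype size would therefore be consistent with loss of most or all of the element, which Sanger sequencing of the cloned product can confirm.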
MTS assay
For assessment of cell proliferation, 2500 cells were seeded into each well of a 96-well plate in triplicate. Cells were allowed to proliferate for 72 h in 100 µl media, after which 20 µl CellTiter 96® AQueous One Solution Cell Proliferation Assay (Promega) was added into each well and incubated at 37 °C for 1–2 h. Absorbance at 490 nm was measured using a plate reader (Tecan).
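One common way to summarize such triplicate readings is to average the 490 nm absorbance per condition and express it relative to a control. The sketch below shows this under stated assumptions: the optional blank subtraction and the control comparison are not described in the protocol above, the function name is hypothetical, and the absorbance values are invented placeholders.

```python
from statistics import mean


def relative_proliferation(sample_a490, control_a490, blank_a490=0.0):
    """Mean A490 of the sample relative to the control.
    `blank_a490` (medium-only reading) is an optional assumption."""
    sample = mean(sample_a490) - blank_a490
    control = mean(control_a490) - blank_a490
    return sample / control


# Hypothetical triplicate readings, for illustration only
knockout = [0.60, 0.62, 0.58]
wildtype = [1.20, 1.18, 1.22]
ratio = relative_proliferation(knockout, wildtype)
print(round(ratio, 2))
```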
Data availability
Datasets generated in this study are deposited in the GEO database (GSE215205 [122] and GSE215453 [123]) and in the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD062330 [124]. Public ChIP-seq datasets reanalyzed in this study were HDAC1 and CHD4 (GSE27841), SMARCA4 (GSE90893), SIN3A (GSE117287), MBD3 (GSE39610), MTA2 (GSE112113), EP300 (GSE90893), MED1 and MED12 (GSE22562), NRF1 (GSE67867), SOX2 and POU5F1 (GSE74112), EP400 (GSE67580), BRD2 (GSE67944), NELFA (GSE20530), NANOG (GSE55404), RAD21 (GSE33346), CTCF (GSE28247), CDK8 (GSE60027), POLR2F (GSE28247), KLF5 (GSE49848), and YY1 (GSE99518). The public ChIA-PET dataset reanalyzed in this study was YY1 (GSE99520). The software used for the analysis is described in the “Methods” section. The scripts for the main steps can be found on GitHub (https://github.com/zengyingying2015/MIR_Esrrb/tree/main) [125]. Scripts were also uploaded to Zenodo (https://doi.org/10.5281/zenodo.14405913) [126]. The GitHub and Zenodo scripts are released under the MIT license.
References
Stadhouders R, Filion G, Graf T. Transcription factors and 3D genome conformation in cell-fate decisions. Nature. 2019;569(7756):345–54.
Hug C, Vaquerizas J. The birth of the 3D genome during early embryonic development. Trends Genet. 2018;34(12):903–14.
Lim P, Meshorer E. Organization of the pluripotent genome. Cold Spring Harb Perspect Biol. 2021;13(2):a040204.
Gorkin D, Leung D, Ren B. The 3D genome in transcriptional regulation and pluripotency. Cell Stem Cell. 2014;14(6):762–75.
Denholtz M, Plath K. Pluripotency in 3D: genome organization in pluripotent cells. Curr Opin Cell Biol. 2012;24(6):793–801.
Ke Y, Xu Y, Chen X, Feng S, Liu Z, Sun Y, et al. 3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis. Cell. 2017;170(2):367–81.
Du Z, Zheng H, Huang B, Ma R, Wu J, Zhang X, et al. Allelic reprogramming of 3D chromatin architecture during early mammalian development. Nature. 2017;547(7662):232–5.
Flyamer I, Gassler J, Imakaev M, Brandao H, Ulianov S, Abdennur N, et al. Single-nucleus Hi-C reveals unique chromatin reorganization at oocyte-to-zygote transition. Nature. 2017;544(7648):110–4.
Collombet S, Ranisavljevic N, Nagano T, Varnai C, Shisode T, Leung W, et al. Parental-to-embryo switch of chromosome organization in early embryogenesis. Nature. 2020;580(7801):142–6.
Gassler J, Brandão H, Imakaev M, Flyamer I, Ladstätter S, Bickmore W, et al. A mechanism of cohesin-dependent loop extrusion organizes zygotic genome architecture. EMBO J. 2017;36(24):3600–18.
Evans M, Kaufman M. Establishment in culture of pluripotential cells from mouse embryos. Nature. 1981;292(5819):154–6.
De Los Angeles A. Frontiers of pluripotency. Methods Mol Biol. 2019;2005:3–27.
Hackett J, Surani M. Regulatory principles of pluripotency: from the ground state up. Cell Stem Cell. 2014;15(4):416–30.
Yang J, Ryan D, Wang W, Tsang J, Lan G, Masaki H, et al. Establishment of mouse expanded potential stem cells. Nature. 2017;550(7676):393–7.
Loh Y, Wu Q, Chew J, Vega V, Zhang W, Chen X, et al. The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nat Genet. 2006;38(4):431–40.
Zhou Q, Chipperfield H, Melton D, Wong W. A gene regulatory network in mouse embryonic stem cells. Proc Natl Acad Sci U S A. 2007;104(42):16438.
Loh Y, Yang L, Yang J, Li H, Collins J, Daley G. Genomic approaches to deconstruct pluripotency. Annu Rev Genomics Hum Genet. 2011;12:165–85.
Joshi O, Wang S, Kuznetsova T, Atlasi Y, Peng T, Fabre P, et al. Dynamic reorganization of extremely long-range promoter-promoter interactions between two states of pluripotency. Cell Stem Cell. 2015;17(6):748–57.
Guo R, Dong X, Chen F, Ji T, He Q, Zhang J, et al. TEAD2 initiates ground-state pluripotency by mediating chromatin looping. EMBO J. 2024;43(10):1965–89.
Zhu Y, Yu J, Gu J, Xue C, Zhang L, Chen J, et al. Relaxed 3D genome conformation facilitates the pluripotent to totipotent-like state transition in embryonic stem cells. Nucleic Acids Res. 2021;49(21):12167–77.
Fueyo R, Judd J, Feschotte C, Wysocka J. Roles of transposable elements in the regulation of mammalian transcription. Nat Rev Mol Cell Biol. 2022;23(7):481–97.
Glaser J, Mundlos S. 3D or not 3D: shaping the genome during development. Cold Spring Harb Perspect Biol. 2022;14(5):a040188.
Feschotte C. Transposable elements and the evolution of regulatory networks. Nat Rev Genet. 2008;9(5):397–405.
Sundaram V, Wysocka J. Transposable elements as a potent source of diverse cis-regulatory sequences in mammalian genomes. Philos Trans R Soc Lond B Biol Sci. 2020;375(1795):20190347.
He J, Fu X, Zhang M, He F, Li W, Abdul M, et al. Transposable elements are regulated by context-specific patterns of chromatin marks in mouse embryonic stem cells. Nat Commun. 2019;10(1):34.
Yang B, El Farran C, Guo H, Yu T, Fang H, Wang H, et al. Systematic identification of factors for provirus silencing in embryonic stem cells. Cell. 2015;163(1):230–45.
Barakat T, Halbritter F, Zhang M, Rendeiro A, Perenthaler E, Bock C, et al. Functional dissection of the enhancer repertoire in human embryonic stem cells. Cell Stem Cell. 2018;23(2):276-288.e8.
Sundaram V, Cheng Y, Ma Z, Li D, Xing X, Edge P, et al. Widespread contribution of transposable elements to the innovation of gene regulatory networks. Genome Res. 2014;24(12):1963–76.
Glinsky G. Transposable elements and DNA methylation create in embryonic stem cells human-specific regulatory sequences associated with distal enhancers and noncoding RNAs. Genome Biol Evol. 2015;7(6):1432–54.
Sundaram V, Choudhary M, Pehrsson E, Xing X, Fiore C, Pandey M, et al. Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus. Nat Commun. 2017;8:14550.
Todd C, Deniz O, Taylor D, Branco M. Functional evaluation of transposable elements as enhancers in mouse embryonic and trophoblast stem cells. Elife. 2019;8:e44344.
Spitz F, Furlong E. Transcription factors: from enhancer binding to developmental control. Nat Rev Genet. 2012;13(9):613–26.
Whyte W, Orlando D, Hnisz D, Abraham B, Lin C, Kagey M, et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153(2):307–19.
Maurya S. Role of enhancers in development and diseases. Epigenomes. 2021;5(4):21.
Sun T, Xu Y, Xiang Y, Ou J, Soderblom E, Diao Y. Crosstalk between RNA m6A and DNA methylation regulates transposable element chromatin activation and cell fate in human pluripotent stem cells. Nat Genet. 2023;55(8):1324–35.
Vuong L, Pan S, Donovan P. Proteome profile of endogenous retrotransposon-associated complexes in human embryonic stem cells. Proteomics. 2019;19(15):e1900169.
Percharde M, Lin C, Yin Y, Guan J, Peixoto G, Bulut-Karslioglu A, et al. A LINE1-nucleolin partnership regulates early development and ESC identity. Cell. 2018;174(2):391-405.e19.
Kim S, Shendure J. Mechanisms of interplay between transcription factors and the 3D genome. Mol Cell. 2019;76(2):306–19.
Smit A, Riggs A. MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation. Nucleic Acids Res. 1995;23(1):98–102.
Rohrmoser M, Kluge M, Yahia Y, Gruber-Eber A, Maqbool M, Forne I, et al. MIR sequences recruit zinc finger protein ZNF768 to expressed genes. Nucleic Acids Res. 2019;47(2):700–15.
Zeng Y, Cao Y, Halevy R, Nguyen P, Liu D, Zhang X, et al. Characterization of functional transposable element enhancers in acute myeloid leukemia. Sci China Life Sci. 2020;63(5):675–87.
Jjingo D, Conley A, Wang J, Marino-Ramirez L, Lunyak V, Jordan I. Mammalian-wide interspersed repeat (MIR)-derived enhancers and the regulation of human gene expression. Mob DNA. 2014;5:14.
Smith A, Sanchez M, Follows G, Kinston S, Donaldson J, Green A, et al. A novel mode of enhancer evolution: the Tal1 stem cell enhancer recruited a MIR element to specifically boost its activity. Genome Res. 2008;18(9):1422–32.
Wang J, Vicente-Garcia C, Seruggia D, Molto E, Fernandez-Minan A, Neto A, et al. MIR retrotransposon sequences provide insulators to the human genome. Proc Natl Acad Sci U S A. 2015;112(32):E4428–37.
Cao Y, Chen G, Wu G, Zhang X, McDermott J, Chen X, et al. Widespread roles of enhancer-like transposable elements in cell identity and long-range genomic interactions. Genome Res. 2019;29(1):40–52.
Silva J, Shabalina S, Harris D, Spouge J, Kondrashov A. Conserved fragments of transposable elements in intergenic regions: evidence for widespread recruitment of MIR- and L2-derived sequences within the mouse and human genomes. Genet Res. 2003;82(1):1–18.
Carnevali D, Conti A, Pellegrini M, Dieci G. Whole-genome expression analysis of mammalian-wide interspersed repeat elements in human cell lines. DNA Res. 2017;24(1):59–69.
Jjingo D, Huda A, Gundapuneni M, Marino-Ramirez L, Jordan I. Effect of the transposable element environment of human genes on gene length and expression. Genome Biol Evol. 2011;3:259–71.
Hermant C, Torres-Padilla M. TFs for TEs: the transcription factor repertoire of mammalian transposable elements. Genes Dev. 2021;35(1–2):22–39.
Zhang X, Zhang J, Wang T, Esteban M, Pei D. Esrrb activates Oct4 transcription and sustains self-renewal and pluripotency in embryonic stem cells. J Biol Chem. 2008;283(51):35825–33.
van den Berg D, Zhang W, Yates A, Engelen E, Takacs K, Bezstarosti K, et al. Estrogen-related receptor beta interacts with Oct4 to positively regulate Nanog gene expression. Mol Cell Biol. 2008;28(19):5986–95.
Martello G, Sugimoto T, Diamanti E, Joshi A, Hannah R, Ohtsuka S, et al. Esrrb is a pivotal target of the Gsk3/Tcf3 axis regulating embryonic stem cell self-renewal. Cell Stem Cell. 2012;11(4):491–504.
Festuccia N, Osorno R, Halbritter F, Karwacki-Neisius V, Navarro P, Colby D, et al. Esrrb is a direct Nanog target gene that can substitute for Nanog function in pluripotent cells. Cell Stem Cell. 2012;11(4):4.
Festuccia N, Owens N, Navarro P. Esrrb, an estrogen-related receptor involved in early development, pluripotency, and reprogramming. FEBS Lett. 2018;592(6):852–77.
Rossello R, Pfenning A, Howard J, Hochgeschwender U. Characterization and genetic manipulation of primed stem cells into a functional naïve state with ESRRB. World J Stem Cells. 2016;8(10):355–66.
Festuccia N, Halbritter F, Corsinotti A, Gagliardi A, Colby D, Tomlinson S, et al. Esrrb extinction triggers dismantling of naïve pluripotency and marks commitment to differentiation. EMBO J. 2018;37(21):e95476.
Adachi K, Kopp W, Wu G, Heising S, Greber B, Stehling M, et al. Esrrb unlocks silenced enhancers for reprogramming to naive pluripotency. Cell Stem Cell. 2018;23(6):900–4.
Atlasi Y, Megchelenbrink W, Peng T, Habibi E, Joshi O, Wang S, et al. Epigenetic modulation of a hardwired 3D chromatin landscape in two naive states of pluripotency. Nat Cell Biol. 2019;21(5):568–78.
Festuccia N, Owens N, Chervova A, Dubois A, Navarro P. The combined action of Esrrb and Nr5a2 is essential for murine naïve pluripotency. Development. 2021;148(17):dev199604.
Weintraub A, Li C, Zamudio A, Sigova A, Hannett N, Day D, et al. YY1 is a structural regulator of enhancer-promoter loops. Cell. 2017;171(7):1573-1588.e28.
Yang J, Ryan D, Lan G, Zou X, Liu P. In vitro establishment of expanded-potential stem cells from mouse pre-implantation embryos or embryonic stem cells. Nat Protoc. 2019;14(2):350–78.
Zhang H, Khoa L, Mao F, Xu H, Zhou B, Han Y, et al. MLL1 inhibition and vitamin D signaling cooperate to facilitate the expanded pluripotency state. Cell Rep. 2019;29(9):2659-2671.e6.
Wong K, Zeng Y, Tay E, Teo J, Cipta N, Hamashima K, et al. Nuclear receptor-SINE B1 network modulates expanded pluripotency in blastoids and blastocysts. Nat Commun. 2024;15(1):10011.
Creswell K, Dozmorov M. TADCompare: an R package for differential and temporal analysis of topologically associated domains. Front Genet. 2020;11:158.
Lu J, Chang L, Li T, Wang T, Yin Y, Zhan G, et al. Homotypic clustering of L1 and B1/Alu repeats compartmentalizes the 3D genome. Cell Res. 2021;31(6):613–30.
Gilbert NLD. CORE-SINEs: eukaryotic short interspersed retroposing elements with common sequence motifs. Proc Natl Acad Sci U S A. 1999;96(6):2869–74.
Liu X, Chen Y, Zhang Y, Liu Y, Liu N, Botten G, et al. Multiplexed capture of spatial configuration and temporal dynamics of locus-specific 3D chromatin by biotinylated dCas9. Genome Biol. 2020;21(1):59.
Soprano D, Teets B, Soprano K. Role of retinoic acid in the differentiation of embryonal carcinoma and embryonic stem cells. Vitam Horm. 2007;75:69–95.
Yeo N, Chaves A, Lance-Byrne A, Chan Y, Menn D, Milanova D, et al. An enhanced CRISPR repressor for targeted mammalian gene regulation. Nat Methods. 2018;15(8):611–6.
Zhang S, Ubelmesser N, Barbieri M, Papantonis A. Enhancer-promoter contact formation requires RNAPII and antagonizes loop extrusion. Nat Genet. 2023;55(5):832–40.
Kagey M, Newman J, Bilodeau S, Zhan Y, Orlando D, van Berkum N, et al. Mediator and cohesin connect gene expression and chromatin architecture. Nature. 2010;467(7314):430–5.
Xu Y, Zhao J, Ren Y, Wang X, Lyu Y, Xie B, et al. Derivation of totipotent-like stem cells with blastocyst-like structure forming potential. Cell Res. 2022;32(6):513–29.
Wang X, Xiang Y, Yu Y, Wang R, Zhang Y, Xu Q, et al. Formative pluripotent stem cells show features of epiblast cells poised for gastrulation. Cell Res. 2021;31(5):526–41.
Kinoshita M, Barber M, Mansfield W, Cui Y, Spindlow D, Stirparo G, et al. Capture of mouse and human stem cells with features of formative pluripotency. Cell Stem Cell. 2021;28(3):453-471.e8.
Neagu A, van Genderen E, Escudero I, Verwegen L, Kurek D, Lehmann J, et al. In vitro capture and characterization of embryonic rosette-stage pluripotency between naive and primed states. Nat Cell Biol. 2020;22(5):534–45.
Yi Y, Zeng Y, Sam T, Hamashima K, Tan R, Warrier T, et al. Ribosomal proteins regulate 2-cell-stage transcriptome in mouse embryonic stem cells. Stem Cell Reports. 2023;18(2):463–74.
Faucon P, Pardee K, Kumar R, Li H, Loh Y, Wang X. Gene networks of fully connected triads with complete auto-activation enable multistability and stepwise stochastic transitions. PLoS ONE. 2014;9(7):e102873.
Bell E, Curry E, Megchelenbrink W, Jouneau L, Brochard V, Tomaz R, Mau K, Atlasi Y, de Souza R, Marks H, Stunnenberg H, Jouneau A, Azuara V. Dynamic CpG methylation delineates subregions within super-enhancers selectively decommissioned at the exit from naive pluripotency. Nat Commun. 2020;11(1):1112.
Gassler J, Kobayashi W, Gaspar I, Ruangroengkulrith S, Mohanan A, Gomez Hernandez L, et al. Zygotic genome activation by the totipotency pioneer factor Nr5a2. Science. 2022;378(6626):1305–15.
Festuccia N, Vandoemael-Pournin S, Chervova A, Geiselmann A, Langa-Vives F, Coux R, et al. Nr5a2 is dispensable for zygotic genome activation but essential for morula development. Science. 2024;386(6717):eadg7325.
Ghaleb A, Nandan M, Chanchevalap S, Dalton W, Hisamuddin I, Yang V. Krüppel-like factors 4 and 5: the yin and yang regulators of cellular proliferation. Cell Res. 2005;15(2):92–6.
van den Berg D, Snoek T, Mullin N, Yates A, Bezstarosti K, Demmers J, et al. An Oct4-centered protein interaction network in embryonic stem cells. Cell Stem Cell. 2010;6(4):369–81.
Sun F, Chronis C, Kronenberg M, Chen X, Su T, Lay F, et al. Promoter-enhancer communication occurs primarily within insulated neighborhoods. Mol Cell. 2019;73(2):250-263.e5.
Asimi V, Sampath Kumar A, Niskanen H, Riemenschneider C, Hetzel S, Naderi J, et al. Hijacking of transcriptional condensates by endogenous retroviruses. Nat Genet. 2022;54(8):1238–47.
Cho K, Spille J, Hecht M, Lee C, Li C, Grube V, et al. Mediator and RNA polymerase II clusters associate in transcription-dependent condensates. Science. 2018;361(6400):412–5.
Sabari B, Dall’Agnese A, Boija A, Klein I, Coffey E, Shrinivas K, et al. Coactivator condensation at super-enhancers links phase separation and gene control. Science. 2018;361(6400):eaar3958.
Warrier T, El Farran C, Zeng Y, Ho B, Bao Q, Zheng Z, et al. SETDB1 acts as a topological accessory to Cohesin via an H3K9me3-independent, genomic shunt for regulating cell fates. Nucleic Acids Res. 2022;50(13):7326–49.
Feng B, Jiang J, Kraus P, Ng J, Heng J, Chan Y, et al. Reprogramming of fibroblasts into induced pluripotent stem cells with orphan nuclear receptor Esrrb. Nat Cell Biol. 2009;11(2):197–203.
Fang H, El Farran C, Zing Q, Zhang L, Li H, Lim B, et al. Global H3.3 dynamic deposition defines its bimodal role in cell fate transition. Nat Commun. 2018;9(1):1537.
Wen B, Wu H, Loh Y, Briem E, Daley G, Feinberg A. Euchromatin islands in large heterochromatin domains are enriched for CTCF binding and differentially DNA-methylated regions. BMC Genomics. 2012;13:566.
Maury J, El Farran C, Ng D, Loh Y, Bi X, Bardor M, et al. RING1B O-GlcNAcylation regulates gene targeting of polycomb repressive complex 1 in human embryonic stem cells. Stem Cell Res. 2015;15(1):182–9.
Toh C, Chan J, Chong Z, Wang H, Guo H, Satapathy S, et al. RNAi reveals phase-specific global regulators of human somatic cell reprogramming. Cell Rep. 2016;15(12):2597–607.
Hamashima K, Wong K, Sam T, Teo J, Taneja R, Le M, et al. Single-nucleus multiomic mapping of m6A methylomes and transcriptomes in native populations of cells with sn-m6A-CT. Mol Cell. 2023;S1097–2765(23):00649–54.
Mitsunaga K, Araki K, Mizusaki H, Morohashi K, Haruna K, Nakagata N, et al. Loss of PGC-specific expression of the orphan nuclear receptor ERR-beta results in reduction of germ cell number in mouse embryos. Mech Dev. 2004;121(3):237–46.
Gautam P, Hamashima K, Chen Y, Zeng Y, Makovoz B, Parikh B, et al. Multi-species single-cell transcriptomic analysis of ocular compartment regulons. Nat Commun. 2021;12(1):5675.
Wang J, Sun H, Jiang M, Li J, Zhang P, Chen H, et al. Tracing cell-type evolution by cross-species comparison of cell atlases. Cell Rep. 2021;34(9):108803.
Geirsdottir L, David E, Keren-Shaul H, Weiner A, Bohlen S, Neuber J, et al. Cross-species single-cell analysis reveals divergence of the primate microglia program. Cell. 2019;179(7):1609-1622.e16.
Rao S, Huntley M, Durand N, Stamenova E, Bochkov I, Robinson J, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159(7):1665–80.
Abdennur N, Mirny L. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics. 2020;36(1):311–6.
Buenrostro J, Giresi P, Zaba L, Chang H, Greenleaf W. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 2013;10(12):1213.
Heinz S, Benner C, Spann N, Bertolino E, Lin Y, Laslo P, et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38(4):576–89.
Mumbach M, Rubin A, Flynn R, Dai C, Khavari P, Greenleaf W, et al. HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat Methods. 2016;13(11):919–22.
Ji X, Dadon D, Powell B, Fan Z, Borges-Rivera D, Shachar S, et al. 3D chromosome regulatory landscape of human pluripotent cells. Cell Stem Cell. 2016;18(2):262–75.
Durand N, Shamim M, Machol I, Rao S, Huntley M, Lander E, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3(1):95–8.
Knight P, Ruiz D. A fast algorithm for matrix balancing. IMA J Numer Anal. 2013;33(3):1029–47.
Dobin A, Davis C, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.
Zhang Y, Liu T, Meyer C, Eeckhoute J, Johnson D, Bernstein B, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137.
Ramirez F, Ryan D, Gruning B, Bhardwaj V, Kilpert F, Richter A, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44(W1):W160–5.
Loven J, Hoke H, Lin C, Lau A, Orlando D, Vakoc C, et al. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell. 2013;153(2):320–34.
Quinlan A, Hall I. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
Trapnell C, Williams B, Pertea G, Mortazavi A, Kwan G, van Baren M, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5.
Love M, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.
Robinson M, McCarthy D, Smyth G. edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40.
Kuleshov M, Jones M, Rouillard A, Fernandez N, Duan Q, Wang Z, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44(W1):W90–7.
Subramanian A, Tamayo P, Mootha V, Mukherjee S, Ebert B, Gillette M, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–50.
Zhou Y, Zhou B, Pache L, Chang M, Khodabakhshi A, Tanaseichuk O, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10(1):1523.
Szklarczyk D, Morris J, Cook H, Kuhn M, Wyder S, Simonovic M, et al. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45(D1):D362–8.
Cline M, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366–82.
Liu X, Zhang Y, Chen Y, Li M, Shao Z, Zhang M, et al. CAPTURE: in situ analysis of chromatin composition of endogenous genomic loci by biotinylated dCas9. Curr Protoc Mol Biol. 2018;123(1):e64.
Lee A, Kok Y, Lakshmanan M, Leong D, Zheng L, Lim H, et al. Multi-omics profiling of a CHO cell culture system unravels the effect of culture pH on cell growth. Biotechnol Bioeng. 2021;118(11):4305–16.
Li R, Zhong C, Yu Y, Liu H, Sakurai M, Yu L, et al. Generation of blastocyst-like structures from mouse embryonic and adult cell cultures. Cell. 2019;179(3):687–702.
Zheng Z, Zeng Y, Wong K, Cipta N, Tan J, Tay E, et al. Dmrt1 regulates mouse expanded potency and facilitates the breaking of embryonic lineage barrier. GSE215205. GEO. 2025. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE215205.
Cipta N, Zeng Y, Wong K, Zheng Z, Yi Y, Warrier T, et al. SINE-MIR enhancer rewiring by Esrrb architecturally reprogram the pluripotency spectrum. GSE215453. GEO. 2025. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE215453.
Cipta N, Zeng Y, Wong K, Zheng Z, Yi Y, Warrier T, et al. CAPTURE2.0 pull-down of SINE-MIR associated proteins in mouse embryonic stem cells. PXD062330. PRIDE. 2025. https://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD062330.
Cipta N, Zeng Y, Wong K, Zheng Z, Yi Y, Warrier T, et al. Rewiring of SINE-MIR enhancer topology and Esrrb modulation in expanded and naive pluripotency. GitHub. 2024. https://github.com/zengyingying2015/MIR_Esrrb.
Cipta N, Zeng Y, Wong K, Zheng Z, Yi Y, Warrier T, et al. Rewiring of SINE-MIR enhancer topology and Esrrb modulation in expanded and naive pluripotency. Zenodo. 2024. https://doi.org/10.5281/zenodo.14405913.
Review history
The review history is available as Additional File 9.
Peer review information
Veronique van den Berghe was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Funding
H.L. was supported by grants from the Mayo Clinic Center for Biomedical Discovery, Center for Individualized Medicine, the Mayo Clinic Comprehensive Cancer Center (NIH; P30 CA015083), the Mayo Clinic Center for Cell Signaling in Gastroenterology (NIH: P30DK084567), the Glenn Foundation for Medical Research, the Mayo Clinic Nutrition Obesity Research Program, the David F. and Margaret T. Grohne Cancer Immunology and Immunotherapy Program, the Eric & Wendy Schmidt Fund for AI Research & Innovation and the National Institutes of Health (NIH; U19 AG74879, P50 CA136393, R03OD038392). Y-H.L. was supported by the National Research Foundation, Singapore (NRF) Investigatorship award [NRFI2018-02]; National Medical Research Council [NMRC/OFIRG21nov-0088]; Singapore Food Story (SFS) R&D Programme [W22W3D0007]; A*STAR Biomedical Research Council, Central Research Fund, Use-Inspired Basic Research (CRF UIBR); Competitive Research Programme (CRP) [NRF-CRP29-2022-0005]; Industry Alignment Fund—Prepositioning (IAF-PP) [H23J2a0095, H23J2a0097]. Conflict of interest statement. None declared.
Author information
Contributions
N.O.C. performed experiments and wrote the paper. Y.Y.Z. performed the bioinformatics analyses and wrote the paper. Z.H.Z., Y.Y., T.W., T.J.Z., K.W.W., and J.H.J.T. assisted with the data acquisition and manuscript writing. Y.J.K. and X.B. assisted with the data acquisition. L.Y.C., R.T., D.O., F.G., and H.L. analyzed the data. Y.H.L. designed the study, analyzed the data, and wrote the paper. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
13059_2025_3577_MOESM1_ESM.pdf
Additional file 1: Supplementary Fig. S1-S6. Fig. S1. Characterization of EPSC and quality control of Hi-C data. Fig. S2. TE family enrichment analysis in H3K27ac+Hi-C loop anchors and H3K27ac HiChIP loop anchors. Fig. S3. Motif enrichment analysis in TE-enhancers of ESC and characterization of CAPTURE2.0 system to target MIRs. Fig. S4. Effect of Esrrb knockdown on enhancer-gene looping and ESRRB interaction with YY1. Fig. S5. CRISPR inactivation of MIR and generation of Klf5-MIR CRISPR KO line. Fig. S6. Yy1 and Esrrb knockdown experiment and sequential ChIP.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Cipta, N.O., Zeng, Y., Wong, K.W. et al. Rewiring of SINE-MIR enhancer topology and Esrrb modulation in expanded and naive pluripotency. Genome Biol 26, 107 (2025). https://doi.org/10.1186/s13059-025-03577-8
DOI: https://doi.org/10.1186/s13059-025-03577-8