Fig. 1 | Genome Biology

Fig. 1

From: Massively integrated coexpression analysis reveals transcriptional regulation, evolution and cellular implications of the yeast noncanonical translatome

Overview of the coexpression inference framework and properties of the dataset.
A Workflow: 3916 samples were analyzed to create an expression matrix for 11,630 ORFs, including 5803 cORFs and 5827 nORFs; centered log-ratio (clr) transformed expression values were used to calculate the coexpression matrix using the proportionality metric ρ, followed by normalization to correct for expression bias. The coexpression matrix was thresholded at ρ > 0.888 (top 0.2% of all pairs) to create a coexpression network. Created with BioRender.com.
B Distribution of the number of ORFs, binned by their median expression values (transcripts per million, TPM) and by the number of samples in which the ORFs were detected with at least 5 raw counts.
C Coexpressed cORF pairs (ρ > 0.888) are more likely to encode proteins that form complexes than non-coexpressed cORF pairs (Fisher’s exact test p < 2.2e−16; error bars: standard error of the proportion); annotated protein complexes were taken from ref. [67].
D Coexpressed ORF pairs (ρ > 0.888) are more likely to have their promoters bound by a common transcription factor (TF) than non-coexpressed ORF pairs (Fisher’s exact test p < 2.2e−16; error bars: standard error of the proportion); genome-wide TF binding profiles from ref. [68] and transcription start sites (TSS) from ref. [69] were analyzed to define promoter binding (see “Methods”).
E Hierarchical clustering of the coexpression matrix reveals functional enrichments for most clusters that contain at least 5 cORFs; functional enrichments were estimated by gene ontology (GO) enrichment analysis using Fisher’s exact test at a false discovery rate (FDR) < 0.05.
F Coexpression is informative for predicting the inclusion of cORFs in biological processes via a neighbor-voting scheme; 116 out of 117 GO slim biological process (GO BP) terms had a mean area under the receiver operating characteristic curve (AUROC) greater than 0.5 across 3-fold cross-validation. The dashed vertical line represents the null expectation at 0.5.
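The core of the workflow in panel A (clr transformation, pairwise ρ, thresholding) can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the commonly used definition of proportionality, ρ(x, y) = 1 − var(x − y) / (var(x) + var(y)) on clr-transformed values, it assumes strictly positive expression values (no zero handling), and it omits the normalization step for expression bias. The function names `clr`, `rho`, and `coexpression_edges` are hypothetical, as is the toy data in the usage note.

```python
import math
from statistics import mean, variance


def clr(sample):
    """Centered log-ratio transform of one sample (assumes strictly positive values)."""
    logs = [math.log(v) for v in sample]
    center = mean(logs)
    return [lv - center for lv in logs]


def rho(a, b):
    """Proportionality rho between two clr-transformed profiles across samples."""
    diff = [x - y for x, y in zip(a, b)]
    return 1.0 - variance(diff) / (variance(a) + variance(b))


def coexpression_edges(expr, threshold):
    """Return {(orf_a, orf_b): rho} for all pairs with rho above the threshold.

    expr maps an ORF name to a list of positive expression values, one per sample.
    """
    names = sorted(expr)
    n_samples = len(expr[names[0]])
    # clr is applied within each sample, across all ORFs.
    clr_cols = [clr([expr[n][s] for n in names]) for s in range(n_samples)]
    # Regroup into one clr profile per ORF, across samples.
    profiles = {
        n: [clr_cols[s][i] for s in range(n_samples)]
        for i, n in enumerate(names)
    }
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = rho(profiles[a], profiles[b])
            if r > threshold:
                edges[(a, b)] = r
    return edges
```

As a usage example, calling `coexpression_edges` with the caption's cutoff of 0.888 on a toy matrix in which one ORF is always exactly twice another keeps only that pair, since its log-ratio is constant across samples and ρ is therefore 1 up to floating-point error.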
