Table 2. Summary of the features used in models trained to predict mRNA half-life, giving for each category a description, the feature type (sequence-derived or biochemical), the data source, and the number of features. Sequence-derived features were calculated identically for human and mouse; biochemical features were calculated only from human data.

From: The genetic and biochemical determinants of mRNA degradation rates in mammals

| Feature description | Feature type | Data source | Number of features |
| --- | --- | --- | --- |
| Basic mRNA features such as length and G/C content of 5′ UTR, ORF, and 3′ UTR; intron length; ORF exon junction density | Sequence-derived | Custom Perl scripts | 8 |
| Codon frequencies (excluding stop codons) | Sequence-derived | Custom R scripts | 61 |
| 1- to 7-mer frequencies in the 5′ UTR | Sequence-derived | Custom R scripts | 21,844 |
| 1- to 7-mer frequencies in the ORF | Sequence-derived | Custom R scripts | 21,844 |
| 1- to 7-mer frequencies in the 3′ UTR | Sequence-derived | Custom R scripts | 21,844 |
| Target predictions for mammalian microRNAs | Sequence-derived | TargetScan predictions [24] | 319 |
| Predicted average binding score of human RBPs (in each of 5′ UTR, ORF, and 3′ UTR) | Sequence-derived | DeepRiPe predictions [68] | 177 |
| Predicted average binding score of human and mouse RBPs (in each of 5′ UTR, ORF, and 3′ UTR) | Sequence-derived | SeqWeaver predictions [69] | 780 |
| Number of CLIP peaks for various RBPs | Biochemical | ENCORI database [70] | 133 |
| Number of eCLIP peaks for various RBPs (K562, HepG2, and adrenal gland) | Biochemical | ENCORE database [71] | 225 |
| Number of PAR-CLIP peaks | Biochemical | [72] | 146 |
| Number of CLIP peaks for m6A pathway components (m6A, m6Am, YTHDF2, METTL3, METTL14, and WTAP) | Biochemical | [37, 47, 73–77] | 13 |
| RIP-seq of diverse RBPs and translational efficiency | Biochemical | [36, 78–80] | 34 |
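The feature counts for the k-mer and codon rows follow from simple enumeration: there are 4^k possible k-mers of each length k from 1 to 7, and 64 codons minus 3 stop codons. The Python sketch below illustrates this. It is not the authors' code (the table lists custom R scripts), and the function names, the handling of ambiguous bases, and the normalization by window count are assumptions made for illustration.

```python
from itertools import product

BASES = "ACGT"


def kmer_vocabulary(max_k=7):
    """Return all k-mers of length 1..max_k over A, C, G, T in a fixed order."""
    return ["".join(p) for k in range(1, max_k + 1) for p in product(BASES, repeat=k)]


def kmer_frequencies(seq, max_k=7):
    """Map every k-mer (k = 1..max_k) to its frequency in seq.

    Assumption for this sketch: a frequency is the k-mer's count divided by
    the number of length-k windows in the sequence.
    """
    seq = seq.upper().replace("U", "T")
    freqs = dict.fromkeys(kmer_vocabulary(max_k), 0.0)
    for k in range(1, max_k + 1):
        windows = len(seq) - k + 1
        if windows <= 0:
            continue  # sequence shorter than k: all k-mers of this length stay at 0
        for i in range(windows):
            kmer = seq[i:i + k]
            if kmer in freqs:  # windows containing N or other ambiguity codes are skipped
                freqs[kmer] += 1.0 / windows
    return freqs


# Feature counts implied by the table rows
n_kmer_features = sum(4 ** k for k in range(1, 8))   # 4 + 16 + ... + 16,384
STOP_CODONS = {"TAA", "TAG", "TGA"}
n_codon_features = 4 ** 3 - len(STOP_CODONS)         # 64 codons minus 3 stops
```

Applied separately to the 5′ UTR, ORF, and 3′ UTR of a transcript, a function like `kmer_frequencies` would yield one 21,844-element vector per region, which matches the three k-mer rows of the table.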