Fig. 2
From: PRESCOTT: a population aware, epistatic, and structural model accurately predicts missense effects

Analysis of the human MLH1 protein with PRESCOTT. The analysis focuses on the 500–756 region of the MLH1 protein, which includes 20 gnomAD missense mutations labeled in ClinVar. ClinVar lists 48 missense mutations in this MLH1 region: 16 benign/likely benign and 32 pathogenic/likely pathogenic.
A ESCOTT matrix showing all possible substitutions, from any amino acid position in the sequence to any other amino acid.
B ESCOTT matrix (A) masked to highlight scores for the 20 gnomAD missense mutations.
C gnomAD v.4.0.0 matrix displaying maximum allele frequencies for the 20 mutations across eight populations. Low-frequency mutations (cyan tones) are circled. The color scale (pink to cyan) represents high to low frequency.
D PRESCOTT matrix for the 20 gnomAD mutations in B, C. Black squares highlight the 9 mutations whose scores differ from ESCOTT. Below: score changes between ESCOTT (left colored square) and PRESCOTT (right colored square) are represented by colors. See panel F for numerical scores.
E ESCOTT scores (black circles) for the 48 ClinVar missense mutations. Only 45 points are visible because, at each of 3 positions, two mutations share the same score and overlap: L555P and L555R (Likely pathogenic, ESCOTT score 0.88), L653P and L653R (Likely pathogenic, 0.87), and R659L and R659P (Pathogenic, 0.75). The 20 gnomAD mutations in B–D are included. PRESCOTT scores (white circles) are lower than ESCOTT scores, as indicated by arrows. Compare to Fig. 7B, 7C.
F ESCOTT and PRESCOTT scores for the 9 gnomAD mutations in B–D whose scores change due to high allele frequency. Colors correspond to the scale for panels A, B, and D, shown at the bottom left.