Fig. 2

QI and differential genes in data subsets. For three large datasets (panels) in our study, we randomly sampled several smaller subsets (points) of 20 samples each to compare their number of differential genes to their respective QI indices in equally sized and sourced datasets (see the “Methods” section for details). This simulation allows us to isolate and observe the effect of a quality imbalance on the number of differential genes from the effect of any other confounding factors such as particular patient characteristics (e.g., age, gender, or ethnic group) or dataset size (number of samples). For each dataset, we can observe a positive correlation between the quality imbalance and the number of differential genes of its subsets. Gray areas indicate confidence intervals