Fig. 2 | Genome Biology

Fig. 2

From: Overlooked poor-quality patient samples in sequencing data impair reproducibility of published clinically relevant datasets


QI and differential genes in data subsets. For three large datasets in our study (panels), we randomly sampled several smaller subsets (points) of 20 samples each and compared the number of differential genes in each subset with its quality imbalance (QI) index, so that all subsets of a dataset are equal in size and source (see the “Methods” section for details). This simulation lets us separate the effect of quality imbalance on the number of differential genes from the effect of other confounding factors, such as patient characteristics (e.g., age, gender, or ethnic group) or dataset size (number of samples). In each dataset, the number of differential genes in a subset correlates positively with its quality imbalance. Gray areas indicate confidence intervals.
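
The subsampling design described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the default number of subsets, and the use of a Pearson coefficient are assumptions, and the QI index and the differential gene count are passed in as caller-supplied functions because their definitions are given only in the “Methods” section. Subsets are drawn by plain random sampling; any stratification used in the study is not reproduced here.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def subsample_correlation(samples, qi_index, count_differential_genes,
                          n_subsets=50, subset_size=20, seed=0):
    """Draw equally sized random subsets from one dataset and relate
    each subset's QI index to its number of differential genes.

    qi_index and count_differential_genes are hypothetical callables
    that take a list of samples and return a number; they stand in
    for the procedures defined in the paper's Methods.
    """
    rng = random.Random(seed)  # fixed seed for reproducible subsets
    qi_values, gene_counts = [], []
    for _ in range(n_subsets):
        subset = rng.sample(samples, subset_size)  # without replacement
        qi_values.append(qi_index(subset))
        gene_counts.append(count_differential_genes(subset))
    return qi_values, gene_counts, pearson(qi_values, gene_counts)
```

Because every subset comes from the same dataset and has the same size, differences between points in one panel cannot be attributed to dataset source or sample count, which is the isolation argument the caption makes.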
