Fig. 3 | Genome Biology

Fig. 3

From: scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data

Contamination detection by scCDC in snRNA-seq and scRNA-seq datasets. A, B The plots show counts vs. rank in empty droplets in the snRNA-seq datasets from mammary glands. The top-ranked 200 and 500 genes are shown. Selected GCGs are highlighted and labeled on the plots. The dashed line separates the “super-contaminating genes” from the other genes (details in Method Appendix; see Additional file 1). C, D The distribution of GCGs deviates significantly from the negative binomial (NB) distribution. Box plots show the p-values of the NB goodness-of-fit test for GCGs and housekeeping genes. E scCDC identifies highly contaminating genes in the “barnyard” scRNA-seq dataset of mixed human 293T cells and mouse 3T3 cells. The scatter plots show the average counts vs. ranks of cross-species contaminating genes in the indicated cells. The average cross-species contaminating counts of GCGs and non-GCGs are shown in the box plots on the right. F Box plots show the contamination ratios of GCGs in the indicated datasets.
