Fig. 2
From: TEMPTED: time-informed dimensionality reduction for longitudinal microbiome studies

TEMPTED outperforms CTF, microTensor, TCAM, FTSVD, and PCoA in identifying group structures. TEMPTED reduces host-level phenotype discriminatory error by more than 50% compared to methods capable of handling missing time points (a, c) and improves sample-level group discriminatory power (b). Sample-level discriminatory power is quantified by the PERMANOVA pseudo F-statistic based on the sample-level Euclidean distance constructed using the first two components from each method (b) (for TEMPTED, see Eq. (2)). Host-level group classification error is quantified by 1 - AUC-PR, where AUC-PR is the area under the precision-recall curve, with the first two components of each method as predictors. Both logistic regression and random forest classifiers were employed, and results for both are shown in a; results from the better of the two classifiers are shown in c. Dimension reduction is performed in-sample and out-of-sample (see the “Methods” section), and group labels are predicted using leave-one-out for logistic regression and out-of-bag for random forest. The methods were applied to two datasets: the ECAM infant fecal microbiome data (a, b), which distinguishes between infants delivered vaginally (N-subject = 23) and by cesarean section (N-subject = 17), and the FARMM dataset (c), which distinguishes between EEN diet (N-subject = 10) and vegetarian or omnivore diet (N-subject = 20). Error bars represent 1.96 standard errors. For the ECAM-based simulation, we randomly choose a given number of samples from each subject such that CTF, microTensor, FTSVD, and TCAM can use the order of the infant age as the time variable to form a tensor with no missing values, while TEMPTED uses the infant age as is. For the FARMM-based simulation, we randomly drop samples from 15 time points to achieve different percentages of missingness, which CTF and microTensor can manage but TCAM and FTSVD cannot.
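As an illustration of the sample-level metric in panel b, the following is a minimal sketch of a PERMANOVA pseudo F-statistic computed from Euclidean distances between two-component sample scores. It is not the authors' code: the function name `pseudo_f` and the toy scores are hypothetical, and the permutation test that PERMANOVA uses to obtain a p-value is omitted.

```python
from itertools import combinations
from math import dist


def pseudo_f(points, labels):
    """PERMANOVA pseudo F-statistic from Euclidean distances.

    points: list of coordinate tuples (e.g. first two components per sample)
    labels: group label for each point (hypothetical input format)
    """
    n = len(points)
    groups = sorted(set(labels))
    a = len(groups)

    # Total sum of squares: sum of squared pairwise distances divided by n.
    ss_total = sum(
        dist(points[i], points[j]) ** 2 for i, j in combinations(range(n), 2)
    ) / n

    # Within-group sum of squares: same quantity per group, then summed.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dist(points[i], points[j]) ** 2 for i, j in combinations(idx, 2)
        ) / len(idx)

    # Between-group sum of squares is the remainder.
    ss_between = ss_total - ss_within
    return (ss_between / (a - 1)) / (ss_within / (n - a))


# Toy example: two well-separated groups of two samples each.
scores = [(0, 0), (0, 2), (4, 0), (4, 2)]
groups = ["A", "A", "B", "B"]
f_value = pseudo_f(scores, groups)  # approximately 8.0
```

A larger value indicates that the between-group distances dominate the within-group distances, which is the sense in which a method has higher sample-level discriminatory power.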
EMBED was not included in the benchmarking because it does not provide host-level or sample-level beta diversity analysis. Different numbers of reads per sample are obtained by resampling the reads in each sample.
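The read resampling mentioned above could be sketched as follows. The caption does not say whether reads are drawn with or without replacement, so this example assumes sampling without replacement to a fixed depth; the function name `subsample_reads`, the `seed` parameter, and the toy counts are hypothetical.

```python
import random


def subsample_reads(counts, depth, seed=0):
    """Draw `depth` reads without replacement from one sample's count vector.

    counts: list of non-negative integer read counts, one per feature
    depth:  number of reads to keep (assumed not to exceed the sample total)
    """
    total = sum(counts)
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")

    rng = random.Random(seed)

    # Expand the count vector into one feature index per read.
    reads = [i for i, c in enumerate(counts) for _ in range(c)]

    # Pick `depth` reads without replacement and tally them per feature.
    picked = rng.sample(reads, depth)
    resampled = [0] * len(counts)
    for i in picked:
        resampled[i] += 1
    return resampled


# Toy example: reduce a sample with 100 reads to a depth of 20.
original = [50, 30, 15, 5, 0]
shallow = subsample_reads(original, depth=20)
```

Sampling without replacement guarantees that no feature ends up with more reads than it had originally; a multinomial draw with replacement would be an alternative assumption.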