Table 1 Datasets for evaluating all methods

From: TDFPS-Designer: an efficient toolkit for barcode design and selection in nanopore sequencing

| Dataset | Sample | Type of barcode kit | Size of barcode kit | Barcode length (bp) | Number of reads |
| --- | --- | --- | --- | --- | --- |
| S-ET_ONT12 | ETEC | ONT_EXP-NBD104 | 12 | 24 | 12,000 |
| S-ET_ONT24 | STEC | ONT_SQK-16S024 | 24 | 24 | 24,000 |
| S-ET_ONT96 | ETEC, STEC, and HS | ONT_EXP-PBC096 | 96 | 24 | 96,000 |
| M-ESH_TD795 | ETEC, STEC, and HS | Initially designed by TDFPS-Designer | 795 | 20 | 795,000 |
| M-ESH_TD1093 | ETEC, STEC, and HS | Initially designed by TDFPS-Designer | 1093 | 24 | 1,093,000 |
| M-ESH_TD2120 | ETEC, STEC, and HS | Initially designed by TDFPS-Designer | 2120 | 30 | 2,120,000 |
| L-ESH_TD137 | ETEC, STEC, and HS | Finally designed by TDFPS-Designer | 137 | 20 | 691,850 |
| L-ESH_TD410 | ETEC, STEC, and HS | Finally designed by TDFPS-Designer | 410 | 24 | 2,070,500 |
| L-ESH_TD1779 | ETEC, STEC, and HS | Finally designed by TDFPS-Designer | 1779 | 30 | 8,983,950 |

  1. ETEC stands for Enterotoxigenic Escherichia coli, STEC for Shiga toxin-producing Escherichia coli, and HS for Historical Shigella. In “ONT_EXP-NBD104,” “ONT” indicates that the kit was designed by ONT, and “EXP-NBD104” is the kit name; “ONT_SQK-16S024” and “ONT_EXP-PBC096” follow the same naming convention. L-ESH_TD137, L-ESH_TD410, and L-ESH_TD1779 each include 1% negative samples that were not successfully barcoded. These negative samples serve as a “noise class.”
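
The read counts in Table 1 appear to follow simple arithmetic, which the short Python check below makes explicit. It is a sketch under assumptions, not the authors' procedure. It assumes that every barcode in a kit contributes the same number of reads, and that for the L-ESH datasets the 1% negative samples are counted on top of the barcoded reads (total = barcoded reads × 1.01). The source states neither point; both are inferred from the numbers in the table. The names `TABLE_1`, `NOISE_PERCENT`, and `implied_reads_per_barcode` are illustrative only.

```python
# Rows copied from Table 1: (dataset, size of barcode kit, number of reads).
TABLE_1 = [
    ("S-ET_ONT12", 12, 12_000),
    ("S-ET_ONT24", 24, 24_000),
    ("S-ET_ONT96", 96, 96_000),
    ("M-ESH_TD795", 795, 795_000),
    ("M-ESH_TD1093", 1093, 1_093_000),
    ("M-ESH_TD2120", 2120, 2_120_000),
    ("L-ESH_TD137", 137, 691_850),
    ("L-ESH_TD410", 410, 2_070_500),
    ("L-ESH_TD1779", 1779, 8_983_950),
]

# The footnote states that the L-ESH datasets include 1% negative samples.
NOISE_PERCENT = 1


def implied_reads_per_barcode(dataset, kit_size, total_reads):
    """Return the reads per barcode implied by one table row.

    Assumption (not stated in the source): for L-ESH datasets the negative
    samples are added on top of the barcoded reads, so
    total_reads = barcoded_reads * (100 + NOISE_PERCENT) / 100.
    Raises ValueError if the row does not divide evenly under that assumption.
    """
    barcoded = total_reads
    if dataset.startswith("L-ESH"):
        # Integer arithmetic avoids floating-point rounding.
        barcoded, remainder = divmod(total_reads * 100, 100 + NOISE_PERCENT)
        if remainder:
            raise ValueError(f"{dataset}: total is not a whole multiple of 1.01")
    per_barcode, remainder = divmod(barcoded, kit_size)
    if remainder:
        raise ValueError(f"{dataset}: reads do not divide evenly across barcodes")
    return per_barcode


# Map each dataset name to its implied reads-per-barcode value.
IMPLIED = {name: implied_reads_per_barcode(name, size, reads)
           for name, size, reads in TABLE_1}

for name, per_barcode in IMPLIED.items():
    print(f"{name}: {per_barcode} reads per barcode")
```

Under these assumptions every row divides evenly: the S- and M- datasets come out at 1,000 reads per barcode and the L-ESH datasets at 5,000 barcoded reads per barcode plus the 1% noise class.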