From: Reproducible biomedical benchmarking in the cloud: lessons from crowd-sourced data challenges
| Challenge | Data types | Data cohorts | N samples | Size | Open |
|---|---|---|---|---|---|
| Digital Mammography | Human clinical; imaging | Kaiser Permanente | 80k patients (640k images) | 13 TB | No |
|  |  | MSSM | 1k (15k) | 0.3 TB | No |
|  |  | Karolinska | 69k (663k) | 13.2 TB | No |
|  |  | UCSF | 42k (500k) | 10 TB | No |
|  |  | CRUK | 7k |  | No |
|  |  | Total | 200k (1818k) | 36.5 TB |  |
| Multiple Myeloma | Human clinical; gene expr; DNAseq; Cytogenetics | MMRF | 797 | 11 GB | Yes |
|  |  | PUBLIC | 1444 | 1 GB | Yes |
|  |  | DFCI | 294 | 76 GB | No |
|  |  | UAMS | 463 | 6 GB | No |
|  |  | M2Gen | 105 | 41 GB | No |
|  |  | Total | 3103 | 135 GB |  |
| SMC-Het |  | All | 76 | 22 GB | No |
| SMC-RNA | Simulated; Human clinical; RNA-seq | Training | 31 | 290 GB | Yes |
|  |  | Test | 20 | 197 GB | Yes |
|  |  | Real | 32 | 265 GB | No |