- Brand-new dataset collected with experts from Southern California & Missouri
- 1.5 trillion base pairs from diverse wastewater samples
- Short reads (100–300 BPs), deep sequencing at scale
- Byte-Pair Encoding customized for genomic sequences
🧵3/
- Brand-new dataset collected with experts from Southern California & Missouri
- 1.5 trillion base pairs from diverse wastewater samples
- Short reads (100–300 BPs), deep sequencing at scale
- Byte-Pair Encoding customized for genomic sequences
🧵3/