phd'ing in ml@usc; prev. ml@cmu, msr
multimodal foundation models, ai4sci, decision making
Special shout-out to the Nucleic Acid Observatory for the sequencing data, and @PrimeIntellect for compute support.
📄Paper: metagene.ai/metagene-1-p...
🌐Website: metagene.ai
🤗Model weights: huggingface.co/metagene-ai
🧵7/
🧵6/
- Pathogen detection
- Genomic embedding benchmarks
- Generalization to multi-species tasks
It already shows promise in public health and biosurveillance, and we are collaborating with experts to unlock its full impact.
🧵5/
🧵4/
- Brand-new dataset collected with experts from Southern California & Missouri
- 1.5 trillion base pairs from diverse wastewater samples
- Short reads (100–300 bp), deep sequencing at scale
- Byte-Pair Encoding customized for genomic sequences
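For readers new to Byte-Pair Encoding on DNA, here is a minimal Python sketch of the general idea: start from single nucleotides and repeatedly merge the most frequent adjacent token pair. It is a toy illustration only, not the tokenizer used for this model; the function names, the tie-breaking rule, and the example reads are all assumptions made for the sketch.

```python
from collections import Counter


def apply_merge(tokens, pair):
    """Replace every adjacent occurrence of `pair` with one merged token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe_merges(reads, num_merges):
    """Learn up to `num_merges` merge rules from a list of DNA reads (hypothetical helper)."""
    # Every read starts as a list of single-nucleotide tokens.
    corpus = [list(read) for read in reads]
    merges = []
    for _ in range(num_merges):
        # Count adjacent token pairs across all reads.
        pair_counts = Counter()
        for tokens in corpus:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        # Most frequent pair wins; ties are broken alphabetically
        # (an assumption made here so the result is deterministic).
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merges.append(best)
        corpus = [apply_merge(tokens, best) for tokens in corpus]
    return merges


def tokenize(read, merges):
    """Tokenize a read by replaying the learned merges in order."""
    tokens = list(read)
    for pair in merges:
        tokens = apply_merge(tokens, pair)
    return tokens


# Made-up example reads, far shorter than real 100–300 bp reads.
example_merges = learn_bpe_merges(["ACGTACGT", "ACGTTT"], num_merges=3)
example_tokens = tokenize("ACGTAC", example_merges)
```

A production tokenizer would be trained on a large sample of reads with a much bigger vocabulary, but the merge loop is the same in spirit.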
🧵3/
🌐Website: metagene.ai
🧵2/