Ollie Liu
@oliu-io.bsky.social
https://ollieliu.com/; oliver irl
phd'ing in ml@usc; prev. ml@cmu, msr
multimodal foundation models, ai4sci, decision making
Thanks to my amazing collaborators: @samsja19.bsky.social, Johannes Hagemann, @shangshang-wang.bsky.social, Jason Wiemels, Jeff Kaufman, and @willieneis.bsky.social
Special shout out to the Nucleic Acid Observatory for the sequencing data, and @PrimeIntellect for compute support.
January 6, 2025 at 5:04 PM
We’re sharing METAGENE-1’s:
📄Paper: metagene.ai/metagene-1-p...
🌐Website: metagene.ai
🤗Model weights: huggingface.co/metagene-ai
🧵7/
January 6, 2025 at 5:04 PM
🛡Tailored for detection, not design. We scoped METAGENE-1 to minimize risks while maximizing potential for public health and biosurveillance. Responsible open-sourcing matters. With open weights, we aim to drive progress in interpretability and safe genomics research.
🧵6/
January 6, 2025 at 5:04 PM
📈METAGENE-1 achieves state-of-the-art results in:
- Pathogen detection
- Genomic embedding benchmarks
- Generalization to multi-species tasks
It already shows promise in public health and biosurveillance, and we are collaborating with experts to unlock its full impact.
🧵5/
January 6, 2025 at 5:04 PM
The METAGENE-1 model is a 7B-parameter Llama-style transformer 🦙, pretrained and optimized for anomaly detection, embedding, and multi-species genomics. Fully compatible with 🤗Hugging Face (huggingface.co/metagene-ai) – ready to use like any of your favorite LLMs!
🧵4/
January 6, 2025 at 5:04 PM
📊The data behind METAGENE-1:
- Brand-new dataset collected with experts from Southern California & Missouri
- 1.5 trillion base pairs from diverse wastewater samples
- Short reads (100–300 bp), deep sequencing at scale
- Byte-Pair Encoding customized for genomic sequences
🧵3/
January 6, 2025 at 5:04 PM
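For readers curious what "Byte-Pair Encoding customized for genomic sequences" can look like in practice, here is a toy sketch of generic BPE applied to DNA reads. It is not the METAGENE-1 tokenizer: the function names (`learn_bpe_merges`, `apply_merge`, `tokenize`) and the example reads are made up for illustration, and the real vocabulary size and training details are not given in this thread.

```python
from collections import Counter


def apply_merge(tokens, pair):
    """Replace every adjacent occurrence of `pair` in `tokens` with one merged token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe_merges(reads, num_merges):
    """Learn up to `num_merges` merge rules from a list of DNA reads (hypothetical helper)."""
    # Start from single-nucleotide tokens: A, C, G, T.
    corpus = [list(read) for read in reads]
    merges = []
    for _ in range(num_merges):
        # Count adjacent token pairs across all reads.
        pair_counts = Counter()
        for tokens in corpus:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        # Most frequent pair wins; ties are broken alphabetically so the result is deterministic.
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merges.append(best)
        corpus = [apply_merge(tokens, best) for tokens in corpus]
    return merges


def tokenize(read, merges):
    """Tokenize a read by replaying the learned merges in order."""
    tokens = list(read)
    for pair in merges:
        tokens = apply_merge(tokens, pair)
    return tokens


if __name__ == "__main__":
    # Made-up reads, far shorter than real 100-300 bp reads.
    toy_reads = ["ACGTACGT", "ACGACG"]
    toy_merges = learn_bpe_merges(toy_reads, num_merges=2)
    print(toy_merges)
    print(tokenize("ACGT", toy_merges))
```

The idea is that frequent nucleotide substrings collapse into single tokens, so a read is represented by fewer tokens than one per base.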
Why is METAGENE-1 special? 🤔 We trained it on wastewater metagenomics, capturing the human-adjacent microbiome across the US for the past 12 months. This unlocks powerful capabilities for early pathogen detection and for understanding microbial ecosystems. 🌱🦠
🌐Website: metagene.ai
🧵2/
January 6, 2025 at 5:04 PM
👋 nlp@usc student. thanks!
November 25, 2024 at 3:56 AM
yes please if there's still space left :-P
November 18, 2024 at 5:40 AM
our border collie pup doodle absolutely wants nothing from that plate of banana :-P
November 18, 2024 at 4:24 AM