Willie Neiswanger
@willieneis.bsky.social
Assistant Professor in CS + AI at USC. Previously at Stanford, CMU. Machine Learning, Decision Making, AI-for-Science, Generative AI, ML Systems, LLMs.

https://willieneis.github.io
Our paper also contains an in-depth discussion on safety when releasing metagenomic models.

Looking for collaborators to build on this with us — please reach out!

metagene.ai
January 7, 2025 at 8:58 PM
We leverage the ecosystem of modern LLM tooling (tokenization, model architecture, training, infra, etc.) for performance and extensibility. METAGENE-1 is standardized & easy to use.

Hugging Face: huggingface.co/metagene-ai
GitHub: github.com/metagene-ai
January 7, 2025 at 8:58 PM
METAGENE-1 shows state-of-the-art results on pathogen detection, metagenomic embedding, and other genomic tasks.

We also release new benchmarks for genomic detection and embedding (e.g., Gene-MTEB, based on MTEB for LLMs).

See our paper for details: arxiv.org/abs/2501.02045
January 7, 2025 at 8:58 PM
Our data pipeline is: human microbiome > wastewater > metagenomic sequences > tokens > training data.

Wastewater provides a rich source of data from tens of thousands of species across the human-adjacent microbiome. In total we pretrain on over 1.5T base pairs of DNA/RNA.
January 7, 2025 at 8:58 PM
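To make the "metagenomic sequences > tokens" step of the pipeline concrete, here is a toy sketch. It assumes a byte-pair-encoding style tokenizer learned over nucleotide characters, which this thread does not specify; the function names (`train_bpe`, `apply_merge`, `tokenize`) and the example reads are hypothetical and are not taken from the METAGENE-1 code.

```python
from collections import Counter


def apply_merge(seq, pair):
    """Replace each adjacent occurrence of `pair` in `seq` with one merged symbol."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def train_bpe(reads, num_merges):
    """Learn up to `num_merges` merge rules, starting from single nucleotides."""
    corpus = [list(read) for read in reads]
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across all reads.
        pairs = Counter()
        for seq in corpus:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        # Merge the most frequent pair everywhere, then repeat.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        corpus = [apply_merge(seq, best) for seq in corpus]
    return merges


def tokenize(read, merges):
    """Tokenize one read by replaying the learned merges in order."""
    seq = list(read)
    for pair in merges:
        seq = apply_merge(seq, pair)
    return seq


# Hypothetical reads, for illustration only.
reads = ["ACGTACGT", "ACGTTT"]
merges = train_bpe(reads, num_merges=2)
tokens = tokenize("ACGTAC", merges)
```

Joining the tokens always reproduces the original read, so the mapping is lossless. A production tokenizer would be trained on far more data and with a much larger vocabulary than this sketch.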
Metagenomic sequencing of wastewater produces vast amounts of data that can capture public health trends at a societal scale. Our goal is to train a model on this data to help in large-scale wastewater monitoring & detection of novel bio threats.
January 7, 2025 at 8:58 PM
Added!
December 9, 2024 at 8:40 AM