Kexin Huang
kexinhuang.bsky.social
PhD student at Stanford CS; AI+Biomedicine
kexinhuang.com
With amazing collaborators
Serena Zhang Hanchen Wang Yuanhao Qu Minta Lu, PhD Yusuf Roohani Ryan Li Lin Qiu Gavin L. Junze Di Shruti Jennefer Xin Zhou Matthew Wheeler Jon Bernstein Mengdi Wang Peng He Michael Snyder Le Cong Aviv Regev Jure Leskovec
May 29, 2025 at 5:26 PM
🌍 Biomni is an open-source initiative: we invite the community to build on it and advance biomedical research at scale.

🧪 Try it now: biomni.stanford.edu
📄 Paper: biomni.stanford.edu/paper.pdf
💻 Code: github.com/snap-stanfor... (will be fully open-sourced very soon!)
Biomni - A General-Purpose Biomedical AI Agent
A general-purpose biomedical AI agent to automate biomedical research.
biomni.stanford.edu
May 29, 2025 at 5:26 PM
🔧 We built a web platform where biomedical scientists can delegate their tasks to the agent today, completely free!
May 29, 2025 at 5:26 PM
Powered by:
🧪 Biomni-E1 – the first unified environment for biomedical agents, w/ 150 tools, 59 databases, and 106 software packages, systematically curated by mining 2.5K bioRxiv papers
🧠 Biomni-A1 – a generalist agent architecture with retrieval, planning + code as action
May 29, 2025 at 5:26 PM
🧫 Human-level performance on LAB-bench DbQA and SeqQA, with SOTA across 8 new biomedical tasks—ranging from GWAS and rare disease diagnosis to microbiology and drug repurposing.
May 29, 2025 at 5:26 PM
Key results:
🔬 Designed a cloning experiment with real-world wet-lab validation; on par with an expert with 5+ years of experience in a blind test
📊 Ran a 458-file wearable bioinformatics analysis in 35 min vs. 3 weeks for a human expert
🧠 Uncovered novel TFs regulating skeletal lineages on scRNA+scATAC data
May 29, 2025 at 5:26 PM
A huge shout out to the amazing team!
Tony Zeng, Soner Koc, Alexandra Pettet, Jingtian Zhou, Mika Jain, Dongbo Sun, Camilo Ruiz, Hongyu Ren, Laurence Howe, Tom Richardson, Adrian Cortes, Katie Aiello, Kim Branson, Andreas Pfenning, Jesse Engreitz, Martin Zhang, Jure Leskovec
December 9, 2024 at 5:42 PM
15/15 Paper: medrxiv.org/content/10.1...
Everything is open-sourced at github.com/snap-stanfor...
Talk at Stanford Graph Learning workshop: youtu.be/0_jdg7FqSE4?...

Also, happy to share that KGWAS-preview won the Best Poster award at Stanford Bio-X and the Reviewer's Choice award at ASHG!
Small-cohort GWAS discovery with AI over massive functional genomics knowledge graph
Genome-wide association studies (GWASs) have identified tens of thousands of disease associated variants and provided critical insights into developing effective treatments. However, limited sample si...
medrxiv.org
December 9, 2024 at 5:42 PM
14/15 We built a human-AI interaction UI, available at kgwas.stanford.edu, where geneticists can explore novel KGWAS variants, genes, cells, and networks across a wide range of diseases!
December 9, 2024 at 5:42 PM
13/15 KGWAS with MAGMA prioritized disease genes with a 28.3% higher replication rate than GWAS-genes, aligning well with drug targets across 15 diseases. Combined with scDRS, KGWAS identified 44.3% more disease-critical cells across 120 cell types, including a novel B-cell cluster linked to asthma.
December 9, 2024 at 5:42 PM
12/15 Using a graph XAI method, KGWAS learns disease-specific networks where retained edges explain GWAS signals. These networks align with simulations, expert annotations, and Perturb-seq data. We further applied this approach to Alzheimer’s disease to interpret variant mechanisms!
December 9, 2024 at 5:42 PM
11/15 GWAS is fundamental, and many works build on GWAS sumstats. Could improvements in KGWAS sumstats translate to downstream GWAS tasks? We illustrate this by using KGWAS for variant interpretation, disease gene prioritization, and disease-critical cell population detection!
December 9, 2024 at 5:42 PM
10/15 KGWAS revealed novel associations with strong functional evidence! For example, rs2155219 (11q13) was linked to ulcerative colitis via LRRC32 regulation in CD4+ regulatory T cells, and rs7312765 (12q12) to myasthenia gravis via PPHLN1 regulation in neuron-related cells.
December 9, 2024 at 5:42 PM
9/15 We then applied KGWAS to 544 uncommon diseases in the UK Biobank (<5K cases) and found 184 more associations (46.9% more)! If we zoom in on 144 rare diseases, the gain rises to 79.8%, and KGWAS also found at least one hit for 92 diseases with zero hits in GWAS!
December 9, 2024 at 5:42 PM
8/15 To study small-cohort GWAS, we downsampled 21 UK Biobank traits (N=374K) to 1–10K samples, applied GWAS methods, and measured replication in the full cohort. KGWAS doubled GWAS power at 1K samples and proved more data-efficient, requiring up to 2.6X fewer samples to match variant associations!
December 9, 2024 at 5:42 PM
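To make the downsampling evaluation in 8/15 concrete, here is a minimal sketch of one way to measure replication: take the hits found in the small cohort and check what fraction are also significant in the full cohort. The function name, the variant IDs, and the choice of 5e-8 for both thresholds are illustrative assumptions, not the exact protocol from the paper.

```python
def replication_rate(small_cohort_p, full_cohort_p,
                     discovery_alpha=5e-8, replication_alpha=5e-8):
    """Return (number of small-cohort hits, fraction replicated in the full cohort).

    Both arguments map a variant ID to its p-value. The thresholds are
    assumptions made for illustration only.
    """
    hits = [v for v, p in small_cohort_p.items() if p <= discovery_alpha]
    if not hits:
        return 0, 0.0
    # A hit replicates if the same variant is significant in the full cohort.
    replicated = [v for v in hits if full_cohort_p.get(v, 1.0) <= replication_alpha]
    return len(hits), len(replicated) / len(hits)


# Hypothetical p-values for three made-up variants.
small = {"rs_a": 1e-9, "rs_b": 3e-8, "rs_c": 0.2}
full = {"rs_a": 1e-30, "rs_b": 1e-3, "rs_c": 1e-12}
n_hits, rate = replication_rate(small, full)
print(n_hits, rate)
```

Here rs_a and rs_b are small-cohort hits, but only rs_a is significant in the full cohort, so half of the hits replicate.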
7/15 Extensive null and causal simulations under various configurations (heritability, number of causal variants) establish that KGWAS is well-calibrated in false discovery control and also has significantly improved power!
December 9, 2024 at 5:42 PM
6/15 For each disease, we train a GNN to propagate GWAS signals across the KG, using predicted variant associations as priors that capture local GWAS signal concentration, guided by an LD-aware loss. Adjusted p-values are then computed using Genovese et al.’s covariate-based weighting framework.
December 9, 2024 at 5:42 PM
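As a rough illustration of the p-value weighting idea in 6/15, the sketch below applies a weighted Bonferroni rule: each variant gets a non-negative weight, the weights are rescaled to average 1, and variant i is rejected when p_i / w_i <= alpha / m. How KGWAS actually turns GNN predictions into weights is not described in this thread, so `prior_scores`, the function name, and the numbers are all assumptions.

```python
def weighted_bonferroni(p_values, prior_scores, alpha=0.05):
    """Weighted Bonferroni: reject variant i when p_i / w_i <= alpha / m.

    prior_scores are hypothetical non-negative scores (e.g. from a predictive
    model); they are rescaled so the weights average to 1.
    """
    m = len(p_values)
    if m != len(prior_scores):
        raise ValueError("p_values and prior_scores must have the same length")
    if any(s < 0 for s in prior_scores):
        raise ValueError("prior scores must be non-negative")
    total = sum(prior_scores)
    # Fall back to uniform weights if every score is zero.
    weights = [1.0] * m if total == 0 else [s * m / total for s in prior_scores]
    # Zero-weight variants can never be rejected, so their weighted p is 1.
    weighted_p = [min(1.0, p / w) if w > 0 else 1.0
                  for p, w in zip(p_values, weights)]
    rejected = [q <= alpha / m for q in weighted_p]
    return weights, weighted_p, rejected


p_values = [0.02, 0.02, 0.5, 0.5]
prior_scores = [3.0, 0.5, 0.25, 0.25]  # made-up priors favoring the first variant
weights, weighted_p, rejected = weighted_bonferroni(p_values, prior_scores)
print(rejected)
```

With uniform weights, neither variant at p = 0.02 would pass the 0.05 / 4 threshold. Upweighting the first variant lets it through, at the cost of a stricter bar for the others.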
5/15 We aim to build an AI model integrating multi-modal functional genomics data with GWAS. Using a knowledge graph, we mapped variants-to-genes-to-programs with over 10M nodes, 55 relation types, and rich annotations like PoPS, baselineLD, and embeddings from models like Enformer and ESM.
December 9, 2024 at 5:42 PM
4/15 How can we uncover novel associations for small-cohort diseases? Emerging functional genomics data (e.g., QTLs, ABC, Hi-C, PPI, Perturb-seq) provide an increasingly clear picture of the cellular functions of a variant. Leveraging this data could intuitively enhance GWAS power!
December 9, 2024 at 5:42 PM
3/15 However, GWAS requires huge sample sizes, while many diseases of interest, especially rare and uncommon ones, have small cohorts (~hundreds to a few thousand) and therefore few significant associations. Yet these are often the diseases with the greatest unmet therapeutic need!
December 9, 2024 at 5:42 PM
2/15 GWAS performs statistical tests to identify significant genetic variants that are associated with disease (e.g., P<5e-8) by scanning across individuals in large populations. GWAS is crucial—genetics-backed drugs are 2.6X more likely to succeed in trials!
December 9, 2024 at 5:42 PM
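For readers new to GWAS, here is a minimal sketch of the per-variant test described in 2/15, on simulated data. It uses a simple approximation (N times the squared genotype-phenotype correlation, compared against a chi-square with 1 degree of freedom) and no covariates. Real GWAS pipelines use regression or mixed models with covariates, and the sample size, allele frequency, and effect size below are made-up assumptions.

```python
import math
import random

GENOME_WIDE_ALPHA = 5e-8  # genome-wide significance threshold quoted in the post


def association_p_value(genotypes, phenotypes):
    """Approximate 1-df chi-square association test for a single variant.

    The statistic is N * r^2, where r is the Pearson correlation between
    genotype dosage (0/1/2) and the phenotype.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_y = sum((y - mean_y) ** 2 for y in phenotypes)
    if var_g == 0 or var_y == 0:
        return 1.0  # no variation in genotype or trait: no evidence
    chi2 = n * cov * cov / (var_g * var_y)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def simulate_genotypes(rng, n, allele_freq):
    """Draw dosages (0, 1, or 2 copies of the allele) for n individuals."""
    return [sum(rng.random() < allele_freq for _ in range(2)) for _ in range(n)]


rng = random.Random(0)
n = 5000
causal = simulate_genotypes(rng, n, 0.3)
null = simulate_genotypes(rng, n, 0.3)
trait = [0.5 * g + rng.gauss(0, 1) for g in causal]  # only `causal` affects the trait

p_causal = association_p_value(causal, trait)
p_null = association_p_value(null, trait)
print(p_causal < GENOME_WIDE_ALPHA, p_null)
```

A full scan repeats this test for millions of variants, which is why the significance threshold is so strict and why small cohorts struggle to reach it.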