Kexin Huang
kexinhuang.bsky.social
PhD student at Stanford CS; AI+Biomedicine
kexinhuang.com
📢 Meet Biomni — the first general-purpose biomedical AI agent.

It automates literature review, hypothesis generation, protocol design, bioinformatics analysis, clinical decision support, and much more — scaling biomedical expertise for 100× discoveries.

biomni.stanford.edu

🧵
May 29, 2025 at 5:26 PM
14/15 We built a human-AI interaction UI, available at kgwas.stanford.edu, where geneticists can explore novel KGWAS variants, genes, cells, and networks across a wide range of diseases!
December 9, 2024 at 5:42 PM
13/15 KGWAS with MAGMA prioritized disease genes with a 28.3% higher replication rate than GWAS-genes, aligning well with drug targets across 15 diseases. Combined with scDRS, KGWAS identified 44.3% more disease-critical cells across 120 cell types, including a novel B-cell cluster linked to asthma.
December 9, 2024 at 5:42 PM
12/15 Using a graph XAI method, KGWAS learns disease-specific networks where retained edges explain GWAS signals. These networks align with simulations, expert annotations, and Perturb-seq data. We further applied this approach to Alzheimer’s disease to interpret variant mechanisms!
December 9, 2024 at 5:42 PM
11/15 GWAS is fundamental, and many methods build on GWAS sumstats. Could improved KGWAS sumstats translate to downstream GWAS tasks? We illustrate this by using KGWAS for variant interpretation, disease gene prioritization, and disease-critical cell population detection!
December 9, 2024 at 5:42 PM
10/15 KGWAS revealed novel associations with strong functional evidence! For example, rs2155219 (11q13) was linked to ulcerative colitis via LRRC32 regulation in CD4+ regulatory T cells, and rs7312765 (12q12) to myasthenia gravis via PPHLN1 regulation in neuron-related cells.
December 9, 2024 at 5:42 PM
9/15 We then applied KGWAS to 544 uncommon diseases in the UK Biobank (<5K cases), and found 184 more (46.9% more) associations! If we zoom into 144 rare diseases, the gain rises to 79.8%, and KGWAS also found at least one hit for 92 diseases with zero hits in GWAS!
December 9, 2024 at 5:42 PM
8/15 To study small-cohort GWAS, we downsampled 21 UK Biobank traits (N=374K) to 1–10K samples, applied GWAS methods, and measured replication in the full cohort. KGWAS doubled GWAS power at 1K samples and proved more data-efficient, requiring up to 2.6X fewer samples to match variant associations!
December 9, 2024 at 5:42 PM
7/15 Extensive null and causal simulations under various configurations (heritability, number of causal variants) establish that KGWAS is well-calibrated in false discovery control and also has significantly improved power!
December 9, 2024 at 5:42 PM
6/15 For each disease, we train a GNN to propagate GWAS signals across the KG, using predicted variant associations as priors that capture local GWAS signal concentration, guided by an LD-aware loss. Adjusted p-values are then computed using Genovese et al.’s covariate-based weighting framework.
December 9, 2024 at 5:42 PM
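The last step of the 6/15 post, reweighting p-values with a per-variant prior, can be sketched as follows. This is only a minimal illustration of the general p-value weighting idea (non-negative weights normalized to average one, then each p-value divided by its weight). The function name `weighted_p_values` and the `prior_scores` input are hypothetical, and how KGWAS actually turns GNN predictions into weights is not described in the thread.

```python
import numpy as np


def weighted_p_values(p_values, prior_scores):
    """Divide each p-value by a weight derived from a non-negative prior score.

    Weights are normalized to have mean one, which is the constraint used in
    p-value weighting. The mapping from model predictions to `prior_scores`
    is an assumption made for this sketch.
    """
    p = np.asarray(p_values, dtype=float)
    s = np.asarray(prior_scores, dtype=float)
    if p.shape != s.shape:
        raise ValueError("p_values and prior_scores must have the same shape")
    if np.any(s < 0) or s.sum() == 0:
        raise ValueError("prior_scores must be non-negative and not all zero")

    weights = s / s.mean()  # mean-one normalization
    with np.errstate(divide="ignore", invalid="ignore"):
        # A zero weight means the variant can never be rejected.
        adjusted = np.where(weights > 0, p / weights, 1.0)
    return np.minimum(adjusted, 1.0)


# Toy example: the second variant has a strong prior, the last two have none.
adjusted = weighted_p_values([1e-8, 1e-7, 0.5, 0.5], [1.0, 3.0, 0.0, 0.0])
```

In this toy input the second variant moves from 1e-7 to about 3.3e-8, so it crosses the usual 5e-8 threshold only after weighting, while the variants with zero prior are set to 1.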
5/15 We aim to build an AI model integrating multi-modal functional genomics data with GWAS. Using a knowledge graph, we mapped variants-to-genes-to-programs with over 10M nodes, 55 relation types, and rich annotations like PoPS, baselineLD, and embeddings from models like Enformer and ESM.
December 9, 2024 at 5:42 PM
4/15 How can we uncover novel associations for small-cohort diseases? Emerging functional genomics data (e.g., QTLs, ABC, Hi-C, PPI, Perturb-seq) provide an increasingly clear picture of the cellular functions of a variant. Leveraging this data could intuitively enhance GWAS power!
December 9, 2024 at 5:42 PM
3/15 However, GWAS requires huge sample sizes, while many diseases of interest, especially rare and uncommon ones, have small cohorts (hundreds to a few thousand cases). They often have few significant associations. Yet, these are often the diseases with the greatest unmet therapeutic need!
December 9, 2024 at 5:42 PM
2/15 GWAS performs statistical tests to identify significant genetic variants that are associated with disease (e.g., P<5e-8) by scanning across individuals in large populations. GWAS is crucial—genetics-backed drugs are 2.6X more likely to succeed in trials!
December 9, 2024 at 5:42 PM
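To make the per-variant test in the 2/15 post concrete, here is a rough sketch on simulated data. It is a deliberate simplification: each variant is tested with a marginal correlation test and a large-sample normal approximation, with no covariates, relatedness correction, or LD handling, which real GWAS pipelines include. All names and simulation settings (`association_p_values`, 5,000 samples, 51 variants, effect size 0.3) are illustrative assumptions, not from the thread.

```python
import math

import numpy as np

GENOME_WIDE_P = 5e-8  # significance threshold quoted in the post


def association_p_values(genotypes, phenotype):
    """Two-sided p-value per variant from a marginal correlation test.

    genotypes: (n_samples, n_variants) allele dosages in {0, 1, 2}
    phenotype: (n_samples,) quantitative trait
    Assumes every variant is polymorphic (non-zero variance) and that the
    sample is large enough for the normal approximation to the t statistic.
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = g.shape[0]
    gc = g - g.mean(axis=0)
    yc = y - y.mean()
    r = (gc.T @ yc) / (np.sqrt((gc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()))
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in t])


# Simulated cohort: variant 0 affects the trait, the other 50 do not.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(5000, 51))
phenotype = 0.3 * genotypes[:, 0] + rng.normal(size=5000)

p = association_p_values(genotypes, phenotype)
hits = np.flatnonzero(p < GENOME_WIDE_P)  # indices passing the threshold
```

Shrinking the simulated sample size or the effect size makes the causal variant harder to detect at the same threshold, which is the small-cohort problem the rest of the thread addresses.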
🧬 Thrilled to share Knowledge Graph GWAS (KGWAS), the largest AI model that integrates >10 million multi-modal and multi-scale functional genomics data points to improve GWAS power by 100% while discovering novel disease-critical variants, genes, cells, and networks!

1/15🧵
December 9, 2024 at 5:42 PM