David van Dijk
@vandijklab.bsky.social
Learning the rules of life.
Assistant Professor of Medicine and Computer Science @ Yale
What if LLMs could “read” & “write” biology? 🤔
Introducing C2S‑Scale—a Yale and Google collab: we scaled LLMs (up to 27B!) to analyze & generate single‑cell data 🧬 ➡️ 📝
🔗 Blog: research.google/blog/teachin...
🔗 Preprint: biorxiv.org/content/10.1...
Teaching machines the language of biology: Scaling large language models for next-generation single-cell analysis
research.google
April 18, 2025 at 2:14 PM
Excited to share our new preprint: COAST: Intelligent Time-Adaptive Neural Operators! 🌊 We introduce a novel neural operator that learns to dynamically and intelligently adjust time step sizes for modeling dynamical systems from data. 🚀 doi.org/10.48550/arX...
February 13, 2025 at 7:23 PM
🔥🧠🌌 Now accepted at #ICLR2025 !
How does complexity shape intelligence? 🤔
In our new paper "Intelligence at the Edge of Chaos", we explore the relationship between complex systems and the emergence of intelligence in AI models. Can complexity alone unlock smarter systems?
arxiv.org/abs/2410.02536
February 12, 2025 at 8:38 AM
Reposted by David van Dijk
Can we learn protein biology from a language model?
In new work led by @liambai.bsky.social and me, we explore how sparse autoencoders can help us understand biology—going from mechanistic interpretability to mechanistic biology.
February 10, 2025 at 4:12 PM
Reposted by David van Dijk
scGPT-spatial: Continual Pretraining of Single-Cell Foundation Model for Spatial Transcriptomics https://www.biorxiv.org/content/10.1101/2025.02.05.636714v1 🧬🖥️🧪 https://github.com/bowang-lab/scGPT-spatial
February 10, 2025 at 7:30 PM
Reposted by David van Dijk
Excited to post my first #skySplain about our recent work published yesterday in Nature Genetics! www.nature.com/articles/s41.... One of the first authors of this study - Annika Vannan – actually wrote this breakdown, but she's not yet over here on bluesky and asked me to post!
Spatial transcriptomics identifies molecular niche dysregulation associated with distal lung remodeling in pulmonary fibrosis - Nature Genetics
Xenium spatial transcriptomic profiling of pulmonary fibrosis characterizes cell composition dynamics and histopathological features associated with the disease.
www.nature.com
February 4, 2025 at 2:11 PM
Reposted by David van Dijk
New: The largest medical A.I. randomized controlled trial yet performed, enrolling >100,000 women undergoing mammography screening
The use of AI led to 29% higher detection of cancer, no increase of false positives, and reduced workload compared with radiologists w/o AI thelancet.com/journals/lan...
Screening performance and characteristics of breast cancer detected in the Mammography Screening with Artificial Intelligence trial (MASAI): a randomised, controlled, parallel-group, non-inferiority, ...
The findings suggest that AI contributes to the early detection of clinically relevant breast cancer and reduces screen-reading workload without increasing false positives.
thelancet.com
February 4, 2025 at 3:00 AM
Reposted by David van Dijk
Very cool work showing the promise of integrating several sources of data to directly address human health conditions.
Synchronized long-read genome, methylome, epigenome and transcriptome profiling resolve a Mendelian condition
www.nature.com/articles/s41...
Synchronized long-read genome, methylome, epigenome and transcriptome profiling resolve a Mendelian condition - Nature Genetics
Simultaneous profiling of the genome, methylome, epigenome and transcriptome using single-molecule chromatin fiber sequencing and multiplexed arrays isoform sequencing identifies the genetic and molec...
www.nature.com
February 3, 2025 at 8:57 PM
Reposted by David van Dijk
Omega-3 fatty acids had a small protective effect of slowing biological aging (via multiple epigenetic clocks, Figure) nature.com/articles/s43...
Exercise and Vit D were also assessed with some additive benefits but not significant on their own
February 3, 2025 at 4:44 PM
Reposted by David van Dijk
Benchmarking gene embeddings from sequence, expression, network, and text models for functional prediction tasks https://www.biorxiv.org/content/10.1101/2025.01.29.635607v1 🧬🖥️🧪 https://github.com/ylaboratory/gene-embedding-benchmarks
February 3, 2025 at 4:30 PM
Reposted by David van Dijk
Fruitful collaboration between our team & Sven Jager/Ziv Bar Joseph's team @Sanofi published today @narjournal.bsky.social: a language model that learns the grammar of all regions of an mRNA from head to tail!! Can be fine-tuned on all of your favorite mRNA-related tasks -- a successor of CodonBERT.
mRNA-LM: full-length integrated SLM for mRNA analysis
Abstract. The success of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) messenger RNA (mRNA) vaccine has led to increased interest in the des
academic.oup.com
February 3, 2025 at 4:24 PM
Reposted by David van Dijk
Our op-ed @nytopinion.nytimes.com today addresses the surprising results of recent medical studies that showed A.I. alone outperformed physicians using A.I. W/@rajpurkar.bsky.social
Here is a summary Table, an overview, and a gift link to the op-ed
erictopol.substack.com/p/when-docto...
When Doctors With A.I. Are Outperformed by A.I. Alone
Interpreting Some Surprising Results
erictopol.substack.com
February 2, 2025 at 5:57 PM
We do see strong scaling laws in single cell foundation models
Genomic Foundationless Models: Pretraining Does Not Promise Performance
I've long believed genomic foundation models are not as useful as claimed. In my mind, there isn't enough training data to justify their size. Interesting to see more work in this direction.
www.biorxiv.org/content/10.1...
Genomic Foundationless Models: Pretraining Does Not Promise Performance
The success of Large Language Models has inspired the development of Genomic Foundation Models (GFMs) through similar pretraining techniques. However, the relationship between pretraining performance ...
www.biorxiv.org
February 3, 2025 at 10:11 AM
Reposted by David van Dijk
A month ago we @vevotherapeutics.bsky.social announced that we have generated the largest single-cell perturbation atlas in history, Tahoe-100M. Today, we announce that we will fully open-source Tahoe-100M in Feb, as part of a collaboration with NVidia health to train cell state models.
January 13, 2025 at 4:23 PM
Reposted by David van Dijk
Excited to share our new study of #genomics🧬, #EHR📈 + treatment outcomes of 78K patients across 20 #cancers!
We identified >700 mutations predicting which drugs💊 are effective for individual patients. Our #ML model predicts who responds well to immunotherapies. #precisioncancer
January 8, 2025 at 3:45 PM
Reposted by David van Dijk
Our ChromBPNet preprint out!
www.biorxiv.org/content/10.1...
Huge congrats to Anusri! This was quite a slog (for both of us) but we r very proud of this one! It is a long read but worth it IMHO. Methods r in the supp. materials. Bluetorial coming soon below 1/
December 25, 2024 at 11:48 PM
Reposted by David van Dijk
First 🦋 post! Very excited to share our lab's latest work exploring the shared and divergent aspects of human astrocyte development and glioblastoma. This effort spans many fields from developmental #glial biology to #stem cells and #tumor biology. rdcu.be/d5ADz A 🧵
Mapping the developmental trajectory of human astrocytes reveals divergence in glioblastoma
Nature Cell Biology - Sojka et al. analyse the transcriptomic and epigenomic landscape of human astrocyte maturation and identify an epigenetically regulated intermediate state associated with...
rdcu.be
January 8, 2025 at 5:13 PM
Reposted by David van Dijk
Out today in @naturegenet.bsky.social -- PERFF-seq! With @tsionabay.bsky.social , @ronanchaligne.bsky.social, Bob Stickels, Meril Takizawa, + Ansu Satpathy, we describe this new assay to study rare populations with programmable nucleic acid cytometry. 1/n
www.nature.com/articles/s41...
Transcript-specific enrichment enables profiling of rare cell states via single-cell RNA sequencing - Nature Genetics
Programmable Enrichment via RNA FlowFISH by sequencing (PERFF-seq) isolates rare cells based on RNA marker transcripts for single-cell RNA sequencing profiling of complex tissues, with applicability t...
www.nature.com
January 8, 2025 at 1:30 PM
Reposted by David van Dijk
Seeking pioneering scientists in immunology, neuroscience, and machine learning! Join Arc Institute as a Core Investigator + Stanford Bioengineering as Associate/Full Professor. Full lab funding and cutting-edge facilities in Palo Alto. Apply by Jan 15 for full consideration: shorturl.at/CuH1K
Associate or Full Professor Rank Search to join the Arc Institute as a Core Investigator and the Stanford Department of Bioengineering | Arc Institute
Arc Institute is an independent nonprofit research organization headquartered in Palo Alto, California.
arcinstitute.org
January 8, 2025 at 11:44 PM