Lorin Crawford
@lcrawford.bsky.social
Principal Researcher in BioML at Microsoft Research.
Pinned
In our newest preprint, we show that simply increasing the size of pre-training datasets doesn't necessarily improve the performance of single-cell foundation models on downstream tasks.

Really proud of @alandenadel.bsky.social for leading this effort! See his thread below for more details👇
Current methods in the field are trained on atlases ranging from 1 to 100 million cells. In our newest preprint, we show that these same approaches tend to plateau in performance with pre-training datasets that are only a fraction of the size.
Alan did a ton of work on our latest revision. Message of the story still holds: more data does not necessarily lead to better downstream performance for many models.

Check out all our new analyses here: www.biorxiv.org/content/10.1...
November 7, 2025 at 8:42 PM
Come do an internship with us! Please apply
Are you a PhD student interested in ML and biology or health? Come do an internship with me, @avapamini.bsky.social, Alex Lu, @lcrawford.bsky.social, or Kristen Severson at MSRNE!

Applications are due Dec 1: make sure you include a research statement!

jobs.careers.microsoft.com/global/en/jo...
October 22, 2025 at 12:41 PM
Reposted by Lorin Crawford
Gindra, Palla, Nguyen, Wagner, Tran, Theis, Saur, Crawford, Peng: A Large-Scale Benchmark of Cross-Modal Learning for Histology and Gene Expression in Spatial Transcriptomics https://arxiv.org/abs/2508.01490 https://arxiv.org/pdf/2508.01490 https://arxiv.org/html/2508.01490
August 5, 2025 at 6:50 AM
Reposted by Lorin Crawford
📣Online now!
📄Sparse modeling of interactions enables fast detection of genome-wide epistasis in biobank-scale studies
🧑‍🤝‍🧑 @lcrawford.bsky.social @julian-stamp.bsky.social & co
The sparse marginal epistasis test overcomes computational limitations of previous mapping approaches by focusing its epistatic search to regions of the genome that have some known functional relation...
www.cell.com
July 29, 2025 at 3:28 PM
Reposted by Lorin Crawford
Do protein language models store different structural elements in factorizable subnetworks?

To find out, we masked out PLM weights to suppress performance on CATH subcategories or secondary structure elements while maintaining performance on other sequences or residues.
June 2, 2025 at 3:03 PM
Reposted by Lorin Crawford
Members have elected Brian Millen as the 122nd president of the association.
Also elected: Julia Sharp, Vice President; Martin Slawski, Council of Sections Representative; Ruixiao Lu, Council of Chapters Representative; Pedro Silva, International Representative. Congratulations! tinyurl.com/ptunrn5z
May 15, 2025 at 12:54 PM
Reposted by Lorin Crawford
"Leadership in Trustworthy AI" COPSS-NISS Leadership Webinar with David Donoho (Stanford), Michael I. Jordan (UC Berkeley), and Tracy Ke (Harvard) - online Tuesday, April 29, 2025 at 12-1pm ET.

Register at: www.niss.org/events/copss...
April 17, 2025 at 9:53 PM
Great to see our paper presenting recall, a framework which calibrates clustering for the impact of data "double-dipping" in single-cell studies, out today in AJHG! Congratulations, @alandenadel.bsky.social and co-authors!
March 12, 2025 at 7:18 PM
Reposted by Lorin Crawford
New online! Adapting systems biology to address the complexity of human disease in the single-cell era
Nature Reviews Genetics, Published online: 10 March 2025; doi:10.1038/s41576-025-00821-6 Differences between humans and experimental models create a translational gap that makes it difficult to extrapolate research findings. The authors review…
www.nature.com
March 10, 2025 at 12:33 PM
Reposted by Lorin Crawford
🔥 Benchmark Alert! MotifBench sets a new standard for evaluating protein design methods in motif scaffolding.
Why does this matter? Reproducibility & fair comparison have been lacking—until now.
Paper: arxiv.org/abs/2502.12479 | Repo: github.com/blt2114/Moti...
A thread ⬇️
February 19, 2025 at 8:50 PM
Reposted by Lorin Crawford
🎉Congratulations to the authors!🎉

The list of accepted papers for #RECOMB2025 is now live ➡️ recomb.org/recomb2025/a...

What are your plans for Seoul? Let us know!
www.youtube.com/watch?v=uztj...

#seoul
RECOMB 2025 | PROGRAM
RECOMB 2025 - Yonsei University, Seoul
recomb.org
January 29, 2025 at 9:22 AM
Quick post to close out the week - in our newest preprint, @julian-stamp.bsky.social scales the marginal epistasis test to work on biobanks! The key is that using trait-specific information to induce sparsity in the modeled gene interactions greatly improves both runtime and power.
January 17, 2025 at 9:22 PM
Reposted by Lorin Crawford
Rethinking cancer drug synergy prediction: a call for standardization in machine learning applications https://www.biorxiv.org/content/10.1101/2024.12.24.630216v1
December 24, 2024 at 5:50 PM
In our newest preprint, we show that simply increasing the size of pre-training datasets doesn't necessarily improve the performance of single-cell foundation models on downstream tasks.

Really proud of @alandenadel.bsky.social for leading this effort! See his thread below for more details👇
Current methods in the field are trained on atlases ranging from 1 to 100 million cells. In our newest preprint, we show that these same approaches tend to plateau in performance with pre-training datasets that are only a fraction of the size.
December 18, 2024 at 7:54 PM
Reposted by Lorin Crawford
This starter pack is in honor of Assistant Professor Antentor Hinton Jr (AJ) at Vanderbilt School of Medicine - Basic Sciences. AJ curated the first 100 Inspiring Black Scientists in America list and, in collaboration with the Community of Scholars, expanded this list to 1,000.

go.bsky.app/DsJrwR
November 24, 2024 at 7:38 AM
Reposted by Lorin Crawford
Looking to find your MSR colleagues? I put together a starter pack.

Msg me if I missed you (or you just joined).

go.bsky.app/NxtpELZ
November 17, 2024 at 12:22 AM
In the spirit of trying to be more active on here... the deadline for our BioML internship is tomorrow 11/22.

If you are currently a PhD student please consider applying --- all we need is a CV and research statement!

jobs.careers.microsoft.com/global/en/jo...
November 21, 2024 at 5:57 PM