Yusuf Roohani
yusufroohani.bsky.social
Machine Learning & Systems Biology. ML Group Leader @arcinstitute. PhD @StanfordAILab

http://www.yusufroohani.com
We're hiring! Come join the team and scale new heights with us! 🏔️

arcinstitute.org/jobs
February 25, 2025 at 2:35 PM
Uniform processing lowers technical variation between scBaseCamp datasets.

Technical factors such as library chemistry and suspension type (single-cell vs single-nucleus) exhibited comparable or lower silhouette scores than biologically meaningful categories like tissue type
February 25, 2025 at 2:35 PM
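
For readers unfamiliar with the metric in the post above: a silhouette score measures how tightly points group under a given labelling, from -1 (mixed) to 1 (well separated). The sketch below is only an illustration on invented toy data. The 2D coordinates, the `tissue` and `chemistry` labels and the helper name `silhouette_score` are all assumptions, not scBaseCamp's actual embedding, labels or analysis code.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient with Euclidean distance (toy implementation)."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("need at least two distinct labels")
    n = len(points)
    scores = []
    for i in range(n):
        # Mean distance from point i to the members of each cluster (excluding itself).
        mean_dist = {}
        for c in clusters:
            members = [j for j in range(n) if labels[j] == c and j != i]
            if members:
                total = sum(math.dist(points[i], points[j]) for j in members)
                mean_dist[c] = total / len(members)
        own = labels[i]
        if own not in mean_dist:
            # Point is alone in its cluster: silhouette is defined as 0.
            scores.append(0.0)
            continue
        a = mean_dist[own]                                       # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)     # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n

# Hypothetical embedding: two well-separated groups of four cells each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
tissue = ["liver"] * 4 + ["brain"] * 4     # labels that follow the two groups
chemistry = ["v2", "v3"] * 4               # labels spread evenly across both groups

tissue_score = silhouette_score(points, tissue)
chemistry_score = silhouette_score(points, chemistry)
print(f"tissue: {tissue_score:.2f}, chemistry: {chemistry_score:.2f}")
```

In this toy case the tissue labelling scores far higher than the chemistry labelling, which is the pattern the post describes: cells group by biology rather than by technical factor. For real data, `sklearn.metrics.silhouette_score` computes the same quantity efficiently.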
scBaseCamp is the first large biological data repository curated by an AI agent

We built a hierarchical agentic workflow (SRAgent) to automate discovery, metadata extraction & data processing

It is consistent, easily scalable and automatically updates when new data is available
February 25, 2025 at 2:35 PM
scBaseCamp was built by directly mining all publicly accessible 10X Genomics scRNAseq data from the Sequence Read Archive (SRA)

With over 230M cells drawn from 21 species and 72 tissues, scBaseCamp is significantly larger and more diverse than existing single-cell data repositories
February 25, 2025 at 2:35 PM
At @arcinstitute.org, we are building AI models of cell state from the ground up, rethinking every step, from data generation to biologically relevant evaluation

Today we launch scBaseCamp, the largest public repository of single-cell RNAseq data, uniformly processed from raw sequencing reads.
February 25, 2025 at 2:35 PM