Johnny Yu
thejohnnyyu.bsky.social
CSO & co-founder at vevo.ai | Former UCSF, Goodarzi Lab, Broad, Biogen
Hey, would a presentation on Tahoe-100M be a good fit for this?
March 1, 2025 at 5:49 PM
We had a lot of fun with this project. From idea to preprint it was only 4 months! ⚡
March 1, 2025 at 5:14 PM
Teamwork makes the dream work!
March 1, 2025 at 5:11 PM
Just the start of a movement
March 1, 2025 at 5:10 PM
Yup, split-pool was a key early choice we made. The data is really not bad though: it's about 1 TB, and you'll probably see some Dask-style data structures that remove RAM limitations very soon. So all you need is a solid-state drive, and a 1 TB dataset will be easy to handle.
March 1, 2025 at 5:09 PM
This is a good question
March 1, 2025 at 5:07 PM
Let's go 🚀🌕
March 1, 2025 at 5:06 PM
We're excited! Stay tuned: several models built on this data are going to come out in quick succession this year 🙂
March 1, 2025 at 5:05 PM
Keep an eye on this - we're just getting started! Hope to have the ML community engaged as we continue in this direction 🚀
March 1, 2025 at 5:03 PM
Reposted by Johnny Yu
This was all made possible by the Mosaic platform! What is Mosaic? @thejohnnyyu.bsky.social took his work in our lab, and scaled it in every dimension… Mosaic brings a highly diversified, exquisitely optimized, and optimally balanced “cell village” approach to perturbation data collection.
February 25, 2025 at 1:25 PM
Reposted by Johnny Yu
If you are intrigued by this, and if you're working on AI/ML, single-cell biology, or drug discovery, I urge y’all to reach out to @thejohnnyyu.bsky.social, @therealnima.bsky.social or any of the @vevotherapeutics.bsky.social team. www.prnewswire.com/news-release...
Vevo Therapeutics Open Sources Tahoe-100M, the World's Largest Single-Cell Dataset, as the Inaugural Contribution to Arc Institute's New Virtual Cell Atlas
/PRNewswire/ -- In a landmark move to advance AI-driven biological research, Arc Institute and Vevo Therapeutics announced today that they have partnered on...
www.prnewswire.com
February 25, 2025 at 1:25 PM
This will be instrumental for datasets like our Tahoe-100M, especially as we scale up to normalizing 100-million-cell datasets.
December 11, 2024 at 4:28 PM
Keep in touch!
December 5, 2024 at 11:01 PM
100M dataset!
December 5, 2024 at 3:42 PM