Techvember Ep 2: How we made the #1 LLM Pre-training Data Recipe.
Blog: 👉 tinyurl.com/best-llm-data 🧵
Wired: Bringing up @datologyai.com’s new text curation results at Thanksgiving
That’s right, we applied our data curation pipeline to text pretraining data and the results are hot enough to roast a 🦃
🧵
building a state-of-the-art data curation pipeline and I’m SO excited to share our first results: we curated image-text pretraining data and massively improved CLIP model quality, training speed, and inference efficiency 🔥🔥🔥