SIEVE improves upon existing quality filtering methods in the DataComp-LM challenge, producing better LLM pretraining data that led to improved model performance. This work is part of Jifan's broader research on efficient ML training, from active learning to label-efficient SFT for LLMs.
February 7, 2025 at 2:55 AM
SIEVE distills GPT-4's data filtering capabilities into lightweight models at <1% of the cost. Not just minor improvements - we're talking 500x more efficient filtering operations.
February 7, 2025 at 2:55 AM
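For readers wondering what "distilling GPT-4's data filtering" looks like in practice, here is a minimal, hypothetical sketch of the general recipe (not SIEVE's actual models, prompts, or features): label a small sample of documents with an expensive teacher, train a lightweight student classifier on those labels, then filter the full corpus with the cheap student only. All data and names below are placeholders.

```python
# Toy sketch of teacher-to-student distillation for quality filtering.
# Teacher labels are placeholders standing in for GPT-4-style judgments;
# the student here is just TF-IDF + logistic regression for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Small labeled sample: 1 = "keep for pretraining", 0 = "filter out".
sample_docs = [
    "A clear explanation of gradient descent with worked examples.",
    "buy cheap followers now click here best price!!!",
    "Lecture notes on measure-theoretic probability, chapter 3.",
    "asdf asdf lorem ipsum qwerty spam spam spam",
]
teacher_labels = [1, 0, 1, 0]  # placeholder teacher-model quality labels

# Lightweight student model trained on the teacher's labels.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(sample_docs)
student = LogisticRegression().fit(X, teacher_labels)

# Apply the cheap student to a large unlabeled corpus; no teacher calls needed.
corpus = ["An introduction to dynamic programming.", "win prizes free free free"]
scores = student.predict_proba(vectorizer.transform(corpus))[:, 1]
kept = [doc for doc, score in zip(corpus, scores) if score > 0.5]
print(kept)
```

The cost savings come from the last step: the teacher is queried only for the small training sample, while the student handles every document in the corpus.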
🧵 Heard all the buzz around distilling from OpenAI models? Check out @jifanz's latest work SIEVE - showing how strategic distillation can make LLM development radically more cost-effective while matching quality.
February 7, 2025 at 2:55 AM
There is a ton of interest in the question of whether AI can be funny: www.bbc.com/future/artic.... Our paper at NeurIPS investigates the humor generation capabilities of the latest and greatest AI models using one of the world's largest humor datasets! arxiv.org/pdf/2406.10522
December 4, 2024 at 2:00 PM