PhD student @ CMU with Zico Kolter and Zack Lipton | Founding Member @datologyai.com | Prev. Comp Sc @iitdelhi
http://pratyushmaini.github.io/
"common infra" includes question templates, topics, styles, annotators, etc.
> common annotators being the least privileged access.
(Risk 1): There is a massive financial incentive for such companies to design evals that even marginally favor their own customers.
We argue that this shift poses new risks including financial incentives & eval bias.
w/ @hbxnov.bsky.social
📝: pratyushmaini.github.io/blog/2024/ri... 🧵
Case in point: #EMNLP2024's Best Paper Award.
@iamgroot42.bsky.social & I wrote a blog on what went wrong: www.anshumansuri.com/blog/2024/ca... 🧵
Results for our models trained on curated data:
• 4.4% better than DCLM
• 2x faster training than FW-edu
• Our 1.3B model outperforms 2.7B models trained on DCLM & FW-edu
Techvember Ep 2: How we made the #1 LLM Pre-training Data Recipe.
Blog: 👉 tinyurl.com/best-llm-data 🧵
pratyushmaini.github.io/cmu-10-799