Moritz Laurer
@moritzlaurer.bsky.social
Machine Learning Engineer @hf.co Hugging Face
.@microsoft.com's rStar-Math paper claims that 🤏 ~7B models can match the math skills of o1 using clever train- and test-time techniques. You can now download their prompt templates from @hf.co !
🧵
January 15, 2025 at 12:31 PM
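The test-time side of recipes like rStar-Math boils down to sampling many candidate solutions and letting a verifier pick the best one. A minimal best-of-N sketch with toy stand-ins for the generator and the verifier (not the paper's actual MCTS pipeline):

```python
import random

def best_of_n(generate, score, prompt, n=8, seed=0):
    """Sample n candidate solutions and return the one the verifier scores highest."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)

# Stand-in generator: proposes random integer "answers" to a toy question.
def toy_generate(prompt, rng):
    return rng.randint(0, 10)

# Stand-in verifier: rewards answers close to the true value 7.
def toy_score(answer):
    return -abs(answer - 7)

best = best_of_n(toy_generate, toy_score, "What is 3 + 4?", n=32)
print(best)
```

With enough samples, the verifier-selected answer converges on the correct one; the real systems replace the stand-ins with an LLM sampler and a trained reward model.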
FACTS is a great paper from @GoogleDeepMind on measuring the factuality of LLM outputs. You can now download their prompt templates from @huggingface to improve LLM-based fact-checking yourself!
🧵
January 11, 2025 at 11:14 AM
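Those templates are plain prompt strings with slots for the source document and the model response to be checked. A minimal sketch of filling one (the template text below is illustrative, not the actual FACTS template from the Hub):

```python
# Illustrative template; the real FACTS grounding templates live on the Hugging Face Hub.
TEMPLATE = (
    "You are a fact-checker. Given the source document below, decide whether "
    "the response is fully supported by it. Answer 'supported' or 'unsupported'.\n\n"
    "Document:\n{document}\n\nResponse:\n{response}\n"
)

def build_factcheck_prompt(document: str, response: str) -> str:
    """Fill the template slots; the result is sent to an LLM judge."""
    return TEMPLATE.format(document=document, response=response)

prompt = build_factcheck_prompt(
    "The Eiffel Tower is in Paris.",
    "The tower is located in Paris.",
)
print(prompt)
```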
The TRL v0.13 release is 🔥! My highlights are the new process reward trainer for training models similar to o1, and tool call support:

🧠 Process reward trainer: Enables training of Process-supervised Reward Models (PRMs), which reward the quality of intermediate steps, promoting structured reasoning.
January 9, 2025 at 1:05 PM
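A PRM scores each intermediate reasoning step rather than only the final answer, and the trajectory's reward is an aggregate (often the minimum) of the step scores. A toy sketch of that scoring scheme, with a keyword stub standing in for a trained model (this is an illustration, not TRL's actual trainer API):

```python
def score_step(step: str) -> float:
    """Stand-in step scorer; a real PRM is a trained classifier over steps."""
    # Penalize steps that admit uncertainty, reward ones that state a result.
    if "guess" in step.lower():
        return 0.1
    return 0.9 if "=" in step else 0.5

def process_reward(steps: list[str]) -> float:
    """Aggregate step scores; min is a common choice: one bad step sinks the chain."""
    return min(score_step(s) for s in steps)

good = ["2 + 3 = 5", "5 * 4 = 20"]
bad = ["2 + 3 = 5", "I guess the answer is 26"]
print(process_reward(good), process_reward(bad))  # 0.9 0.1
```

Training a PRM then reduces to supervised learning on (step, label) pairs, which is what the new trainer wraps.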
🚀 Releasing a new zeroshot-classifier based on ModernBERT! Some key takeaways:

- ⚡ Speed & efficiency: It's multiple times faster and uses significantly less memory than DeBERTav3. You can use larger batch sizes, and enabling bf16 (instead of fp16) gave me a ~2x speed boost
- 📉 Performance tradeoff:
January 6, 2025 at 4:40 PM
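These zero-shot classifiers are NLI models under the hood: each candidate label becomes a hypothesis like "This example is about {label}.", and the label whose hypothesis gets the highest entailment score wins. A sketch of that recipe with a keyword stub in place of the actual ModernBERT checkpoint (which you would load via transformers' zero-shot-classification pipeline):

```python
def entailment_score(premise: str, hypothesis: str) -> float:
    """Stub entailment scorer; a real setup runs an NLI model over the
    (premise, hypothesis) pair and returns the entailment probability."""
    label = hypothesis.removeprefix("This example is about ").rstrip(".")
    return 1.0 if label in premise.lower() else 0.0

def zero_shot_classify(text: str, labels: list[str]) -> str:
    """Pick the label whose hypothesis the model most strongly entails."""
    hypotheses = {lab: f"This example is about {lab}." for lab in labels}
    return max(labels, key=lambda lab: entailment_score(text, hypotheses[lab]))

print(zero_shot_classify("This hardware review covers the new GPU.",
                         ["hardware", "cooking"]))  # hardware
```

Because every label requires a separate forward pass, encoder speed matters a lot here, which is where ModernBERT's efficiency gains pay off.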
Quite excited by the ModernBERT release! Small at 0.15B/0.4B params, 2T tokens of modern pre-training data and a tokenizer with code, 8k context window; a great, efficient model for embeddings & classification!

This will probably be the basis for many future SOTA encoders! I can finally stop using DeBERTav3 from 2021 :D
December 20, 2024 at 2:21 PM
"Open-source AI: year in review 2024": amazing Space with lots of data-driven insights into AI in 2024! Check it out 👇
December 17, 2024 at 3:40 PM