Moritz Laurer
@moritzlaurer.bsky.social
Machine Learning Engineer @hf.co Hugging Face
@microsoft.com's rStar-Math paper claims that 🤏 ~7B models can match the math skills of o1 using clever train- and test-time techniques. You can now download their prompt templates from @hf.co!
🧵
January 15, 2025 at 12:31 PM
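For reference, fetching a shared template file from the Hub is a one-liner with `huggingface_hub`; the repo id, filename, and repo type below are placeholders, not the actual rStar-Math paths.

```python
# Sketch: download a shared prompt-template file from the Hugging Face Hub.
# The repo id, filename, and repo type are placeholders, not the real paths.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="some-org/rstar-math-prompts",  # placeholder repo id
    filename="prompt_template.yaml",        # placeholder filename
    repo_type="dataset",                    # assumption; could be a model repo
)
print(open(path).read())
```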
FACTS is a great paper from @GoogleDeepMind on measuring the factuality of LLM outputs. You can now download their prompt templates from @huggingface to improve LLM-based fact-checking yourself!
🧵
January 11, 2025 at 11:14 AM
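As a rough illustration of LLM-based fact-checking with such a template, here is a minimal sketch using `huggingface_hub`'s `InferenceClient`; the judge prompt and model choice are illustrative assumptions, not the FACTS templates themselves.

```python
# Hypothetical LLM-as-judge fact-check; prompt wording and model are assumptions.
from huggingface_hub import InferenceClient

judge_template = (
    "You are a fact-checking judge. Given the context and the claim, "
    "answer only 'supported' or 'unsupported'.\n\n"
    "Context: {context}\nClaim: {claim}"
)
client = InferenceClient("meta-llama/Llama-3.1-8B-Instruct")  # any chat model works
response = client.chat_completion(
    messages=[{
        "role": "user",
        "content": judge_template.format(
            context="The Eiffel Tower is in Paris.",
            claim="The Eiffel Tower is in Berlin.",
        ),
    }],
    max_tokens=10,
)
print(response.choices[0].message.content)  # expected: "unsupported"
```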
The TRL v0.13 release is 🔥! My highlights are the new process reward trainer, for training models similar to o1, and tool-call support:

🧠 Process reward trainer: Enables training of Process-supervised Reward Models (PRMs), which reward the quality of intermediate steps, promoting structured reasoning (see the sketch below).
January 9, 2025 at 1:05 PM
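A minimal sketch of what PRM training looks like, assuming the `PRMTrainer`/`PRMConfig` names shipped in TRL v0.13 and the stepwise-supervision dataset format from the TRL docs:

```python
# Minimal PRM training sketch (assumes TRL >= 0.13 and its stepwise format:
# a "prompt", a list of "completions" (steps), and per-step boolean "labels").
from datasets import load_dataset
from transformers import AutoModelForTokenClassification, AutoTokenizer
from trl import PRMConfig, PRMTrainer

model_id = "Qwen/Qwen2-0.5B"  # small backbone, purely for illustration
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Each example carries labels for intermediate steps, not just the final answer.
train_dataset = load_dataset("trl-lib/math_shepherd", split="train[:1%]")

trainer = PRMTrainer(
    model=model,
    args=PRMConfig(output_dir="prm-demo", per_device_train_batch_size=2),
    processing_class=tokenizer,
    train_dataset=train_dataset,
)
trainer.train()
```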
OpenAI is losing money on the $200/month subscription 🤯. It's crazy how expensive it is to run the largest LLMs:

- ChatGPT Pro costs $200/month ($2,400/year) and is still unprofitable for OpenAI due to higher-than-expected usage.
- OpenAI reportedly expected losses of about $5 billion
January 7, 2025 at 11:12 AM
🚀 Releasing a new zeroshot-classifier based on ModernBERT! Some key takeaways:

- ⚡ Speed & efficiency: It's multiple times faster and uses significantly less memory than DeBERTa-v3. You can use larger batch sizes, and enabling bf16 (instead of fp16) gave me a ~2x speed boost.
- 📉 Performance tradeoff:
January 6, 2025 at 4:40 PM
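For context, using such a classifier is a few lines with the transformers pipeline; the model id below is an assumption about the released checkpoint's name, so swap in the actual repo id.

```python
# Zero-shot classification sketch; the model id is an assumed checkpoint name.
import torch
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/ModernBERT-large-zeroshot-v2.0",  # assumed repo id
    torch_dtype=torch.bfloat16,  # bf16 gave the ~2x speed boost mentioned above
)
print(classifier(
    "The new encoder cuts inference latency in half.",
    candidate_labels=["technology", "politics", "sports"],
))
```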
Quite excited by the ModernBERT release! Small (0.15B/0.4B), trained on 2T tokens of modern pre-training data (including code) with a modern tokenizer, and an 8k context window: a great, efficient model for embeddings & classification!

This will probably be the basis for many future SOTA encoders! I can finally stop using DeBERTa-v3 from 2021 :D
December 20, 2024 at 2:21 PM
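Trying ModernBERT takes a few lines, assuming a transformers version with ModernBERT support (v4.48+):

```python
# Fill-mask sketch with the base ModernBERT checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")
print(fill_mask("Encoder models are great for embeddings and [MASK]."))
```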
"Open-source AI: year in review 2024": amazing Space with lots of data-driven insights into AI in 2024! Check it out 👇
December 17, 2024 at 3:40 PM
I've been building a small library for working with prompt templates on the @huggingface.bsky.social Hub: `pip install prompt-templates`. Motivation:

The community currently shares prompt templates in a wide variety of formats: in datasets, in model cards, as strings in .py files, as .txt/... 🧵
December 12, 2024 at 3:58 PM
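From memory, loading a template with the library looks roughly like the sketch below; treat the class and method names, repo id, and filename as assumptions and check the README for the actual interface.

```python
# Assumed API sketch for the prompt-templates library; names may differ.
from prompt_templates import PromptTemplateLoader  # class name is an assumption

template = PromptTemplateLoader.from_hub(   # method name is an assumption
    repo_id="some-user/example-prompts",    # placeholder repo id
    filename="translate.yaml",              # placeholder filename
)
print(template.populate_template(text="Hello world"))  # assumption
```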
Reposted by Moritz Laurer
Performance leap: TGI v3 is out. Processes 3x more tokens, 13x faster than vLLM on long prompts. Zero config!
December 10, 2024 at 10:08 AM
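Once a TGI server is running, querying it from Python is straightforward; this sketch assumes a local server listening on port 8080.

```python
# Query a running TGI v3 server; assumes it listens on localhost:8080.
from huggingface_hub import InferenceClient

client = InferenceClient("http://localhost:8080")
print(client.text_generation(
    "What makes long-prompt inference fast?", max_new_tokens=64
))
```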
Reposted by Moritz Laurer
🚀 We’re excited to announce Argilla v2.5.0, which includes:
* Argilla webhooks
* A new design for the datasets home page
* Python 3.13 and Pydantic v2 support
📙 Read the full release notes here 👇

github.com/argilla-io/a...
November 29, 2024 at 12:55 PM
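On the receiving end, a webhook consumer can be as small as this FastAPI sketch; the endpoint path and payload field names are assumptions, not Argilla's documented schema.

```python
# Generic webhook receiver sketch (FastAPI); run with `uvicorn app:app`.
from fastapi import FastAPI, Request

app = FastAPI()

@app.post("/argilla-webhook")  # endpoint path is an arbitrary choice
async def handle_event(request: Request):
    payload = await request.json()
    # "type" and "timestamp" keys are assumptions about the payload shape.
    print(payload.get("type"), payload.get("timestamp"))
    return {"status": "ok"}
```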
Reposted by Moritz Laurer
🐍📰 Hugging Face Transformers: Leverage Open-Source AI in Python

#python

realpython.com/huggingface-...
November 24, 2024 at 6:30 PM
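The article walks through the transformers basics; the classic starting point is a one-line pipeline (the default model downloads on first use):

```python
# One-line pipeline: the default sentiment model downloads on first use.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face Transformers makes open-source AI accessible."))
```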