Leandro von Werra
lvwerra.bsky.social
Research @ Hugging Face
Check out the code and the super-detailed video walkthrough!

Code: github.com/huggingface/...

Video: youtube.com/playlist?lis...

Work led by Haojun Zhao and Ferdinand Mom!
GitHub - huggingface/picotron: Minimalistic 4D-parallelism distributed training framework for education purpose
January 6, 2025 at 4:51 PM
Or watch how the model solves the Lotka-Volterra equations, plots the results, and refines them.

Try it out: huggingface.co/spaces/data-...
December 19, 2024 at 6:56 PM
💔
December 14, 2024 at 8:12 PM
Looks more like a rave!
November 22, 2024 at 9:20 AM
Slides here: docs.google.com/presentation...

Inspired by the nice talk from @thomwolf.bsky.social earlier this year and updated with some material we are working on right now:

www.youtube.com/watch?v=2-SP...
A little guide to building Large Language Models in 2024
YouTube video by ThomWolf
November 19, 2024 at 8:37 PM