Guilherme Penedo
@guilherme.hf.co
ML Research Engineer at 🤗. Lisboeta 🇵🇹
We will very soon announce a big community project, and are working on a 📝 blogpost walking you through the entire dataset creation process. Stay tuned!
December 8, 2024 at 9:19 AM
The dataset is released under the permissive 📜 ODC-By 1.0 license, and the 💻 code to reproduce it and our evaluations is public.

Find out all about 🥂 FineWeb2 on the 🤗 dataset page:
huggingface.co/datasets/Hug...
HuggingFaceFW/fineweb-2 · Datasets at Hugging Face
December 8, 2024 at 9:19 AM