Feel free to check:
🤗 hf.co/datasets/LLM...
📝 hf.co/papers/2504....
💻 github.com/LLM360/MegaM...
Head-to-head comparisons with open datasets show that MegaMath leads in both quality and scale: MegaMath-Web-Pro outperforms FineMath-4plus by 4% absolute! Plus, MegaMath-Llama-1B and 3B beat their base counterparts by up to 20% across 10 benchmarks.
For the web domain, we downloaded all of Common Crawl, developed an optimized HTML reformatting pipeline, trained a robust fastText-based filter, performed two-stage extraction, and explored optimal deduplication strategies.
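To make the filtering step concrete, here is a minimal sketch of training a fastText classifier to separate math from non-math web text. This is an illustration, not the exact MegaMath recipe: the file names, labels, and hyperparameters below are assumptions, and it uses the standard `fasttext` pip package.

```python
# Minimal sketch (NOT the authors' exact recipe) of a fastText-based math filter.
# Assumes a labeled training file in fastText format, one example per line:
#   __label__math   Let f be a continuous function on [0, 1] ...
#   __label__other  Top 10 travel destinations for the summer ...
import fasttext

# Hypothetical file name; swap in your own labeled data.
model = fasttext.train_supervised(
    input="math_filter_train.txt",
    epoch=5,
    lr=0.1,
    wordNgrams=2,   # bigrams help catch phrases like "prove that"
    dim=100,
)
model.save_model("math_filter.bin")

# Score a candidate web document: keep it only if the math probability is high.
labels, probs = model.predict("Let x be a real number such that x^2 = 2 ...")
if labels[0] == "__label__math" and probs[0] > 0.9:
    print("keep for the math corpus")
```

A fast linear classifier like this is the usual choice at Common Crawl scale, since it can score billions of documents cheaply before the heavier extraction and deduplication stages run.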
We’ve got plenty of general pre-training datasets, but the community still lacks a math pre-training dataset that scales beyond 100B tokens. Enter MegaMath: it covers diverse math categories and suits pre-training, continual training, and mid-training alike, flexible across various demands.
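If you want to try it, here is a minimal sketch of streaming the data with the Hugging Face `datasets` library. The repo id and the `text` field name are assumptions based on the links above; check the dataset card for the exact identifiers and available subsets.

```python
# Minimal sketch: stream a few examples from the dataset on the Hub.
# "LLM360/MegaMath" and the "text" field are assumed; see the dataset card.
from datasets import load_dataset

ds = load_dataset(
    "LLM360/MegaMath",   # assumed repo id
    split="train",
    streaming=True,      # stream: the corpus is too large to download eagerly
)

for i, example in enumerate(ds):
    print(example["text"][:200])  # peek at the first 200 characters
    if i == 2:
        break
```

Streaming mode avoids materializing hundreds of billions of tokens on disk, which is handy when you only want to inspect or subsample the corpus.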