Bases are now available in Obsidian 1.9.0 for early access users.
It was originally a panorama of the new methods of synthetic generation, but the stakes are now much higher and I openly wonder if model training is not soon going to change forever. vintagedata.org/blog/posts/t...
Timo and I recently published a book, and even if you are not a scientist, you'll find useful overviews of topics like causality and robustness.
The best part is that you can read it for free: ml-science-book.com
But I think multiplication, addition, maze solving and easy-to-hard generalization are actually solvable on standard transformers...
with recursive self-improvement
Below is the accuracy of a tiny model teaching itself how to add and multiply
I procrastinated on this because, honestly, who cares about my writing process? But after repeatedly answering the same questions, I finally wrote this.
eugeneyan.com/writing/writ...
Generate beautiful research blogs with figures, key insights, and clear explanations from the paper with just one click
Understand papers in minutes - not hours
Here: thomwolf.io/blog/scienti...
It's an extension of this interview discussion from the AI summit: youtu.be/AxBd3G0lFLs?...
• Google Dremel / BigQuery
• Snowflake
• Amazon Redshift
• Yellowbrick
• Databricks Photon
• @duckdb.org
• TabDB
Use DeepSeek to generate high-quality training data, then distil that knowledge into ModernBERT for fast, efficient classification.
New blog post: danielvanstrien.xyz/posts/2025/d...
PPO, GRPO, PRIME: doesn’t matter what RL you use, the key is that it’s RL
experiment logs: wandb.ai/jiayipan/Tin...
x: x.com/jiayi_pirate...
with r1-zero, they took qwen and applied GRPO and only GRPO and got a model that does self-reflection (!!)
i'm learning this too, so let me take a whack at this...
Read more here:
Slides: phontron.com/class/anlp-f...
Videos: youtube.com/playlist?lis...
Hope this is useful to people 😀
www.youtube.com/watch?v=76gu...
We trained 2 new models. Like BERT, but modern. ModernBERT.
Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.
It's much faster, more accurate, longer context, and more useful. 🧵
Join here at 12:30 CET:
youtube.com/live/ZWo6Q85...
chromewebstore.google.com/detail/sky-f...
A new blog post in which I list all the tools and apps I've been using for work, plus all my opinions about them.
maria-antoniak.github.io/2024/12/30/o...
Featuring @kagi.com, @warp.dev, @paperpile.bsky.social, @are.na, Fantastical, @obsidian.md, Claude, and more.
- Switch with Nine Sols loaded
- iPad with Black Doves loaded
- laptop with data, python notebook, blog post draft loaded
- silk eye mask
- REI inflatable neck pillow
- vitamin C juice
- Journey to the East by Hermann Hesse
- compression socks
- many snacks