🚨Another surge in progress for reinforcement learning this week, with Beyond the 80/20 Rule, ProRL, and AReal all pushing the boundaries.🚀
Check out the top 10 papers for the week👇
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
June 10, 2025 at 4:30 PM
🚨There’s a new ceiling for efficient reasoning with the rise of Learning to Reason without External Rewards, along with AgriFM pushing the boundaries of AI into agriculture🚀
Check out the top 10 papers for the week👇
- Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
June 2, 2025 at 4:20 PM
🚨Clear your schedule for a recap of a tremendous week featuring DeepSeek-V3 along with BLIP3-o’s improvements in multimodal architecture 🚀
Check out the top 10 papers for the week👇
- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
May 19, 2025 at 4:34 PM
🚨Bright week for agents and representation learning, notably including X-Fusion’s remarkable progress in multimodal capabilities 🚀
Check out the top 10 papers for the week👇
- From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review
May 3, 2025 at 9:14 PM
🚨Don’t miss out on this high-impact week for AI and reinforcement learning, featuring greedy agents, test-time RL, and a powerful new benchmark for LLM physical reasoning 🚀
Check out the top 10 papers for the week👇
- TTRL: Test-Time Reinforcement Learning
April 29, 2025 at 2:52 AM
🚨Don’t miss this week’s immense developments in optimizing reasoning, along with advanced visual embedding capabilities.🚀
Check out the top 10 papers for the week👇
- Reasoning Models Can Be Effective Without Thinking
April 19, 2025 at 8:16 PM
🚨Huge week for video generation and multimodal models, with detailed one-minute video generation and more efficient approaches to multimodality 🚀
Check out the top 10 papers for the week👇
- Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
April 13, 2025 at 2:21 AM
Introducing Deep Research for arXiv
Ask questions like 'What are the latest breakthroughs in RL fine-tuning?' and get comprehensive lit reviews with trending papers automatically included
Turn hours of literature searches into seconds with AI-powered research context ⚡
April 8, 2025 at 5:40 PM
Introducing Llama 4 for understanding arXiv papers 🚀
Highlight any section of a paper to ask questions and “@” other papers for quick context, comparisons, and benchmark references
April 6, 2025 at 8:42 PM
🚨Notable week for scaling, with shocking improvements in visual representation learning as well as reward modeling vastly expanding LLM capabilities 🚀
Check out the top 10 papers for the week👇
- What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
April 5, 2025 at 9:46 PM
Reinforcement learning for retrieval-augmented reasoning 🚀
Baichuan introduces ReSearch, an RL framework that teaches LLMs to reason with search from scratch
Outperforms RAG baselines
No supervised data on reasoning steps
Simple & generalizable
Trending #1 on alphaXiv 📈
March 31, 2025 at 5:54 PM
🚨Major week for reinforcement learning, with important strides in improving parameter tuning and reasoning capabilities paving the road for smarter LLMs 🚀
Check out the top 10 papers for the week👇
- Reasoning to Learn from Latent Thoughts
March 29, 2025 at 6:08 PM
🚀This week was huge for foundation models, from generalist humanoid robots to multimodal LLMs that learn from human preferences and negative examples – here are the top 10 papers for the week🚨
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
March 23, 2025 at 2:44 PM
We used Mistral OCR with Claude 3.7 to create blog-style overviews for arXiv papers
Generate beautiful research blogs with figures, key insights, and clear explanations from the paper with just one click
Understand papers in minutes - not hours
March 14, 2025 at 6:41 PM
arXiv has been instrumental in advancing open CS research for decades -- we highly encourage everyone to support arXiv for Cornell Giving Day today!
When you support arXiv, you support collaboration & community building. 🤝🛠️
arXivLabs is how researchers launch new features on arXiv. Working together to make research better - all your faves are arXivLabs: @hf.co @alphaxiv.org @litmaps.com & more.
https://givingday.cornell.edu/campaigns/arxiv
March 14, 2025 at 2:14 AM
Reposted by alphaXiv
Saw that @alphaxiv.org now has an AI feature to create paper overviews (using Mistral OCR and Claude), so I created one for new DES paper 2503.06712. Includes figures, key findings, etc. Once created, it's publicly available to everyone at the overview link. alphaXiv also has a cosmology community.
Dark Energy Survey: implications for cosmological expansion models from the final DES Baryon Acoustic Oscillation and Supernova data | alphaXiv
Abstract: The Dark Energy Survey (DES) recently released the final results of its two principal probes of the expansion history: Type Ia Supernovae (SNe) and Baryonic Acoustic ...
www.alphaxiv.org
March 12, 2025 at 4:53 PM
🚀This week, AI is soaring to new heights—whether it’s evolving language models through nature-inspired techniques, mastering video generation at scale, or crafting smarter, self-improving agents that think like swarms.🚨
- Nature-Inspired Population-Based Evolution of Large Language Models
March 9, 2025 at 11:10 PM
🚀This week, AI is stepping into new dimensions—becoming co-scientists, sculpting 3D avatars, and blending cloud and on-device models into a seamless dance of creativity and efficiency.🚨
- Towards an AI co-scientist
March 1, 2025 at 8:21 PM
🚀This week was huge for AI—whether through sparse attention, mastering long-context reasoning, or even creating million-dollar software engineering gigs, it's all about smarter, more efficient models shaping a dynamic future.🚨
February 22, 2025 at 9:08 PM
Top Trending Papers on alphaXiv this week!📈
🚀From teaching themselves to predict the future to solving strategic social deduction, AI this week is discovering the hidden geometry of prompts, scaling reasoning, and rethinking what’s possible with less.🚨
February 16, 2025 at 9:08 PM
1997: Deep Blue defeats Kasparov at chess
2016: AlphaGo masters the game of Go
2025: Stanford researchers crack Among Us
Trending on alphaXiv 📈
Remarkable new work trains LLMs to master strategic social deduction through multi-agent RL, doubling win rates over standard RL.
February 15, 2025 at 8:44 PM
We used DeepSeek-V3 to classify every AI paper on arXiv by topic (agents, VLMs, etc) 🚀
Now you can instantly filter to see what's trending in each area 🚨
February 14, 2025 at 12:27 AM
Top Trending Papers on alphaXiv this week!📈
🚀This week, AI is leveling up—from solving Olympiad geometry with AlphaGeometry2 to generating motion with VideoJAM, while also mastering the art of reasoning and adversarial resilience with tools like DeepRAG and LIMO. 🚨
February 9, 2025 at 8:07 PM
Reposted by alphaXiv
I'm on @alphaxiv.org now! See you in the comments!
Fancy Research Profile - michelle.alphaxiv.io
Comment-able papers - www.alphaxiv.org/profile/6793...
Thanks to @ml-collective.bsky.social for sharing this platform!
Michelle Lin
I am a MSc student at the University of Montreal & Mila - Quebec AI Institute. Prior, I completed my Bachelors in Computer Science at McGill University, where I was also a Research Assist...
michelle.alphaxiv.io
January 24, 2025 at 7:06 PM