Our mission is to contribute to the advancement of AI research and understand the computational requirements of intelligence.
This month, we cover:
➡️ FlowRL
➡️ Soft Tokens, Hard Truths
➡️ Set Block Decoding is a Language Model Inference Accelerator
➡️ Turning Recurring LLM Reasoning into Concise Behaviors
🧵
➡️ ADMIRE-BayesOpt
➡️ Guiding Diffusion Models with RL for Stable Molecule Generation
➡️ Graph-R1
🧵
🧠 Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data
💽 Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
📊 DataRater: Meta-Learned Dataset Curation
🧵 ⬇️
What should you do 🤔... quantise to NF4? 🧵
It's called Optimal Formats for Weight Quantisation and has just hit arXiv.
1/6
➡️ Motion Prompting: Controlling Video Generation with Motion Trajectories
➡️ Inference-Time Scaling for Generalist Reward Modeling
➡️ M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models!
🧵
This month: Titans, Evolving Deeper LLM Thinking, Transformer-Squared, and the recent DeepSeek technical reports! 🧵
graphcore-research.github.io/papers-of-th...
The Byte Latent Transformer, Large Concept Models, Memory Layers & Phi-4 — all grouped under the title "Spend Your FLOPs Wisely". Here's our take (🧵)
graphcore-research.github.io/papers-of-th...