Website: https://oussamazekri.fr
Blog: https://logb-research.github.io/
The cool part? We only need to invert a linear system, whose inverse is known in closed form! 🔥
For example, this enables RLHF for discrete diffusion models, making alignment more flexible and powerful. ✅
From this, we can reconstruct any policy gradient method for discrete diffusion models (e.g., PPO, GRPO, etc.). 🚀
Instead, recent discrete diffusion models skip Z by learning ratios of probabilities. This forms the concrete score, which a neural network models efficiently!⚡
The challenge? Using this score network as a policy.
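A tiny sketch of why ratios let us skip Z, assuming a toy 3-state distribution given by hypothetical unnormalized log-weights (`logits` and `concrete_score` are illustrative names, not from the paper's code): the ratio p(y)/p(x) is the same whether or not we ever compute the normalizing constant.

```python
import math

def concrete_score(logits, x):
    """Ratios p(y)/p(x) for every state y, using only unnormalized log-weights."""
    # exp(l_y) / exp(l_x): the normalizing constant Z cancels and is never computed.
    return [math.exp(l - logits[x]) for l in logits]

# Hypothetical unnormalized log-weights over 3 states (illustration only).
logits = [0.5, -1.0, 2.0]

# Reference computation that does go through Z.
Z = sum(math.exp(l) for l in logits)
probs = [math.exp(l) / Z for l in logits]

ratios = concrete_score(logits, x=0)
ratios_from_probs = [p / probs[0] for p in probs]

# Both routes agree up to floating-point error.
print(all(math.isclose(a, b) for a, b in zip(ratios, ratios_from_probs)))
```

In a real model, the network would output these ratios directly for a huge state space where summing for Z is infeasible; the toy version only shows the cancellation.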
But what happens when we swap autoregressive generation for discrete diffusion, a rising architecture promising faster & more controllable LLMs?
Introducing SEPO!
📑 arxiv.org/pdf/2502.01384
🧵👇
The frequentist approach, which is minimax optimal, achieves O(d/N). (see Wolfer et al., 2019, arxiv.org/pdf/1902.00080).
This makes it particularly efficient for Markov chains with a large number of states! 🌟
Tested and validated on recent LLMs!
The results are pretty exciting! 😄