Website : https://oussamazekri.fr
Blog : https://logb-research.github.io/
Wow, does this trick have a name?
If you haven’t heard of him, check out his work: he’s one of the pioneers of operator learning and is pushing this field to new heights!
❤️ Work done during my 3-month internship at Imperial College!
A huge thanks to Nicolas Boullé (nboulle.github.io) for letting me work on a topic that interested me a lot during the internship.
The cool part? We only need to solve a linear system, whose inverse is known in closed form! 🔥
For example, this enables RLHF for discrete diffusion models, making alignment more flexible and powerful. ✅
From this, we can reconstruct any policy gradient method for discrete diffusion models (e.g., PPO, GRPO, etc.). 🚀
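To make the policy gradient idea concrete, here is a minimal sketch of the generic estimator that methods like PPO and GRPO build on, on a toy 3-armed bandit. This is not the paper's method, just the textbook gradient ∇J(θ) = E[R(a) ∇ log π_θ(a)], computed exactly for a softmax policy with hypothetical rewards:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

theta = np.zeros(3)                      # logits of a softmax policy
rewards = np.array([0.1, 0.5, 0.9])      # hypothetical per-arm rewards

for _ in range(500):
    pi = softmax(theta)
    # Exact policy gradient for this toy problem:
    # d J / d theta_b = pi_b * (r_b - E[r])
    grad = pi * (rewards - pi @ rewards)
    theta += 1.0 * grad                  # gradient ascent on expected reward

pi = softmax(theta)                      # policy concentrates on the best arm
```

Any policy-gradient variant (PPO's clipped objective, GRPO's group-relative baseline) replaces the exact expectation above with sampled, variance-reduced estimates of the same quantity.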
Instead, recent discrete diffusion models skip Z by learning ratios of probabilities. This forms the concrete score, which a neural network models efficiently! ⚡
The challenge? Using this score network as a policy.
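To see why the partition function Z drops out, here is a tiny numerical sketch (not the paper's code) on a hypothetical 5-state distribution: the concrete score is the vector of ratios p(y)/p(x), and since both p(y) and p(x) carry the same 1/Z factor, the ratio can be computed from unnormalized probabilities alone:

```python
import numpy as np

# Hypothetical unnormalized energies for a 5-state toy distribution,
# p(x) ∝ exp(-E(x)) with partition function Z unknown to the model.
energies = np.array([1.0, 0.3, 2.0, 0.7, 1.5])
unnormalized = np.exp(-energies)

Z = unnormalized.sum()        # computed here only to verify the claim
p = unnormalized / Z

x = 0
ratios_from_p = p / p[x]                      # concrete score via normalized p
ratios_no_Z = unnormalized / unnormalized[x]  # same ratios, Z never used

# The two agree: a network modeling these ratios never needs Z.
assert np.allclose(ratios_from_p, ratios_no_Z)
```

This cancellation is what lets a score network output probability ratios directly, sidestepping the intractable normalization over the full discrete state space.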
I invite you to take a look at the other contributions of the paper 🙂