- they train the model to gradually unmask tokens at arbitrary locations in the sequence
I find it counter-intuitive that this would work at all...
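To make that concrete, here's a minimal sketch of the training objective as I understand it: sample a masking ratio t ~ U(0, 1], mask each token independently with probability t, and train the model to recover the masked tokens with a 1/t-weighted cross-entropy. Everything here (`MASK_ID`, the `model` interface, the helper name) is a placeholder of mine, not the paper's actual code:

```python
import torch
import torch.nn.functional as F

MASK_ID = 0  # hypothetical id for the [MASK] token (a reserved vocab entry in practice)

def masked_diffusion_loss(model, tokens):
    """One masked-diffusion training step (sketch, not the paper's code).

    tokens: (batch, seq_len) LongTensor of token ids.
    model:  callable mapping (batch, seq_len) ids -> (batch, seq_len, vocab) logits.
    """
    b, n = tokens.shape
    t = torch.rand(b, 1).clamp_min(1e-3)   # masking ratio per sequence, t ~ U(0, 1]
    is_masked = torch.rand(b, n) < t       # mask each position independently with prob t
    noisy = torch.where(is_masked, torch.full_like(tokens, MASK_ID), tokens)
    logits = model(noisy)
    ce = F.cross_entropy(logits.transpose(1, 2), tokens, reduction="none")
    # only masked positions contribute; the 1/t weight keeps lightly-masked samples useful
    return (ce * is_masked / t).sum() / is_masked.sum().clamp_min(1)
```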
@apolinario.bsky.social mentioned this demo: huggingface.co/spaces/multi...
(video from Apolinario; it's great for understanding how the token sequence is generated!)
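For a rough idea of what the demo is animating, here's a sketch of the reverse process under the same assumptions as above: start from a fully masked block after the prompt, let the model predict every masked position each step, commit the most confident predictions, and keep the rest masked. The confidence-based unmasking order is one common heuristic, not necessarily what the demo uses:

```python
import torch

MASK_ID = 0  # same placeholder [MASK] id as in the training sketch

@torch.no_grad()
def sample(model, prompt, gen_len=64, steps=32):
    """Iteratively unmask a block of gen_len tokens after a 1-D prompt tensor."""
    x = torch.cat([prompt, torch.full((gen_len,), MASK_ID)]).unsqueeze(0)
    for step in range(steps):
        masked = x == MASK_ID                        # (1, len) bool
        if not masked.any():
            break
        conf, pred = model(x).softmax(-1).max(-1)    # best guess + its probability, per position
        # commit roughly an equal share of the remaining masked tokens each step
        k = max(1, masked.sum().item() // (steps - step))
        conf = torch.where(masked, conf, torch.full_like(conf, -1.0))
        top = conf.topk(k).indices                   # most confident masked positions
        x[0, top[0]] = pred[0, top[0]]
    return x[0]
```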
arxiv.org/abs/2502.09992
Significant progress towards language diffusion models: reportedly on par with LLaMA3 on many benchmarks.
It's a very long paper, but reasonably accessible to newcomers... if you're interested in this space, I recommend starting by reading the conclusion and discussion on page 59:
comfyanonymous.github.io/ComfyUI_exam...