Willis (Nanye) Ma
nanye-ma.bsky.social
PhD at NYU Courant | undergrad at NYU
Working on this project is a truly unforgettable experience! Thanks to all my awesome collaborators
Shangyuan Tong, Haolin Jia, Hexiang Hu, Yu-Chuan Su, Mingda Zhang, Xuan Yang, Yandong Li, Xuhui Jia, and advisors Tommi Jaakkola and @saining.bsky.social! [n/n]
January 17, 2025 at 4:50 PM
For more technical details, please check out our paper and website! If you have any questions about our work, feel free to reach out!

ArXiv: arxiv.org/pdf/2501.09732
Project Page: inference-scale-diffusion.github.io
January 17, 2025 at 4:50 PM
Lastly, we examine how scaling inference-time compute benefits smaller diffusion models.
These results indicate that substantial training costs can be partially offset by modest inference-time compute, enabling higher-quality samples more efficiently. [6/n]
January 17, 2025 at 4:50 PM
We then examine the search framework's capability on the text-conditioned generation task.
With the 12B FLUX.1-dev model on DrawBench, searching with all verifiers improves sample quality, though the specific improvement behaviors again vary considerably across setups. [5/n]
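As a concrete illustration of verifier-based selection in the text-conditioned setting, here is a minimal sketch where a CLIP-style cosine similarity between an image embedding and a prompt embedding serves as the verifier. The function names and embedding inputs are hypothetical stand-ins, not the paper's actual verifiers:

```python
import numpy as np

def clip_style_verifier(image_emb, text_emb):
    """Hypothetical verifier: cosine similarity between an image
    embedding and a prompt embedding (a generic stand-in for the
    text-alignment verifiers evaluated in the paper)."""
    a = image_emb / np.linalg.norm(image_emb)
    b = text_emb / np.linalg.norm(text_emb)
    return float(a @ b)

def best_of_n(candidate_embs, text_emb):
    """Pick the candidate image embedding best aligned with the prompt."""
    scores = [clip_style_verifier(c, text_emb) for c in candidate_embs]
    return int(np.argmax(scores)), max(scores)
```

Any scalar-valued scorer can be dropped in here; the search machinery only needs a ranking over candidates.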
January 17, 2025 at 4:50 PM
Our search framework consists of two components: verifiers, which provide feedback on candidate samples, and algorithms, which use that feedback to find better noise candidates.
On ImageNet with SiT-XL, different combinations of verifiers and algorithms exhibit markedly different scaling behaviors. [4/n]
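The simplest instantiation of these two components is best-of-N random search: sample several starting noises, run the sampler from each, and keep the sample the verifier scores highest. A minimal sketch, with toy stand-ins for the sampler and verifier (the real framework plugs in a diffusion sampler and learned verifiers):

```python
import numpy as np

def random_search(generate, verifier, n_candidates=8, noise_shape=(4,), seed=0):
    """Best-of-N search: draw candidate initial noises, generate a sample
    from each, score with the verifier, and keep the best one."""
    rng = np.random.default_rng(seed)
    best_score, best_sample = -np.inf, None
    for _ in range(n_candidates):
        noise = rng.standard_normal(noise_shape)  # candidate initial noise
        sample = generate(noise)                  # run the sampler from this noise
        score = verifier(sample)                  # feedback signal
        if score > best_score:
            best_score, best_sample = score, sample
    return best_sample, best_score

# Toy stand-ins: "generation" is the identity map, and the verifier
# prefers samples whose mean is close to zero.
sample, score = random_search(lambda z: z, lambda x: -abs(x.mean()))
```

Inference compute scales linearly with `n_candidates`, and the selected score can only improve as the budget grows; more elaborate algorithms differ in how they propose the next batch of noises.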
January 17, 2025 at 4:50 PM
From "cherry-picking", we know that some noises are better than others.
This suggests pushing the inference-time scaling limit by investing compute in searching for better noises.
Then, it's natural to ask: how do we know which sampling noises are good, and how do we search for such noises? [3/n]
January 17, 2025 at 4:50 PM
Diffusion models can flexibly allocate compute at inference time by adjusting the number of denoising steps. Yet the performance gain from this plateaus after a few dozen steps.
Therefore, for diffusion models to scale further at inference time, a new framework needs to be designed. [2/n]
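The denoising-step knob and its plateau can be seen even in a toy ODE sampler: each Euler step costs one model call, so compute scales linearly with the step count, while the discretization error shrinks toward zero, so extra steps soon stop helping. A sketch with a synthetic velocity field standing in for a trained model:

```python
import numpy as np

def euler_sample(x, velocity, n_steps):
    """Toy probability-flow-style sampler: integrate from t=1 to t=0
    with n_steps Euler updates. Each step is one velocity (model) call."""
    ts = np.linspace(1.0, 0.0, n_steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        h = t_cur - t_next               # positive step size
        x = x + h * velocity(x, t_cur)   # one Euler step = one model call
    return x

# Synthetic velocity pulling samples toward a fixed "data point" at 2.0;
# the exact flow is known here, so discretization error is measurable.
velocity = lambda x, t: 2.0 - x
coarse = euler_sample(np.array([0.0]), velocity, 4)    # 4 model calls
fine = euler_sample(np.array([0.0]), velocity, 64)     # 64 model calls
```

Going from 4 to 64 steps shrinks the error substantially, but once the error is near zero, further steps buy almost nothing; the noise-search framework above spends additional compute on a different axis instead.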
January 17, 2025 at 4:50 PM