Shangyuan Tong, Haolin Jia, Hexiang Hu, Yu-Chuan Su, Mingda Zhang, Xuan Yang, Yandong Li, Xuhui Jia, and advisors Tommi Jaakkola and @saining.bsky.social! [n/n]
ArXiv: arxiv.org/pdf/2501.09732
Project Page: inference-scale-diffusion.github.io
These results indicate that substantial training costs can be partially offset by modest inference-time compute, enabling higher-quality samples more efficiently. [6/n]
With the 12B FLUX.1-dev model on DrawBench, searching with any of the verifiers improves sample quality, though, again, the specific improvements vary across setups. [5/n]
On ImageNet with SiT-XL, different combinations of verifiers and search algorithms exhibit markedly different scaling behaviors. [4/n]
This suggests pushing the inference-time scaling limit further by investing compute in searching for better sampling noises.
Two natural questions follow: how do we know which sampling noises are good, and how do we search for them? [3/n]
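The simplest instance of this idea is best-of-N search: draw several candidate initial noises, run the sampler from each, and keep the one a verifier scores highest. A minimal sketch (the `generate` and `verifier` callables here are hypothetical stand-ins for a diffusion sampler and a quality scorer, not the paper's actual implementation):

```python
import random

def best_of_n_noise_search(generate, verifier, n_candidates=4, dim=4, seed=0):
    """Best-of-N search over initial noises.

    Draws n_candidates Gaussian noise vectors, generates a sample from each,
    and returns the (noise, sample, score) triple with the highest verifier score.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_candidates):
        # Candidate initial noise for the sampler.
        noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        sample = generate(noise)    # run the (hypothetical) sampler from this noise
        score = verifier(sample)    # score sample quality with the verifier
        if best is None or score > best[2]:
            best = (noise, sample, score)
    return best

# Toy demo: identity "sampler", verifier preferring small-norm samples.
noise, sample, score = best_of_n_noise_search(
    generate=lambda z: z,
    verifier=lambda x: -sum(v * v for v in x),
    n_candidates=8,
)
```

More compute (larger `n_candidates`) buys more verifier-guided selection, which is the knob the scaling curves in the thread vary.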
Therefore, for diffusion models to scale further at inference time, a new framework is needed. [2/n]