Mason Kamb
@masonkamb.bsky.social
Our theory is tailored to models with strong locality biases, such as CNNs. However, we find that our theory (bottom rows) is still moderately predictive for a simple diffusion model *with* self-attention layers (top rows), which explicitly break equivariance/locality.
December 31, 2024 at 4:00 PM
Diffusion models are notorious for getting the wrong numbers of fingers, legs, etc. Our theory recapitulates this behavior and provides, for the first time, a clear mechanistic explanation of these failures as a consequence of excessive locality.
December 31, 2024 at 4:00 PM
This simple model of diffusion model creativity is remarkably predictive: we find that, after calibrating a single time-dependent hyperparameter (the locality scale), we can replicate the behavior of trained fully convolutional diffusion models on a case-by-case basis.
December 31, 2024 at 4:00 PM
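As a rough illustration of what calibrating that single hyperparameter could look like, here is a minimal grid-search sketch. The stand-in predictor `els_predict`, the candidate scale grid, and the MSE objective are all assumptions for illustration, not the paper's actual calibration procedure.

```python
import numpy as np

def calibrate_locality_scale(x_t_batch, model_out_batch, els_predict, scales):
    """Hypothetical calibration at one diffusion time step: pick the
    locality scale whose ELS-style prediction best matches the trained
    model's output in mean squared error.

    els_predict(x_t, scale) is a user-supplied stand-in for an
    equivariant+local denoiser (see the sketch after the next post).
    """
    def mse(scale):
        preds = np.stack([els_predict(x, scale) for x in x_t_batch])
        return float(((preds - model_out_batch) ** 2).mean())
    # Brute-force grid search over candidate scales.
    return min(scales, key=mse)
```

Repeating this per time step would yield the one time-dependent hyperparameter the post describes.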
Under optimal *equivariant+local* denoising, each pixel can be drawn towards *any* training patch from *anywhere* in the training set, rather than only patches drawn from the same pixel location. We call this model the Equivariant Local Score (ELS) Machine.
December 31, 2024 at 4:00 PM
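To make the mechanics concrete, here is a minimal NumPy sketch of an ELS-style denoiser, assuming tiny grayscale images, a small square patch, wrap-around boundaries, and Gaussian (softmax) patch weighting; these choices, and the omitted noise-schedule scaling of the training patches, are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def els_denoise(x_t, train_imgs, sigma_t, patch=3):
    """Sketch of equivariant+local denoising: each pixel of the noisy
    image x_t is matched, via its local patch, against patches from
    *every* location of *every* training image (translation
    equivariance), then pulled toward their center pixels with
    softmax weights. Brute force; intended for tiny grayscale images."""
    H, W = x_t.shape
    r = patch // 2
    # Collect all training patches (keys) and their center pixels (vals).
    keys, vals = [], []
    for img in train_imgs:
        padded = np.pad(img, r, mode="wrap")
        for i in range(img.shape[0]):
            for j in range(img.shape[1]):
                keys.append(padded[i:i + patch, j:j + patch].ravel())
                vals.append(img[i, j])
    keys, vals = np.array(keys), np.array(vals)

    out = np.empty_like(x_t)
    xpad = np.pad(x_t, r, mode="wrap")
    for i in range(H):
        for j in range(W):
            q = xpad[i:i + patch, j:j + patch].ravel()
            d2 = ((keys - q) ** 2).sum(axis=1)
            # Numerically stable softmax over patch distances.
            w = np.exp(-(d2 - d2.min()) / (2 * sigma_t ** 2))
            out[i, j] = (w * vals).sum() / w.sum()
    return out
```

The key point the post makes is visible in the loop structure: the weighted average for pixel (i, j) ranges over patches from all locations of all training images, not just patches at (i, j).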
Excited to finally share this work w/ @suryaganguli.bsky.social! TL;DR: we find the first closed-form analytical theory that replicates the outputs of the very simplest diffusion models, with median pixel-wise r^2 values of 90%+. arxiv.org/abs/2412.20292
December 31, 2024 at 4:00 PM
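For reference, a pixel-wise r^2 between the theory's output and a trained model's output can be computed as below; taking r^2 per image across its pixels, with the image mean as baseline, is an assumption about the normalization, which the paper may define differently.

```python
import numpy as np

def pixelwise_r2(pred, target):
    """Coefficient of determination across the pixels of one image:
    r^2 = 1 - SS_res / SS_tot, using the image's mean pixel as baseline."""
    ss_res = float(((target - pred) ** 2).sum())
    ss_tot = float(((target - target.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Median over a set of images (theory prediction vs. trained model output):
# r2s = [pixelwise_r2(p, t) for p, t in zip(theory_imgs, model_imgs)]
# print(np.median(r2s))
```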