Ben Poole
benmpoole.bsky.social
Ben Poole
@benmpoole.bsky.social
research scientist at google deepmind.

phd in neural nonsense from stanford.

poolio.github.io
Information in observed images that predicts what should be in unobserved regions should be captured by these models (e.g. x.com/DotCSV/statu...). They can't create new information, but there are often more statistical hints than we as humans perceive (like the paper you linked).
x.com
x.com
December 4, 2024 at 8:20 PM
eh, perception as probabilistic inference is a well established theory of the mind
December 2, 2024 at 10:05 AM
This is the essence of visual perception. We don't see pixels, we experience the world behind an image. Our aim is to build AI systems that can achieve this same level of spatial intelligence.

Inferring the whole from fragments requires statistical priors for both humans and machines.
December 1, 2024 at 10:19 PM
unlike most work on image/video editing that is not guaranteed to "follow rules" of 3D consistency, our work builds an explicit virtual world that you can move a virtual camera through. we use a statistical learning-based system, but we distill it into a 3D model that follows at least some 3D rules
December 1, 2024 at 9:23 PM