Yuyang Wang
@yuyangw.bsky.social
Research Scientist at @Apple ML Research (MLR) | PhD from CMU | Simulating the physical world with generative models 🌎🧬
7/n Finally, a shout-out to the great team at Apple MLR that made this work possible: Anurag Ranjan, Josh Susskind, and @itsbautistam.bsky.social!
December 6, 2024 at 7:15 PM
6/n More examples of sampled images and point clouds. Note that the same training recipe is used for both the image and the 3D point cloud generative models.
December 6, 2024 at 7:15 PM
5/n ASFT decodes each coordinate-value pair independently, so the resolution can change at inference time. A model trained on ImageNet-256 can trivially generate at 2K resolution. Similarly, a model trained for image2point with 16k points can generate much denser point clouds (e.g., 128k points).
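To make the "query at any resolution" idea concrete, here is a minimal sketch of building a denser coordinate grid at inference time than was used for training. The `make_grid` helper and the commented-out `sample(...)` call are hypothetical stand-ins, not ASFT's actual API; the only point illustrated is that the query coordinates can be denser than anything seen during training.

```python
import torch

def make_grid(h, w, device="cpu"):
    """Normalized pixel coordinates in [0, 1]^2, shape (h * w, 2)."""
    ys, xs = torch.meshgrid(
        torch.linspace(0, 1, h, device=device),
        torch.linspace(0, 1, w, device=device),
        indexing="ij",
    )
    return torch.stack([xs, ys], dim=-1).reshape(-1, 2)

# Training-resolution grid (256 x 256) vs. a much denser inference-time grid (2048 x 2048).
train_coords = make_grid(256, 256)
hires_coords = make_grid(2048, 2048)

# `sample(model, coords, cond)` stands in for whatever sampler the trained model uses;
# because each coordinate-value pair is decoded independently, `coords` can be denser
# than the training grid.
# hires_image = sample(model, hires_coords, cond).reshape(2048, 2048, 3)
```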
December 6, 2024 at 7:15 PM
4/n ASFT performs strongly on 3D point cloud generation. On ShapeNet, it outperforms the SOTA latent diffusion model LION, and on large-scale image2point generation on Objaverse it also compares favorably to previous methods like CLAY.
December 6, 2024 at 7:15 PM
3/n On ImageNet-256, ASFT achieves performance comparable to baselines that use architectures specifically designed for image generation.
December 6, 2024 at 7:15 PM
2/n ASFT works directly in ambient space by modeling different data domains as sets of coordinate-value pairs. We introduce a conditionally independent point-wise training objective that lets ASFT make predictions continuously in coordinate space (as in neural fields).
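As a rough illustration of what a conditionally independent point-wise objective can look like, here is a minimal flow-matching-style sketch in PyTorch. This is not the paper's implementation: `model`, the argument layout, and the linear noise-to-data path are assumptions made for the example; the point is only that the loss factorizes over coordinate-value pairs.

```python
import torch

def pointwise_loss(model, coords, values, cond=None):
    """coords: (B, N, c) coordinates; values: (B, N, d) signal values at those coordinates."""
    B = coords.shape[0]
    t = torch.rand(B, 1, 1, device=values.device)   # one time step per sample
    noise = torch.randn_like(values)                 # independent noise per point
    x_t = (1.0 - t) * noise + t * values             # linear path between noise and data
    target = values - noise                          # per-point velocity target
    pred = model(coords, x_t, t, cond)               # point-wise prediction, shape (B, N, d)
    # The loss factorizes over coordinate-value pairs, so each point is
    # supervised independently of the others given the conditioning.
    return ((pred - target) ** 2).mean()
```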
December 6, 2024 at 7:15 PM