Slava Elizarov
@selizarov.bsky.social
Senior research scientist at Unity | Generative models, Computer Graphics
P.P.S. We recommend you check out Omages (omages.github.io), an awesome concurrent work that also explores geometry images (called "Omages") for 3D generation. We believe GIMs have a bright future in deep learning — let’s bring it forward together 🚀
An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion
omages.github.io
December 6, 2024 at 4:25 PM
P.S.
Thanks to Simon Donné, @ciararowles.bsky.social, Shimon Vainer, Dante De Nigris, and Konstantin Kutsy for being such an awesome team!
Additional thanks to Alexander Demidko and Dr. Lev Melnikovsky from the Weizmann Institute for all the insightful discussions we had during this project.
So whether you’re looking for speed, flexibility, or eco-friendly workflows, Geometry Image Diffusion has you covered. Curious? Dive into our paper to learn more!
Paper: arxiv.org/abs/2409.03718
Code: coming soon
Site: unity-research.github.io/Geometry-Ima...
(10/10)
Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation
arxiv.org
And we’re not just saving forests. The assets you generate with Geometry Image Diffusion are free from baked-in lighting. Re-light them in any environment to fit your scene and save some energy while you’re at it! 💡 (9/10)
(But I must admit that it’s hard to resist generating a thousand barrels because they’re all so different)
Why produce a thousand barrels? Let’s save the forest! 🌳 Just edit the one you’ve already generated (8/10)
Want an unexpected twist? The generated 3D objects come with meaningful, separable parts, making them easy to edit and manipulate (7/10)
Our assets can be easily triangulated by connecting neighboring pixels, and come unwrapped with textures included — no waste here ♻️ (6/10)
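To make the triangulation idea concrete, here is a minimal numpy sketch (not the paper's actual code): every pixel of an H×W×3 geometry image becomes a vertex, and each 2×2 block of neighboring pixels is split into two triangles.

```python
import numpy as np

def geometry_image_to_mesh(gim):
    """Triangulate an H x W x 3 geometry image: each pixel is a vertex,
    and every 2x2 block of neighbouring pixels yields two triangles."""
    h, w, _ = gim.shape
    verts = gim.reshape(-1, 3)                 # one vertex per pixel
    idx = np.arange(h * w).reshape(h, w)       # vertex index per pixel
    a = idx[:-1, :-1].ravel()                  # top-left of each quad
    b = idx[:-1, 1:].ravel()                   # top-right
    c = idx[1:, :-1].ravel()                   # bottom-left
    d = idx[1:, 1:].ravel()                    # bottom-right
    faces = np.concatenate([np.stack([a, b, c], axis=1),
                            np.stack([b, d, c], axis=1)])
    return verts, faces
```

Because vertex (i, j) keeps its pixel coordinates, the same (i, j) grid doubles as the UV parameterization, which is why the mesh comes out already unwrapped.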
(Prompts: Lovecraftian teacup with a tentacle instead of the handle; A steampunk airplane; An avocado-shaped chair)
Our model is trained on a 100k subset of Objaverse — smaller than what’s typically used for 3D generation. Yet, it generalizes well across a wide range of prompts (5/10)
With a frozen Stable Diffusion model for textures and its trainable copy for geometry, the geometry model can tap into SD’s powerful natural image prior (4/10)
At the heart of our method is Collaborative Control. It allows two models to work together — one for generating the geometry image and another for creating textures — all while sharing information to ensure everything lines up perfectly 🤝 (3/10)
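The sharing pattern can be sketched schematically. This toy numpy example is only an illustration of the idea (frozen texture branch, trainable copy for geometry, a learned layer that mixes both feature streams) — it is not the actual Collaborative Control architecture, and all weights here are made-up stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for one layer of each denoising branch.
W_tex = rng.standard_normal((16, 16))           # frozen texture layer
W_geo = W_tex.copy()                            # trainable geometry copy
W_link = rng.standard_normal((32, 16)) * 0.01   # learned linking layer

def forward(tex_feat, geo_feat):
    """One collaborative step: each branch computes its own features,
    then the geometry branch reads the concatenated pair through a
    linking layer so both outputs stay aligned."""
    tex_out = np.tanh(tex_feat @ W_tex)          # frozen path (never updated)
    geo_hidden = np.tanh(geo_feat @ W_geo)       # trainable path
    shared = np.concatenate([tex_out, geo_hidden], axis=-1)
    geo_out = shared @ W_link                    # information sharing
    return tex_out, geo_out
```

The key design choice the sketch mirrors: gradients only ever flow into the geometry side, so the texture model's pretrained image prior stays untouched.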
The secret? We use geometry images, which are essentially 2D representations of 3D surfaces 🖼️ (think of GIMs as UV maps’ close cousins). This lets us recycle existing Text-to-Image models like Stable Diffusion, instead of building complex 3D architectures from scratch (2/10)
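A quick numpy sketch of what a geometry image is (illustrative only — the sphere parameterization below is a textbook example, not the paper's data pipeline): each pixel stores the (x, y, z) position of a surface point at regular (u, v) parameter coordinates, like a UV map that stores positions instead of colors.

```python
import numpy as np

def sphere_geometry_image(n=64):
    """Sample a unit sphere into an n x n x 3 geometry image:
    pixel (i, j) holds the 3D point at parameters (u_i, v_j)."""
    u, v = np.meshgrid(np.linspace(0, np.pi, n),
                       np.linspace(0, 2 * np.pi, n), indexing="ij")
    return np.stack([np.sin(u) * np.cos(v),
                     np.sin(u) * np.sin(v),
                     np.cos(u)], axis=-1)
```

Because the result is just an n×n×3 image, an ordinary 2D diffusion model can generate it pixel-by-pixel like any other image.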