https://wimmerth.github.io
💻 Code: github.com/wimmerth/anyup
Great collaboration with Prune Truong, Marie-Julie Rakotosaona, Michael Oechsle, Federico Tombari, Bernt Schiele, and @janericlenssen.bsky.social!
CC: @cvml.mpi-inf.mpg.de @mpi-inf.mpg.de
In our experiments, we show that it matches encoder-specific upsamplers and preserves trends across different model sizes.
Importantly, the upsampled features also stay faithful to the input feature space, as we show in experiments with pre-trained DINOv2 probes.
Together with window attention-based upsampling, a new training pipeline, and consistency regularization, we achieve SOTA results.
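To make the window-attention idea concrete, here is a minimal PyTorch sketch of windowed cross-attention upsampling (not the repo's actual architecture; the module and parameter names are assumptions): high-resolution queries come from the input image, and the output at each pixel is a convex combination of the low-resolution input features, which is one way to stay faithful to the input feature space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WindowAttentionUpsampler(nn.Module):
    """Each high-res pixel attends over a k x k window of low-res features."""

    def __init__(self, guidance_dim: int, attn_dim: int = 64, window: int = 7):
        super().__init__()
        self.window = window
        self.to_q = nn.Conv2d(guidance_dim, attn_dim, kernel_size=1)
        self.to_k = nn.LazyConv2d(attn_dim, kernel_size=1)  # lazy: works for any input C

    def forward(self, feats: torch.Tensor, guidance: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, h, w) low-res features, arbitrary C
        # guidance: (B, Cg, H, W) high-res guidance (e.g., the input image)
        B, C = feats.shape[:2]
        H, W = guidance.shape[-2:]
        k = self.window

        # Align low-res features to the target grid, then gather k x k neighborhoods.
        feats_up = F.interpolate(feats, size=(H, W), mode="nearest")
        values = F.unfold(feats_up, k, padding=k // 2).view(B, C, k * k, H * W)

        keys = F.unfold(self.to_k(feats_up), k, padding=k // 2)
        keys = keys.view(B, -1, k * k, H * W)                  # (B, d, k*k, H*W)
        queries = self.to_q(guidance).view(B, -1, 1, H * W)    # (B, d, 1, H*W)

        # Scaled dot-product attention over the k*k neighbors of each pixel.
        attn = (queries * keys).sum(1) / keys.shape[1] ** 0.5  # (B, k*k, H*W)
        attn = attn.softmax(dim=1)

        # Output = convex combination of the *input* features -> stays in-domain.
        out = (values * attn.unsqueeze(1)).sum(dim=2)          # (B, C, H*W)
        return out.view(B, C, H, W)
```

Because the values are the untouched input features, this sketch is agnostic to both the feature dimensionality C and the target resolution (H, W).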
However, their features are of low resolution and many applications need pixel-wise features instead.
AnyUp can upsample any features of any dimensionality to any resolution.
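As a shape-level illustration of the "any dimensionality, any resolution" claim, using the hypothetical WindowAttentionUpsampler sketched above (not the actual repo API):

```python
import torch

feats = torch.randn(1, 384, 16, 16)    # low-res ViT-style patch features, C = 384
guidance = torch.randn(1, 3, 64, 64)   # the input image crop as high-res guidance
up = WindowAttentionUpsampler(guidance_dim=3)
hires = up(feats, guidance)            # -> (1, 384, 64, 64): same C, new resolution
```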
We do the simplest thing: just train a model (e.g., a next-token predictor) on all elements of the concatenated dataset [X,Y,Z].
You end up with a better model of dataset X than if you had trained on X alone!
6/9
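A minimal sketch of that recipe, assuming token-sequence datasets (X, Y, Z and the tiny model below are placeholders, not the paper's setup):

```python
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

VOCAB = 256

# Placeholder datasets of token sequences standing in for X, Y, Z.
X = TensorDataset(torch.randint(0, VOCAB, (1000, 33)))
Y = TensorDataset(torch.randint(0, VOCAB, (1000, 33)))
Z = TensorDataset(torch.randint(0, VOCAB, (1000, 33)))

class TinyLM(nn.Module):
    """A tiny next-token predictor; any autoregressive model works here."""
    def __init__(self, vocab=VOCAB, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, x):                    # x: (B, T) token ids
        h, _ = self.rnn(self.emb(x))
        return self.head(h)                  # (B, T, vocab) next-token logits

train = ConcatDataset([X, Y, Z])             # the whole recipe: just concatenate
loader = DataLoader(train, batch_size=64, shuffle=True)
model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for (tokens,) in loader:
    logits = model(tokens[:, :-1])           # predict token t+1 from tokens <= t
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```

The evaluation that matters is held-out loss on X alone: the claim is that the concatenated run beats the X-only run on that metric.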
📄Paper: arxiv.org/pdf/2506.05312
💻Code: github.com/odunkel/DIY-SC
🤗Demo: huggingface.co/spaces/odunk...
Great collaboration with @wimmerthomas.bsky.social, Christian Theobalt, Christian Rupprecht, and @adamkortylewski.bsky.social! [6/6]
wimmerth.github.io/gaussians2li...
Note that since I worked on this, open-source video diffusion models have improved significantly, which will directly improve the results of this method as well.
🧵⬇️
With limited resources, we can't fine-tune or retrain a VDM to be pose-conditioned. Thus, we propose a zero-shot technique to generate more 3D-consistent videos!
🧵⬇️
Instead, we propose to employ several pre-trained 2D models to directly lift motion from tracked points in the generated videos to 3D Gaussians.
🧵⬇️
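A geometric sketch of that lifting step, assuming a pinhole camera and placeholder outputs from the pre-trained 2D models (a point tracker and a monocular depth estimator); the nearest-neighbor motion transfer at the end is a crude stand-in, not the paper's method:

```python
import torch

def unproject(uv: torch.Tensor, depth: torch.Tensor, K: torch.Tensor) -> torch.Tensor:
    """Lift pixel coordinates (N, 2) with per-point depth (N,) to camera-space 3D."""
    pix = torch.cat([uv, torch.ones_like(depth)[:, None]], dim=1)  # (N, 3) homogeneous
    rays = (torch.inverse(K) @ pix.T).T                            # (N, 3) camera rays
    return rays * depth[:, None]                                   # scale rays by depth

# Placeholder per-frame model outputs: 2D tracks (T, N, 2) from a point tracker
# and depths (T, N) sampled from a monocular depth map at the track locations.
T, N = 16, 512
tracks = torch.rand(T, N, 2) * 512
depths = torch.rand(T, N) + 1.0
K = torch.tensor([[500., 0., 256.], [0., 500., 256.], [0., 0., 1.]])  # intrinsics

pts3d = torch.stack([unproject(tracks[t], depths[t], K) for t in range(T)])  # (T, N, 3)
motion = pts3d - pts3d[0]                                 # displacement per tracked point

# Each Gaussian inherits the displacement of its nearest tracked point in frame 0
# (a crude stand-in for a proper scattered interpolation of the motion field).
gaussians = torch.rand(4096, 3) * 2.0                     # placeholder Gaussian centers
nearest = torch.cdist(gaussians, pts3d[0]).argmin(dim=1)  # (4096,) nearest track index
gaussians_t = gaussians[None] + motion[:, nearest]        # (T, 4096, 3) animated centers
```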