Video models might well become vision foundation models.
In our new preprint, we show that generative video models can solve a wide range of tasks across the entire vision stack without being explicitly trained for them.
🌐 video-zero-shot.github.io
1/n