Ashkan Mirzaei
ashmrz.bsky.social
Research Scientist @Snap
Previously @UofT, @NVIDIAAI, @samsungresearch
Opinions are mine.

http://ashmrz.github.io
[7/9] 🧠 We use a camera token replacement trick to keep the camera poses temporally consistent, temporal attention layers to share info over time, and a "Gaussian head" to predict shape, scale, opacity, and color offsets.
June 24, 2025 at 2:13 PM
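To make the "Gaussian head" idea concrete, here is a minimal sketch of how raw per-pixel network outputs could be turned into valid Gaussian parameters. Everything specific here is an assumption for illustration and not taken from the thread: the 11-value layout, the quaternion encoding of shape, the exp/sigmoid activations, and the names `Gaussian` and `decode_gaussian` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    rotation: tuple  # unit quaternion (w, x, y, z), standing in for "shape"
    scale: tuple     # positive per-axis extents
    opacity: float   # value in (0, 1)
    color: tuple     # RGB in [0, 1]


def decode_gaussian(raw, base_color):
    """Map 11 unconstrained numbers to one Gaussian (hypothetical layout).

    raw[0:4]  -> rotation quaternion (normalized here)
    raw[4:7]  -> log-scales (exponentiated so scales stay positive)
    raw[7]    -> opacity logit (squashed with a sigmoid)
    raw[8:11] -> color offsets added to the pixel's base color
    """
    if len(raw) != 11:
        raise ValueError("expected 11 raw values")
    q = raw[0:4]
    norm = math.sqrt(sum(x * x for x in q))
    # Fall back to the identity rotation if the network outputs all zeros.
    rotation = tuple(x / norm for x in q) if norm > 0 else (1.0, 0.0, 0.0, 0.0)
    scale = tuple(math.exp(x) for x in raw[4:7])
    opacity = 1.0 / (1.0 + math.exp(-raw[7]))
    # Offsets nudge the observed pixel color; clamp to the valid range.
    color = tuple(min(1.0, max(0.0, c + o)) for c, o in zip(base_color, raw[8:11]))
    return Gaussian(rotation, scale, opacity, color)
```

Predicting offsets rather than absolute colors means the head only has to correct the pixel color that the generated video already provides, which is one plausible reason for that design.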
[4/9] 🧠 How it works – Stage 1 (Generation):
We fuse spatial and temporal attention into one transformer layer. This view-time attention lets our diffusion model reason across viewpoints and frames jointly, without extra parameters. Parameter efficiency also leads to more stability.
June 24, 2025 at 2:13 PM
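One possible reading of the fused view-time attention, as a toy sketch: instead of running separate attention passes over views and over frames, flatten the whole view-by-frame grid into one sequence and attend once. This is an assumption for illustration, not the model's actual layer; it uses a single head with identity projections (no learned weights), and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(tokens):
    """Single-head self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def fused_view_time_attention(grid):
    """grid[v][t] is the feature vector for view v at frame t.

    All (view, frame) cells are flattened into one sequence, so each cell
    attends to every other viewpoint and every other frame in one pass.
    """
    n_views, n_frames = len(grid), len(grid[0])
    flat = [grid[v][t] for v in range(n_views) for t in range(n_frames)]
    mixed = attention(flat)
    # Restore the original view-by-frame layout.
    return [[mixed[v * n_frames + t] for t in range(n_frames)] for v in range(n_views)]
```

Because one attention operation covers both axes, no second set of attention weights is needed for the other axis, which is consistent with the "without extra parameters" claim in the post.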
[2/9] We generate synchronized multi-view video grids, then lift them into 4D geometry using a fast feedforward network. The result is a set of Gaussian particles, ready for rendering, exploration, and editing.
June 24, 2025 at 2:13 PM
[1/9] 🚀 We introduce 4Real-Video-V2, a method that can generate 4D scenes from a simple text prompt, viewable from any angle at any moment in time. It’s fast, photorealistic, and works on full scenes. Here's how it works and why it matters. 👇

snap-research.github.io/4Real-Video-...
June 24, 2025 at 2:13 PM