chrishoang.com
Feature similarity between the propagated and original perturbation intuitively matches object correspondence!
We can also repeat this over multiple frames for tracking
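To make the similarity and tracking idea concrete, here is a toy sketch and not the Midway Network implementation. It assumes dense features are stored as a grid of per-location vectors and that cosine similarity is the comparison measure; the thread specifies neither. The helper names `cosine`, `similarity_map`, `best_match`, and `track` are hypothetical.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors (assumed measure).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def similarity_map(query, feature_map):
    # Compare one query vector against every location of an H x W grid.
    return [[cosine(query, f) for f in row] for row in feature_map]

def best_match(sim):
    # Return the (row, col) of the most similar location.
    _, r, c = max(
        (sim[r][c], r, c)
        for r in range(len(sim))
        for c in range(len(sim[0]))
    )
    return r, c

def track(query, frames):
    # Repeat the matching frame by frame, reusing the matched feature
    # as the next query (a simple stand-in for multi-frame tracking).
    path = []
    for feature_map in frames:
        r, c = best_match(similarity_map(query, feature_map))
        path.append((r, c))
        query = feature_map[r][c]
    return path

# Toy 2 x 2 feature grids with 2-dimensional features.
frame_a = [[[0.0, 1.0], [1.0, 0.1]],
           [[-1.0, 0.0], [0.5, 0.5]]]
frame_b = [[[0.0, 1.0], [-1.0, 0.0]],
           [[1.0, 0.2], [0.5, 0.5]]]
path = track([1.0, 0.0], [frame_a, frame_b])
```

In this toy case the match moves from one grid location to another between the two frames, which is the kind of correspondence the similarity map is meant to expose.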
It even outperforms our prior work PooDLe (bsky.app/profile/meng...) without needing an external optical flow network!
We also introduce learnable gating units on residual paths of forward predictors to remove bias towards the identity mapping
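As a rough illustration of why a gate on the residual path reduces the pull towards the identity mapping, here is a minimal sketch. It assumes the gate is a single sigmoid-squashed scalar applied to the skip connection; the thread does not describe the actual gating form, and `gated_residual` and `gate_logit` are hypothetical names.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_residual(x, branch_out, gate_logit):
    # Standard residual: y = x + f(x), which passes x through unchanged
    # whenever f(x) is small. Here the skip path is scaled by a gate in
    # (0, 1), so training can shrink the identity contribution.
    g = sigmoid(gate_logit)  # gate_logit would be a learned parameter
    return [g * xi + bi for xi, bi in zip(x, branch_out)]

x = [1.0, 2.0]            # input features
branch = [0.5, 0.5]       # stand-in for the predictor branch output f(x)

half_open = gated_residual(x, branch, 0.0)     # gate = 0.5
closed = gated_residual(x, branch, -50.0)      # gate near 0: output ~ f(x)
opened = gated_residual(x, branch, 50.0)       # gate near 1: output ~ x + f(x)
```

With the gate nearly closed the output depends almost entirely on the branch, so the predictor cannot fall back on copying its input.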
It predicts future dense latent features from visual encoders, conditioned on these motion latents (forward dynamics).
Inspired by this, we asked: can latent dynamics modeling learn useful representations of visual observations and their transformations over time, i.e. motion?
Our #ICLR2026 work Midway Network is the first to learn both recognition and motion understanding from videos via latent dynamics 🧵