This should also be called "l'appel de la catastrophe" ...
arxiv.org/abs/2507.01667
openaccess.thecvf.com/content_cvpr...
arxiv.org/abs/2307.16710
arxiv.org/abs/2402.14817
(Cameras as rays by CMU, ICLR 2024)
arxiv.org/abs/2504.14151
(Nice paper by FAIR/Meta, but I think the Figure could have shown in more detail where Q, K, and V go)
1. When architecture information is combined with a Figure showing what actually happens with the data:
(between Lyon and Grenoble).
😂
R. Guerrier, @adamharley.bsky.social, @dimadamen.bsky.social
Bristol/Meta
rhodriguerrier.github.io/PointSt3R/
TLDR: the authors suggest that rendering and physics simulation are not dissociated; both are necessary for world models.
arxiv.org/abs/2510.208...
Luo et al.
arxiv.org/abs/2510.19266
Wang et al., NUS, UTA
- hierarchical,
- backwards features with cross-attention.
arxiv.org/abs/2510.05558
C. Hoang, @mengyer.bsky.social
NYU
Today: the Ripley buffer.
(Repost from the old site from Oct. 2024, but I just rewatched Aliens for probably the 15th time or so)
Imagine a 747 at full thrust 24h/day in your neighborhood.
Interestingly, even from training with relative pose only, we can show through probing experiments that occupancy maps are encoded in the maintained memory.
9/9
8/9
We train on sequences of length T=100 and show generalization up to T=800 and T=1000, which we believe is so far unheard of.
7/9
- relative pose estimation compares the camera poses of two images
- Kinaema estimates the relative pose between an image and agent memory, where memory holds the scene and the current agent position.
6/9
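For readers unfamiliar with the first bullet, here is a minimal sketch of what "relative pose between two camera poses" means in the simplest planar case. It is only an illustration: the `(x, y, yaw)` representation and the `relative_pose` helper are assumptions made for this example, not something from the paper, and it says nothing about how Kinaema estimates the pose against its memory (that part is learned).

```python
import math

def relative_pose(pose_a, pose_b):
    """Express planar pose b in the frame of pose a.

    Each pose is a hypothetical (x, y, yaw) tuple in a shared world frame,
    with yaw in radians. This is the classical geometric definition, used
    here only to illustrate the quantity being estimated.
    """
    xa, ya, yaw_a = pose_a
    xb, yb, yaw_b = pose_b

    # World-frame offset from a to b.
    dx, dy = xb - xa, yb - ya

    # Rotate the offset into a's frame (multiply by the transpose of R_a).
    c, s = math.cos(yaw_a), math.sin(yaw_a)
    rel_x = c * dx + s * dy
    rel_y = -s * dx + c * dy

    # Heading difference, wrapped to (-pi, pi].
    d_yaw = yaw_b - yaw_a
    rel_yaw = math.atan2(math.sin(d_yaw), math.cos(d_yaw))

    return rel_x, rel_y, rel_yaw


# Both cameras face "north" (yaw = pi/2); b is 2 units further north than a,
# so in a's own frame b lies straight ahead along the forward (x) axis.
print(relative_pose((1.0, 0.0, math.pi / 2), (1.0, 2.0, math.pi / 2)))
```

In the two-image setting both poses come from observed views; in the setting described above, one side is replaced by the agent's memory.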