bchidlovskii.bsky.social
Reposted
In a new paper led by Gianluca Monaci, with @weinzaepfelp.bsky.social and myself, we explore the relationship between relative pose estimation and image goal navigation and study different architectures: late fusion, channel concatenation (with or without space-to-depth), and cross-attention.

arxiv.org/abs/2507.01667

🧵1/5
July 4, 2025 at 5:00 PM
Reposted
We find evidence that the agent has a plan structured on the level of paths and that its estimate of success goes beyond the effect of the next action. Abandoning a navigation option for a more promising one increases the value estimate, as the agent now expects a higher future return.
#CVPR2025
March 12, 2025 at 8:49 AM