valeo.ai
@valeoai.bsky.social
We are a research team working on artificial intelligence for automotive applications, toward assisted and autonomous driving.
--> https://valeoai.github.io/ <--
Pinned
1/🧵 Q: Can we have both a simple and SOTA architecture in autonomous driving?
A: Yes! 😁
Introducing Driving on Registers (DrivoR):
a pure Transformer backbone that achieves SOTA results in NAVSIM v1 / v2 and closed-loop HUGSIM evaluation.
Here is how 👇
Reposted by valeo.ai
IPA: An Information-Preserving Input Projection Framework for Efficient Foundation Model Adap...

Yuan Yin, Shashanka Venkataramanan, Tuan-Hung Vu, Andrei Bursuc, Matthieu Cord

Action editor: Ofir Lindenbaum

https://openreview.net/forum?id=aLmQeZx2pR

#projector #adaptation
February 3, 2026 at 5:19 AM
Reposted by valeo.ai
The unreasonable magic of simplicity!
Meet DrivoR (Driving on Registers): our latest end2end autonomous driving model.
We tore down complex dependencies & modules from current models to
obtain a pure Transformer-based SOTA driving agent (NAVSIM v1 & v2, HUGSIM).
Find out more 👇
1/🧵 Q: Can we have both a simple and SOTA architecture in autonomous driving?
A: Yes! 😁
Introducing Driving on Registers (DrivoR):
a pure Transformer backbone that achieves SOTA results in NAVSIM v1 / v2 and closed-loop HUGSIM evaluation.
Here is how 👇
January 9, 2026 at 5:02 PM
7/ 📄 Read the paper & get the code: valeoai.github.io/driving-on-r...

Congratulations to the whole team!
January 9, 2026 at 5:00 PM
6/ Furthermore, this scoring architecture allowed us to tweak the agent's behavior.

We were able to induce a more passive, safer driving style, which proved important for reaching SOTA performance on the rigorous NAVSIM-v2 benchmark. 🛡️
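As a toy illustration of why a scoring head makes such tweaks easy (our sketch, not DrivoR's code): re-weight the candidate scores with a hand-designed penalty before taking the argmax.

```python
# Toy sketch: nudge a scoring-based agent toward a more passive style by
# penalizing aggressive candidates before selecting the best trajectory.
import torch

def pick_trajectory(trajectories: torch.Tensor, scores: torch.Tensor,
                    passivity: float = 0.5) -> torch.Tensor:
    # trajectories: (N, T, 2) candidate waypoints; scores: (N,) model scores.
    speeds = trajectories.diff(dim=1).norm(dim=-1).mean(dim=-1)  # avg step size
    adjusted = scores - passivity * speeds   # higher passivity => calmer picks
    return trajectories[adjusted.argmax()]
```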
January 9, 2026 at 4:57 PM
5/ Given the success of trajectory scoring methods (like GTRS), we dove deep into the scoring module.
Thanks to the wizardry of Yihong Xu, we discovered that disentangling the tokens used for generation from those used for scoring was key.
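What "disentangling" can look like in code, as a minimal sketch with our own names and sizes (not the paper's exact design): two separate learnable query sets share the backbone features, one for generating waypoints and one for scoring candidates.

```python
import torch
import torch.nn as nn

class DisentangledHead(nn.Module):
    """Sketch: separate learnable queries for generating vs scoring trajectories."""
    def __init__(self, dim=256, n_gen=8, n_candidates=64, n_waypoints=8):
        super().__init__()
        self.gen_queries = nn.Parameter(torch.randn(1, n_gen, dim) * 0.02)
        self.score_queries = nn.Parameter(torch.randn(1, n_candidates, dim) * 0.02)
        layer = nn.TransformerDecoderLayer(dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.waypoint_head = nn.Linear(dim, 2 * n_waypoints)
        self.score_head = nn.Linear(dim, 1)
        self.n_waypoints = n_waypoints

    def forward(self, scene_tokens):
        # scene_tokens: (B, N, dim) register/scene features from the backbone.
        b = scene_tokens.size(0)
        gen = self.decoder(self.gen_queries.expand(b, -1, -1), scene_tokens)
        sco = self.decoder(self.score_queries.expand(b, -1, -1), scene_tokens)
        waypoints = self.waypoint_head(gen.mean(1)).view(b, self.n_waypoints, 2)
        scores = self.score_head(sco).squeeze(-1)   # one score per candidate
        return waypoints, scores
```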
January 9, 2026 at 4:56 PM
4/ This mimics human driving intuition! 🧠
We pay max attention to the road ahead (front camera), while only occasionally glancing at the rear (back camera).
Visualizing the attention maps confirms this: front tokens specialize; back tokens collapse to a single pattern.
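In code, the check boils down to something like this sketch (assuming access to a layer's attention weights and each camera's key-token range):

```python
import torch

def camera_attention_share(attn, cam_slices):
    # attn: (B, heads, Q, K) attention weights from one transformer layer;
    # cam_slices: dict camera_name -> slice over the key axis.
    mass = attn.mean(dim=(0, 1, 2))   # average attention received per key token
    return {cam: mass[s].sum().item() for cam, s in cam_slices.items()}
```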
January 9, 2026 at 4:56 PM
3/ These registers act as "scene-tokens" and demonstrate signs of learned compression.
Cosine similarity analysis reveals high differentiation for the front camera, while representations progressively "collapse" as we move toward the back camera.
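A minimal sketch of that analysis, assuming `registers` maps each camera to its (num_registers, dim) token features:

```python
import torch
import torch.nn.functional as F

def register_similarity(registers: dict[str, torch.Tensor]) -> dict[str, float]:
    stats = {}
    for cam, tokens in registers.items():
        z = F.normalize(tokens, dim=-1)           # unit-norm each register
        sim = z @ z.T                             # pairwise cosine similarities
        n = sim.size(0)
        off_diag = sim[~torch.eye(n, dtype=torch.bool)]
        stats[cam] = off_diag.mean().item()       # ~1.0 => tokens collapsed
    return stats

# High mean similarity for a camera means its registers encode near-identical
# information (collapse); low similarity means they specialize.
```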
January 9, 2026 at 4:56 PM
2/ We explored how best to use a pre-trained ViT as the image encoder.
We imbue DINOv2 with register tokens and LoRA-finetune it on driving data, reducing the number of patch tokens by over 250× via camera-aware registers.
This efficiency could benefit future work on VLMs for driving.
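To make the compression concrete, a hedged sketch (sizes and names are ours, not the paper's): a few learnable, camera-specific registers cross-attend to the ViT patch tokens and replace them downstream.

```python
import torch
import torch.nn as nn

class CameraRegisters(nn.Module):
    def __init__(self, dim=768, n_registers=4, n_cameras=3):
        super().__init__()
        # One small set of learnable registers per camera.
        self.registers = nn.Parameter(torch.randn(n_cameras, n_registers, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, patch_tokens, cam_idx):
        # patch_tokens: (B, N_patches, dim) from a LoRA-finetuned DINOv2.
        b = patch_tokens.size(0)
        q = self.registers[cam_idx].unsqueeze(0).expand(b, -1, -1)
        out, _ = self.attn(q, patch_tokens, patch_tokens)
        # e.g. 1024+ patches -> 4 registers: >250x fewer tokens downstream.
        return out
```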
January 9, 2026 at 4:55 PM
1/🧵 Q: Can we have both a simple and SOTA architecture in autonomous driving?
A: Yes! 😁
Introducing Driving on Registers (DrivoR):
a pure Transformer backbone that achieves SOTA results in NAVSIM v1 / v2 and closed-loop HUGSIM evaluation.
Here is how 👇
January 9, 2026 at 4:55 PM
Our @spyrosgidaris.bsky.social is speaking this morning (Wed, Dec 10th, 11:00 am Paris time) about "Latent Representations for Better Generative Image Modeling" in the Hi! PARIS - ELLIS monthly seminar.

The talk will be live-streamed: www.hi-paris.fr/2025/09/26/a...
AI Seminar Cycle – Hi! PARIS
www.hi-paris.fr
December 10, 2025 at 9:15 AM
Perfect timing for this keynote on open, re-purposable foundation models at #aiPULSE2025
@abursuc.bsky.social taking the stage this afternoon! 👇
I'm speaking at #aiPULSE2025 today on Open & re-purposable foundation models for the automotive industry.
The morning keynotes talked a lot about open source so my slide here might be timely.
December 4, 2025 at 12:14 PM
Find out more about all these works at the posters, over a coffee or, if you're shy, on our webpage: valeoai.github.io/posts/neurip...
valeo.ai at NeurIPS 2025 | valeo.ai - valeo.ai research page
Loïck Chambon, Spyros Gidaris, Andrei Bursuc, Eloi Zablocki
valeoai.github.io
December 3, 2025 at 10:52 PM
IPA: An Information-Preserving Input Projection Framework for Efficient Foundation Model Adaptation

by: Y. Yin, S. Venkataramanan, T.H. Vu, A. Bursuc, M. Cord
📄: arxiv.org/abs/2509.04398

tl;dr: a PEFT method that improves upon LoRA by explicitly preserving information in the low-rank space
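A rough sketch of the core idea as we read it, approximating the information-preserving projection with PCA (the paper's exact procedure may differ):

```python
import torch

def information_preserving_A(features: torch.Tensor, rank: int) -> torch.Tensor:
    # features: (num_tokens, dim) inputs to the layer being adapted.
    _, _, V = torch.pca_lowrank(features, q=rank)   # top principal directions
    # x @ V keeps maximal variance in the rank-r space, i.e. it preserves
    # information; vanilla LoRA would draw this projection at random.
    return V                                        # (dim, rank)

# B (rank -> dim) still starts at zero and is trained as usual.
```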
December 3, 2025 at 10:52 PM
Multi-Token Prediction Needs Registers

by: A. Gerontopoulos, S. Gidaris, N. Komodakis
📄: arxiv.org/abs/2505.10518

tl;dr: a simple way to enable multi-token prediction in LLMs by interleaving learnable "register tokens" into the input sequence to forecast future targets.
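A hedged sketch of the interleaving trick (our naming and shapes; details differ in the paper):

```python
import torch
import torch.nn as nn

class MTPRegisters(nn.Module):
    def __init__(self, dim: int, n_future: int = 2):
        super().__init__()
        # One register embedding per extra future offset (t+2, t+3, ...).
        self.registers = nn.Parameter(torch.randn(n_future, dim) * 0.02)

    def interleave(self, token_embs: torch.Tensor) -> torch.Tensor:
        # token_embs: (B, T, dim). After each position t, insert registers whose
        # LM-head targets are the tokens further ahead (x[t+2], x[t+3], ...).
        b, t, d = token_embs.shape
        regs = self.registers.expand(b, t, -1, -1)          # (B, T, n_future, dim)
        mixed = torch.cat([token_embs.unsqueeze(2), regs], dim=2)
        return mixed.reshape(b, t * (1 + regs.size(2)), d)
```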
December 3, 2025 at 10:51 PM
Boosting Generative Image Modeling via Joint Image-Feature Synthesis

by: T. Kouzelis, E. Karypidis, I. Kakogeorgiou, S. Gidaris, N. Komodakis
📄: arxiv.org/abs/2504.16064

tl;dr: improve generation w/ a single diffusion model to jointly synthesize low-level latents & high-level semantic features
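One way to picture the joint objective in code, as a toy sketch with assumed shapes (not the paper's implementation):

```python
import torch
import torch.nn.functional as F

def joint_diffusion_loss(model, img_latents, sem_feats, t):
    # img_latents: (B, C1, H, W), e.g. VAE latents;
    # sem_feats: (B, C2, H, W), e.g. DINO features projected to match H, W.
    x = torch.cat([img_latents, sem_feats], dim=1)   # joint denoising target
    noise = torch.randn_like(x)
    noisy = x + noise * t.view(-1, 1, 1, 1)          # toy noising schedule
    pred = model(noisy, t)                           # one model denoises both
    return F.mse_loss(pred, noise)
```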
December 3, 2025 at 10:51 PM
Learning to Steer: Input-dependent Steering for Multimodal LLMs

by: J. Parekh, P. Khayatan, M. Shukor, A. Dapogny, A. Newson, M. Cord
📄: arxiv.org/abs/2508.12815

tl;dr: steering multimodal LLMs (MLLMs) by training a lightweight auxiliary module to predict input-specific steering vectors
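A minimal sketch of such a module (our naming, not the paper's API): map the input's pooled hidden state to a steering vector added at a chosen layer.

```python
import torch
import torch.nn as nn

class SteeringModule(nn.Module):
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(),
                                 nn.Linear(hidden, dim))

    def forward(self, hidden_states: torch.Tensor, alpha: float = 1.0):
        # hidden_states: (B, T, dim) at the intervention layer.
        v = self.net(hidden_states.mean(dim=1))       # input-specific vector
        return hidden_states + alpha * v.unsqueeze(1)
```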
December 3, 2025 at 10:51 PM
DINO-Foresight: Looking into the Future with DINO

by E. Karypidis, I. Kakogeorgiou, S. Gidaris, N. Komodakis
📄: arxiv.org/abs/2412.11673

tl;dr: self-supervision by predicting future scene dynamics in the semantic feature space of foundation models (like DINO) rather than generating costly pixels.
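The training signal, sketched under our assumptions (a frozen `encoder` such as DINO, and a `predictor` to be trained):

```python
import torch
import torch.nn.functional as F

def foresight_loss(predictor, encoder, frames_past, frame_future):
    # Regress the *features* of the next frame instead of its pixels.
    with torch.no_grad():
        target = encoder(frame_future)                            # (B, N, dim)
        context = torch.stack([encoder(f) for f in frames_past], dim=1)
    pred = predictor(context)                # forecast future feature tokens
    return F.smooth_l1_loss(pred, target)
```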
December 3, 2025 at 10:50 PM
JAFAR: Jack up Any Feature at Any Resolution

by P. Couairon, L. Chambon, L. Serrano, M. Cord, N. Thome
📄: arxiv.org/abs/2506.11136

tl;dr: lightweight, flexible, plug & play upsampler that scales features from any vision foundation model to arbitrary resolutions w/o needing high-res supervision
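Conceptually (not JAFAR's actual code), the recipe can be pictured as image-derived queries attending to low-res features:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUpsampler(nn.Module):
    def __init__(self, feat_dim: int = 768, qdim: int = 64):
        super().__init__()
        self.q_proj = nn.Conv2d(3, qdim, 3, padding=1)  # queries from the image
        self.k_proj = nn.Linear(feat_dim, qdim)         # keys from ViT features

    def forward(self, image, lr_feats, out_size):
        # image: (B, 3, H, W); lr_feats: (B, C, h, w) -> (B, C, *out_size)
        b, c, _, _ = lr_feats.shape
        q = F.interpolate(self.q_proj(image), size=out_size, mode="bilinear")
        q = q.flatten(2).transpose(1, 2)                       # (B, H'W', qdim)
        k = self.k_proj(lr_feats.flatten(2).transpose(1, 2))   # (B, hw, qdim)
        v = lr_feats.flatten(2).transpose(1, 2)                # (B, hw, C)
        attn = torch.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        return (attn @ v).transpose(1, 2).reshape(b, c, *out_size)
```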
December 3, 2025 at 10:50 PM
Check out our works at @NeurIPSConf #NeurIPS2025 this week!
We present 5 full papers + 1 workshop about:
💡 self-supervised & representation learning
🖼️ generative image models
🧠 finetuning and understanding LLMs & multimodal LLMs
🔎 feature upsampling

valeoai.github.io/posts/neurip...
December 3, 2025 at 10:50 PM
Reposted by valeo.ai
We fermented our thoughts on understanding LoRA & ended up with IPA 🍺
We found an asymmetry in LoRA: during training, A changes little & B absorbs most of the task-specific adaptation.
So we pre-train A to preserve information before adaptation, w/ excellent parameter efficiency #NeurIPS2025 #CCFM 👇
1/Serve your PEFT with a fresh IPA! 🍺
Finetuning large models is cheaper thanks to LoRA, but is its random init optimal? 🤔
Meet IPA: a feature-aware alternative to random projections
#NeurIPS2025 WS #CCFM Oral+Best Paper
Work w/
S. Venkataramanan @tuanhungvu.bsky.social @abursuc.bsky.social M. Cord
🧵
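The asymmetry is easy to probe yourself. A minimal sketch (our simplification; in vanilla LoRA, B starts at zero, so we compare update magnitudes):

```python
import torch

def lora_update_norms(A_init: torch.Tensor, A: torch.Tensor, B: torch.Tensor):
    # A_init: A at initialization; A, B: after finetuning (B starts at zero).
    # In our reading, B's update dwarfs A's, motivating pre-training A instead.
    return (A - A_init).norm().item(), B.norm().item()
```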
December 2, 2025 at 11:16 AM
Reposted by valeo.ai
1/Serve your PEFT with a fresh IPA! 🍺
Finetuning large models is cheaper thanks to LoRA, but is its random init optimal? 🤔
Meet IPA: a feature-aware alternative to random projections
#NeurIPS2025 WS #CCFM Oral+Best Paper
Work w/
S. Venkataramanan @tuanhungvu.bsky.social @abursuc.bsky.social M. Cord
🧵
December 2, 2025 at 11:11 AM
Reposted by valeo.ai
That was a cool project brilliantly led by Ellington Kirby during his internship.
We were curious if we could train diffusion models on sets of point coordinates.

For images, this is a step towards spatial diffusion, with pixels reorganizing themselves instead of diffusing in RGB value space only.
LOGen: Toward Lidar Object Generation by Point Diffusion

by: E. Kirby, @mickaelchen.bsky.social, R. Marlet, N. Samet

tl;dr: a diffusion-based method producing lidar point clouds of dataset objects, with an extensive control of the generation

📄 arxiv.org/abs/2412.07385
Code: ✅
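The idea in miniature, as a toy flow-matching-style sketch under our assumptions (LOGen's actual model is more elaborate and conditioned):

```python
import torch
import torch.nn.functional as F

def train_step(denoiser, points):
    # points: (B, N, 3) lidar object points, normalized. We diffuse the point
    # *coordinates* themselves, so points rearrange spatially during sampling.
    t = torch.rand(points.size(0), 1, 1)           # random times in [0, 1)
    noise = torch.randn_like(points)
    noisy = (1 - t) * points + t * noise           # linear interpolant
    velocity = noise - points                      # flow-matching target
    pred = denoiser(noisy, t.squeeze())            # model predicts the velocity
    return F.mse_loss(pred, velocity)
```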
November 26, 2025 at 1:19 PM
Reposted by valeo.ai
Check out NAF: an effective ViT feature upsampler to produce excellent (and eye-candy) pixel-level feature maps.

NAF outperforms both VFM-specific upsamplers (FeatUp, JAFAR) and VFM-agnostic methods (JBU, AnyUp) over multiple downstream tasks 👇
Need pixel-level features from your backbone (DINOv3, CLIP, RADIO, FRANCA...)?

🚀 Introducing NAF: A universal, zero-shot feature upsampler.

It turns low-res ViT features into pixel-perfect maps.

- ⚡ Model-agnostic
- 🥇 SoTA results
- 🚀 4× faster than SoTA
- 📈 Scales up to 2K res
November 25, 2025 at 6:36 PM
📢 NAF is fully open-source!

The repo contains:
✅ Pretrained model
✅ Example notebooks
✅ Evaluation and training code

Check it out & ⭐ the repo: github.com/valeoai/NAF
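Hypothetical usage, to give the flavor (names below are ours; the real API lives in the repo):

```python
# Hypothetical usage sketch; see github.com/valeoai/NAF for the actual API.
import torch

image = torch.rand(1, 3, 896, 896)        # any image, any backbone
lr_feats = torch.rand(1, 1024, 64, 64)    # low-res ViT features (e.g. DINOv3)
# upsampler = NAF(...)                    # hypothetical constructor
# hr_feats = upsampler(image, lr_feats)   # -> (1, 1024, 896, 896), zero-shot
```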
November 25, 2025 at 10:44 AM