Johan Edstedt
parskatt.bsky.social
PhD student @ Linköping University

I like 3D vision and training neural networks.
Code: https://github.com/parskatt
Weights: https://github.com/Parskatt/storage/releases/tag/roma
February 6, 2026 at 10:05 PM
Reposted by Johan Edstedt
Take me down to the Parallax city where the far moves slow and the near moves quickly
February 1, 2026 at 3:40 PM
Is there any current 3D/4D reconstruction method able to accurately reconstruct this scene (with the correct size of the thunderstorm)?
~ chasing a beautiful supercell thunderstorm across south-central Nebraska on July 1st of 2024 ~
January 30, 2026 at 1:08 PM
regardless of your views on AI, i strongly agree with this viewpoint! you are not funded by taxpayers to optimize your knowledge consumption workflows. it's good that you enjoy your job, you are getting paid to enjoy your job
regardless of your views on AI, i strongly disagree with this viewpoint! you are funded by taxpayers to perform an important service. it's good that you enjoy your job, but are not getting paid to enjoy your job
January 26, 2026 at 6:35 PM
January 21, 2026 at 4:09 PM
Correspondence is a much prettier word than match.

Pixel correspondence makes more sense than feature matching (what is a feature?).
January 13, 2026 at 10:30 AM
you're welcome
January 12, 2026 at 3:11 PM
Can someone please explain why 3DV papers are consistently interesting and ICCV/CVPR (3D) papers are consistently boring?
Reviewing for CVPR is sadly very boring.
January 9, 2026 at 2:21 PM
Something I predict for academic papers is that (complex) implementations/proofs will get significantly less credit than previously due to LLMs.

Not sure if this will be a good or bad thing.
January 7, 2026 at 4:59 PM
Have to say I like the "Nice ---" style of GPT models.
December 19, 2025 at 9:28 AM
Sad to report that I now have the type of brain damage where you have to write 1024 instead of 1000.
December 17, 2025 at 12:55 PM
Since this take is so completely wrong I have to repost as well.
The posted figure is only showing the top-50.
No shit it's gonna show the biggest companies and unis.

There's a productive conversation to be had, but this ain't it.

aiworld.eu/story/from-b...
December 15, 2025 at 3:09 PM
At this point I'm extremely allergic to any non top-level defaults.
December 11, 2025 at 1:33 PM
CVPR submissions now have this redacted look, presumably due to previous OpenReview leaks.
December 9, 2025 at 11:36 AM
Not sure if anyone noticed, but pycolmap CUDA wheels have been available for about a month:

pypi.org/project/pyco...
pycolmap-cuda12: COLMAP bindings (pypi.org)
December 4, 2025 at 7:39 AM
December 1, 2025 at 10:40 PM
My predictions:

- debacle will not impact reviewing in a significant way
- ratings will likely be similar next year
- peer review will continue to struggle with review load
November 30, 2025 at 1:35 PM
Oof
November 28, 2025 at 2:52 PM
November 27, 2025 at 6:42 AM
Reposted by Johan Edstedt
SPIDER: Spatial Image CorresponDence Estimator for Robust Calibration

Zhimin Shao, Abhay Yadav, Rama Chellappa, Cheng Peng

tl;dr: 3D VFM+2D ConvNet->feature extraction backbone; 3D descriptor head (for geometry)+2D warp head (for pattern) fusion

arxiv.org/abs/2511.17750
November 25, 2025 at 3:11 PM
Pixel reconstruction has recently been somewhat overshadowed by latent SSL approaches such as DINO.

However, for 3D tasks we show that a scaled and simplified version of multi-view MAE (which we call MuM) can outperform DINOv3, all while using orders of magnitude less compute!
We are introducing MuM, a feature encoder (ViT-L) tailored for 3D vision tasks.

TL;DR: Spiritual successor to CroCo with a simpler multi-view objective and larger scale. Beats DINOv3 and CroCo v2 in RoMa, feedforward reconstruction, and relative pose.

arxiv.org/abs/2511.17309
github.com/davnords/mum
November 24, 2025 at 11:13 AM
November 21, 2025 at 3:20 PM
Put up my simple sky segmentation model on GitHub at github.com/Parskatt/sky...

Results are pretty crisp, but it doesn't really deal with clouds (it's literally just a linear model on top of some coarse segmentation output).
November 21, 2025 at 3:11 PM
RoMa v2 is now out! (github.com/Parskatt/rom..., arxiv.org/abs/2511.15706)

Here are the main improvements we made since RoMa:
November 20, 2025 at 9:25 AM
Can someone more familiar with sota diffusion tell me what's currently typically used, and does it matter at scale?
November 18, 2025 at 1:21 PM