Karim Farid
@kifarid.bsky.social
PhD student @ELLIS.eu @UniFreiburg with Thomas Brox and Cordelia Schmid

Understanding intelligence and cultivating its societal benefits

https://kifarid.github.io
Idk, but maybe not necessarily: we observe discrete tokens, but the language states themselves can live in a continuous world.
October 14, 2025 at 12:43 PM
Generative models that assume the underlying distribution is continuous, for example, flow matching and common diffusion models.
October 13, 2025 at 2:20 PM
Orbis shows that the objective matters.
Continuous modeling yields more stable and generalizable world models, yet true probabilistic coverage remains a challenge.

Immensely grateful to my co-authors @arianmousakhan.bsky.social, Sudhanshu Mittal, and Silvio Galesso, and to @thomasbrox.bsky.social
October 12, 2025 at 3:51 PM
Under the hood 🧠

Orbis uses a hybrid tokenizer with semantic + detail tokens that work in both continuous and discrete spaces.
The world model then predicts the next frame by gradually denoising or unmasking it, using past frames as context.
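Roughly what one prediction step of the continuous variant could look like in code. This is a minimal sketch, not the actual Orbis implementation; the tokenizer/model interfaces, shapes, and step count here are assumptions for illustration:

```python
import torch

def predict_next_frame(world_model, tokenizer, past_frames, n_steps=50):
    """Sketch: predict the next frame by iteratively denoising its latent,
    conditioned on latents of past frames (interfaces are assumptions)."""
    with torch.no_grad():
        # Encode context frames into continuous latents (semantic + detail tokens).
        context = tokenizer.encode(past_frames)       # (B, T, N, D)

        # Start the next frame's latent from pure noise.
        z = torch.randn_like(context[:, -1])          # (B, N, D)

        # Euler integration of the learned velocity field from noise (t=0) to data (t=1).
        dt = 1.0 / n_steps
        for i in range(n_steps):
            t = torch.full((z.shape[0],), i * dt, device=z.device)
            v = world_model(z, t, context)            # predicted velocity
            z = z + dt * v

        # Decode the denoised latent back to pixels.
        return tokenizer.decode(z)
```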
October 12, 2025 at 3:31 PM
Realistic and Diverse Rollouts 4/4
October 12, 2025 at 3:26 PM
Realistic and Diverse Rollouts 3/4
October 12, 2025 at 3:25 PM
Realistic and Diverse Rollouts 2/4
October 12, 2025 at 3:25 PM
Realistic and Diverse Rollouts 1/4
October 12, 2025 at 3:25 PM
While other models drift or blur on turns, Orbis stays on track — generating realistic, stable futures beyond the training horizon.

On our curated nuPlan-turns dataset, Orbis achieves better FVD, precision, and recall, capturing both visual and dynamics realism.
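FVD isn't defined in the thread; as background, it is the Fréchet distance between Gaussians fit to video features (typically from an I3D network). A minimal sketch of that distance, with the feature extraction assumed to happen elsewhere:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fit to two feature sets.
    feats_* are (num_videos, feat_dim) arrays of video features."""
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    covmean = sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(covmean):   # numerical noise can add tiny imaginary parts
        covmean = covmean.real

    return float(np.sum((mu_r - mu_g) ** 2) + np.trace(cov_r + cov_g - 2 * covmean))
```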
October 12, 2025 at 3:18 PM
We ask how continuous vs. discrete models and their tokenizers shape long-horizon behavior.

Findings:
Continuous models (flow matching):
• Are far less brittle to design choices
• Produce realistic, stable rollouts up to 20s
• Generalize better to unseen driving conditions

Continuous > Discrete (the two objectives are sketched below)
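A rough sketch of the two training objectives being compared. These are generic formulations, not the exact Orbis losses; the model signatures and the masking scheme are assumptions:

```python
import torch
import torch.nn.functional as F

def flow_matching_loss(model, z1, context):
    """Continuous objective (sketch): regress the velocity that carries
    noise z0 to the clean next-frame latent z1 along a straight path."""
    z0 = torch.randn_like(z1)
    t = torch.rand(z1.shape[0], device=z1.device)
    t_ = t.view(-1, *([1] * (z1.dim() - 1)))
    zt = (1 - t_) * z0 + t_ * z1                 # point on the interpolation path
    v_target = z1 - z0                           # constant velocity along that path
    return F.mse_loss(model(zt, t, context), v_target)

def masked_token_loss(model, tokens, context, mask_id, mask_ratio=0.5):
    """Discrete objective (sketch): mask a fraction of next-frame tokens and
    predict them with cross-entropy (MaskGIT-style unmasking)."""
    mask = torch.rand_like(tokens, dtype=torch.float) < mask_ratio
    inputs = tokens.masked_fill(mask, mask_id)
    logits = model(inputs, context)              # (B, N, vocab_size)
    return F.cross_entropy(logits[mask], tokens[mask])
```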
October 12, 2025 at 3:01 PM
Driving world models look good for a few frames, then they drift, blur, or freeze, especially when a turn or complex scene appears. These failures reveal a deeper issue: models aren’t capturing real dynamics. We introduce new metrics to measure such breakdowns.
October 12, 2025 at 2:53 PM
The question raised here is whether this approach is a generalist or a specialist that cannot transcend to the G-foundation state.
October 12, 2025 at 1:51 PM
I think HRM is quite great too. I would say they contributed the main idea (deep supervision) behind TRM.
October 12, 2025 at 1:51 PM
Transformers do not need to have something like "gradient descent" as an emergent property when it is kind of baked into the architecture.
October 12, 2025 at 1:50 PM