Nasim Rahaman
nasimrahaman.bsky.social
👨‍🍳 @ Tiptree Systems. Previously, deep learning & espresso slurping @ Mila and Max-Planck Institute for Intelligent Systems. Before that, physics and more deep learning @ Uni Heidelberg.

📍Berlin
I do wonder how much of that fat tail is lost in simulation, and what the downstream effects are. Model autophagy disorder is not something you’d want in models making life-or-death decisions.
November 7, 2025 at 6:46 AM
Imagine where we would be today if an overpowered Nazi Germany did not have US and USSR to counterbalance. Or just watch Man in the High Castle.
January 29, 2025 at 10:40 PM
And in that world, the quality of the weapon is how cheap it runs for how good it is.

I won’t be surprised if nation states eventually train and host their own models. Heck, some LLM shops seem to be betting on that, e.g. Mistral and Aleph Alpha with “European LLMs”.
January 17, 2025 at 10:56 PM
We’re approaching an era of memetic warfare where LLMs are the weapons. We’re not there yet — the values espoused by Chinese LLMs aren’t all that different from American ones — but that’s for now.

But once LLMs become our primary interface with the outside world, it’s bound to happen.
January 17, 2025 at 10:56 PM
Looks nice! Some FastAPI endpoints + a docker image should help adoption. :)
January 17, 2025 at 10:02 PM
This is fascinating work, congratulations!

Question: the point that architectural constraints (locality + equivariance) are sufficient is well demonstrated.

But do you think they are necessary? I.e., would you expect a diffusion transformer to learn these constraints on its own?
January 1, 2025 at 10:41 AM
In other words, code 1 is more “multi-agent” than others.

What do I mean when I say “agent”? A system that we’d like to abstract away like a black box (Rovelli’s definition). By that count, I see three in code 1, and one each in codes 2 and 3.
December 21, 2024 at 10:39 PM
Because code 1 is most explicit about the structure of the computational graph. :)

bsky.app/profile/nasi...
I’ll steelman the status quo understanding, even though I have beef with it.

Think of a multi-agent system as a computational graph (potentially with recurrence). The result of this graph is a composition of different “functions”, where each function can be thought of as an agent.
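A minimal sketch of this view, with toy stand-in functions (the names and strings here are illustrative, not from any real framework):

```python
from typing import Callable

# Each "agent" is just a function; the multi-agent system is their composition.
Agent = Callable[[str], str]

def researcher(query: str) -> str:
    # Stand-in for an LLM call that gathers facts.
    return f"facts about {query}"

def writer(facts: str) -> str:
    # Stand-in for an LLM call that drafts text from the facts.
    return f"draft from {facts}"

def critic(draft: str) -> str:
    # Stand-in for an LLM call that revises the draft.
    return f"revised {draft}"

def system(query: str) -> str:
    # The "multi-agent system" is the composition critic ∘ writer ∘ researcher,
    # i.e. one path through the computational graph.
    return critic(writer(researcher(query)))
```

Each node here is a black box in Rovelli’s sense: the composition only cares about inputs and outputs, not what happens inside.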
December 21, 2024 at 10:29 PM
If I were to force an answer, I’d say code 1 (prompt chaining) has more agent energy than the others.
December 20, 2024 at 11:35 PM
Have the token-level LLM predict “concept tokens”. The hidden states for these tokens go into an adapter, and out come concept representations. Concept tokens attend to previous concept tokens, and perhaps also the span between themselves and the previous concept token.
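A toy sketch of the two pieces, under my own assumptions (a linear adapter, a hand-built attention mask; sizes and names are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_concept = 8, 4
W_adapter = rng.standard_normal((d_model, d_concept))  # toy linear adapter

seq_len = 6
is_concept = np.array([0, 0, 1, 0, 0, 1], dtype=bool)  # which positions are concept tokens
hidden = rng.standard_normal((seq_len, d_model))       # LLM hidden states

# Adapter: hidden states at concept positions -> concept representations.
concept_reprs = hidden[is_concept] @ W_adapter         # shape: (num_concepts, d_concept)

def concept_attention_mask(is_concept: np.ndarray) -> np.ndarray:
    """mask[i, j] = True iff concept position i may attend to position j."""
    n = len(is_concept)
    mask = np.zeros((n, n), dtype=bool)
    concept_positions = np.flatnonzero(is_concept)
    prev = -1
    for c in concept_positions:
        # Attend to all earlier concept tokens...
        mask[c, concept_positions[concept_positions < c]] = True
        # ...and to the span since the previous concept token (incl. itself).
        mask[c, prev + 1 : c + 1] = True
        prev = c
    return mask

mask = concept_attention_mask(is_concept)
```

With concept tokens at positions 2 and 5, position 5 gets to see position 2 (an earlier concept token) plus the span 3–5, while ordinary token rows stay unmasked-out here since only concept rows are filled in.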
December 14, 2024 at 5:43 AM
See paper for more.

+ Alejandro is at NeurIPS and figuring out where to do his PhD. Wink wink nudge nudge.

www.linkedin.com/in/alejandro...
Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art
While AI models have demonstrated remarkable capabilities in constrained domains like game strategy, their potential for genuine creativity in open-ended domains like art remains debated. We explore t...
arxiv.org
December 13, 2024 at 9:41 AM
Results? Good stuff.

🧵⤵️
December 13, 2024 at 9:41 AM
The idea is to model two things:
(a) if concepts fit together to make good art, and
(b) if people have already thought about that combination of concepts (“cognitive availability”).

Seek out the combos for which (a) is true but (b) isn’t, and ask a text-to-image model to render that.
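The selection step can be sketched like so; the scores below are toy stand-ins for the paper’s two learned models, and the concept names and thresholds are invented for illustration:

```python
import itertools

# score_fit(a, b)   ~ "do these concepts make good art together?"
# score_avail(a, b) ~ "have people already explored this combination?"
concepts = ["jellyfish", "cathedral", "circuit board", "sunset"]

fit = {("jellyfish", "cathedral"): 0.9, ("circuit board", "sunset"): 0.8,
       ("jellyfish", "sunset"): 0.7}
avail = {("jellyfish", "sunset"): 0.9, ("circuit board", "sunset"): 0.2,
         ("jellyfish", "cathedral"): 0.1}

def alien_combos(pairs, fit, avail, fit_min=0.75, avail_max=0.3):
    # Keep pairs the fit model likes (a) but the availability
    # model says humans have not already explored (not b).
    return [p for p in pairs
            if fit.get(p, 0.0) >= fit_min and avail.get(p, 1.0) <= avail_max]

pairs = list(itertools.combinations(concepts, 2))
selected = alien_combos(pairs, fit, avail)
# Each selected pair would then become a prompt for the text-to-image model.
```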

🧵⤵️
December 13, 2024 at 9:41 AM