Nasim Rahaman
nasimrahaman.bsky.social
👨‍🍳 @ Tiptree Systems. Previously, deep learning & espresso slurping @ Mila and Max-Planck Institute for Intelligent Systems. Before that, physics and more deep learning @ Uni Heidelberg.

📍Berlin
I do wonder how much of that fat tail is lost in simulation, and what the downstream effects are. Model autophagy disorder is not something you’d want in models making life-or-death decisions.
November 7, 2025 at 6:46 AM
Imagine where we would be today if an overpowered Nazi Germany did not have US and USSR to counterbalance. Or just watch Man in the High Castle.
January 29, 2025 at 10:40 PM
And in that world, the quality of the weapon is how cheap it runs for how good it is.

I won’t be surprised if nation states eventually train and host their own models. Heck, some LLM shops seem to be betting on that, e.g. Mistral and Aleph Alpha with “European LLMs”.
January 17, 2025 at 10:56 PM
We’re approaching an era of memetic warfare where LLMs are the weapons. We’re not there yet — the values espoused by Chinese LLMs aren’t all that different from American ones — but that’s for now.

But once LLMs become our primary interface with the outside world, it’s bound to happen.
January 17, 2025 at 10:56 PM
Looks nice! Some FastAPI endpoints + a docker image should help adoption. :)
January 17, 2025 at 10:02 PM
This is fascinating work, congratulations!

Question: the point that architectural constraints (locality + equivariance) are sufficient is well demonstrated.

But do you think they are necessary? I.e., would you expect a diffusion transformer to learn these constraints on its own?
January 1, 2025 at 10:41 AM
In other words, code 1 is more “multi-agent” than others.

What do I mean when I say “agent”? A system that we’d like to abstract away like a black box (Rovelli’s definition). By that count, I see three in code 1, and one each in codes 2 and 3.
December 21, 2024 at 10:39 PM
Because code 1 is most explicit about the structure of the computational graph. :)

bsky.app/profile/nasi...
I’ll steelman the status quo understanding, even though I have beef with it.

Think of a multi-agent system as a computational graph (potentially with recurrence). The result of this graph is a composition of different “functions”, where each function can be thought of as an agent.
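A minimal sketch of this view, with toy stand-in functions (the names and strings here are illustrative, not from any real framework):

```python
from typing import Callable

# Each "agent" is just a function; the multi-agent system is their composition.
Agent = Callable[[str], str]

def researcher(query: str) -> str:
    # Stand-in for an LLM call that gathers facts.
    return f"facts about {query}"

def writer(facts: str) -> str:
    # Stand-in for an LLM call that drafts text from the facts.
    return f"draft from {facts}"

def critic(draft: str) -> str:
    # Stand-in for an LLM call that revises the draft.
    return f"revised {draft}"

def system(query: str) -> str:
    # The "multi-agent system" is the composition critic ∘ writer ∘ researcher,
    # i.e. one path through the computational graph.
    return critic(writer(researcher(query)))
```

Each node here is a black box in Rovelli’s sense: the composition only cares about inputs and outputs, not what happens inside.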
December 21, 2024 at 10:29 PM
If I were to force an answer, I’d say code 1 (prompt chaining) has more agent energy than the others.
December 20, 2024 at 11:35 PM
Have the token-level LLM predict “concept tokens”. The hidden states for these tokens go into an adapter, and out come concept representations. Concept tokens attend to previous concept tokens, and perhaps also the span between themselves and the previous concept token.
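A toy sketch of the two pieces, under my own assumptions (a linear adapter, a hand-built attention mask; sizes and names are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_concept = 8, 4
W_adapter = rng.standard_normal((d_model, d_concept))  # toy linear adapter

seq_len = 6
is_concept = np.array([0, 0, 1, 0, 0, 1], dtype=bool)  # which positions are concept tokens
hidden = rng.standard_normal((seq_len, d_model))       # LLM hidden states

# Adapter: hidden states at concept positions -> concept representations.
concept_reprs = hidden[is_concept] @ W_adapter         # shape: (num_concepts, d_concept)

def concept_attention_mask(is_concept: np.ndarray) -> np.ndarray:
    """mask[i, j] = True iff concept position i may attend to position j."""
    n = len(is_concept)
    mask = np.zeros((n, n), dtype=bool)
    concept_positions = np.flatnonzero(is_concept)
    prev = -1
    for c in concept_positions:
        # Attend to all earlier concept tokens...
        mask[c, concept_positions[concept_positions < c]] = True
        # ...and to the span since the previous concept token (incl. itself).
        mask[c, prev + 1 : c + 1] = True
        prev = c
    return mask

mask = concept_attention_mask(is_concept)
```

With concept tokens at positions 2 and 5, position 5 gets to see position 2 (an earlier concept token) plus the span 3–5, while ordinary token rows stay unmasked-out here since only concept rows are filled in.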
December 14, 2024 at 5:43 AM
See paper for more.

+ Alejandro is at NeurIPS and figuring out where to do his PhD. Wink wink nudge nudge.

www.linkedin.com/in/alejandro...
Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art
While AI models have demonstrated remarkable capabilities in constrained domains like game strategy, their potential for genuine creativity in open-ended domains like art remains debated. We explore t...
arxiv.org
December 13, 2024 at 9:41 AM
Results? Good stuff.

🧵⤵️
December 13, 2024 at 9:41 AM
The idea is to model two things:
(a) if concepts fit together to make good art, and
(b) if people have already thought about that combination of concepts (“cognitive availability”).

Seek out the combos for which (a) is true but (b) isn’t, and ask a text-to-image model to render that.
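The selection step can be sketched like so; the scores below are toy stand-ins for the paper’s two learned models, and the concept names and thresholds are invented for illustration:

```python
import itertools

# score_fit(a, b)   ~ "do these concepts make good art together?"
# score_avail(a, b) ~ "have people already explored this combination?"
concepts = ["jellyfish", "cathedral", "circuit board", "sunset"]

fit = {("jellyfish", "cathedral"): 0.9, ("circuit board", "sunset"): 0.8,
       ("jellyfish", "sunset"): 0.7}
avail = {("jellyfish", "sunset"): 0.9, ("circuit board", "sunset"): 0.2,
         ("jellyfish", "cathedral"): 0.1}

def alien_combos(pairs, fit, avail, fit_min=0.75, avail_max=0.3):
    # Keep pairs the fit model likes (a) but the availability
    # model says humans have not already explored (not b).
    return [p for p in pairs
            if fit.get(p, 0.0) >= fit_min and avail.get(p, 1.0) <= avail_max]

pairs = list(itertools.combinations(concepts, 2))
selected = alien_combos(pairs, fit, avail)
# Each selected pair would then become a prompt for the text-to-image model.
```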

🧵⤵️
December 13, 2024 at 9:41 AM