Lightnews — Scholar-powered news

BenjMurrell

@benjmurrell.bsky.social

We have antibody sequences (which you can read like text - maybe I'll pull out some sample trajectories and drop them in here) and we're getting started with text proper. Images are a less natural fit because of their grid structure!

November 10, 2025 at 1:33 PM

BenjMurrell

@benjmurrell.bsky.social

Remind me to show you the Twitter screenshot someone pinned on the wall in my office after my first Omicron neut thread 🤣

November 10, 2025 at 12:55 PM

BenjMurrell

@benjmurrell.bsky.social

Ah @hedwignordlinder.bsky.social is on here too!

November 10, 2025 at 9:34 AM

BenjMurrell

@benjmurrell.bsky.social

Oh and as usual, these visualizations (which are, to us, key to understanding these processes) were made by @antonoresten.bsky.social in @makie.org (@simi.bsky.social).

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

With my wonderful lab, who mostly aren't on here (except @lukasbillera.bsky.social and @antonoresten.bsky.social ?), we've been tinkering in this space since the end of the summer, but we think this is just too cool to sit on any longer.

The manuscript should be up by tomorrow and I'll drop a link.

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

We've got a flexible implementation in Julia (github.com/MurrellGroup...) that uses our Flowfusion.jl package, so you can compose families of base flows with the branching and deletion process. If there is community interest, we can set up a Python implementation as well.

GitHub - MurrellGroup/BranchingFlows.jl


November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

E.g. here is a protein example where the model designs two domains with an intervening linker (light blue chain in the video). A regular flow model would need to know, early on, exactly how many AAs are needed in the linker, but Branching Flows can decide on-the-fly and grow or shrink it.

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

We dislike ad hoc padding in discrete diffusion models, and the situation is even worse in continuous domains. Branching Flows removes this blemish. We also expect that this makes the trajectories easier to learn.

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

Given that all of the deletions, trees, anchors, etc. are in Z, which doesn't interact with the theory, it is easy to manipulate the trajectories the model is learning.

Autocorrelated insertions? Change how you build the trees! Same for the "anchors" which control the process on internal branches.

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

You can see the branching pattern in the static plots "with trails", and if you stare at them you can see where lineages delete too.

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

The process can be seen clearly with a QM9-trained model (continuous atom positions, discrete atom types), starting from a single atom.

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

Side note: this is heavily inspired by the processes from phylogenetics (the other thing my lab works on).

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

The trick that makes it tractable to learn, e.g. for continuous states, is that branching events are not generic insertions into space (which, we think, you need for TD jumps; @arnauddoucet.bsky.social ?). Instead, they duplicate the state that is branching.
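
A toy illustration of the "duplicate, don't insert" idea, not taken from BranchingFlows.jl: the state is a plain Python list of floats, and the helper names `branch` and `delete` are hypothetical. The point is only that a branching event needs no new value to be sampled, because the new element starts as a copy of its parent.

```python
def branch(state, i):
    """Branching event: element i is duplicated in place.

    No new value is drawn; both copies start at the parent's state and
    are free to diverge afterwards under the base flow.
    """
    return state[: i + 1] + [state[i]] + state[i + 1 :]


def delete(state, i):
    """Deletion event: element i is removed from the state."""
    return state[:i] + state[i + 1 :]


state = [0.1, 0.5]
state = branch(state, 0)   # the first element splits into two copies
state = delete(state, 2)   # the last element is deleted
```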

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

Then on the internal nodes of the trees, we place "anchor" states (same space as the X1 elements). We put all that in Z.

Then, given this Z, Xt evolves over the trees, sampling when (but not which) branching and deletion events occur, all constructed to terminate at X1.
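
A rough sketch of "when, but not which": the tree in Z already fixes which lineage splits into which, so only the event times are random. Everything here is an assumption made for illustration, including the nested-list tree encoding, the `DEL` marker, and especially the uniform time distribution, which is a placeholder rather than the distribution used in the actual method.

```python
import random

DEL = "DEL"  # hypothetical marker for a to-be-deleted leaf


def sample_event_times(tree, t_start, rng, path="root", out=None):
    """Assign a time in [t_start, 1] to every event of a fixed tree.

    Internal nodes are two-element lists [left, right]; leaves are strings.
    An event on a child lineage is never earlier than its parent's split.
    """
    if out is None:
        out = {}
    if isinstance(tree, list):
        t_split = rng.uniform(t_start, 1.0)  # placeholder distribution
        out[path] = ("branch", t_split)
        sample_event_times(tree[0], t_split, rng, path + ".L", out)
        sample_event_times(tree[1], t_split, rng, path + ".R", out)
    elif tree == DEL:
        out[path] = ("delete", rng.uniform(t_start, 1.0))
    return out


tree = [["a", DEL], "c"]  # the topology is part of Z, so it is fixed
events = sample_event_times(tree, 0.0, random.Random(1))
```

Re-running with a different seed changes the times but never the set of events.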

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

With Branching Flows, we draw X1 from the data. Then we draw X0 (one element? many elements? it all works). Then we add "to-be-deleted" nodes into X1. Then we draw a forest of trees, one per X0 element, that maps (one to many) the X0 elements to all the X1s (plus to-be-deleteds).
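
The recipe above can be sketched roughly as follows. This is a toy under stated assumptions, not the BranchingFlows.jl implementation: the function names, the `("keep", x)` / `("delete", None)` leaf tags, the choice to give each X0 element a contiguous block of leaves, and the uniformly random binary trees are all hypothetical.

```python
import random

DELETED = ("delete", None)  # hypothetical tag for a to-be-deleted leaf


def add_to_be_deleted(x1, n_del, rng):
    """Tag the X1 elements as kept, then insert n_del deleted leaves at random positions."""
    leaves = [("keep", x) for x in x1]
    for _ in range(n_del):
        leaves.insert(rng.randint(0, len(leaves)), DELETED)
    return leaves


def random_binary_tree(leaves, rng):
    """Random binary tree over an ordered block of leaves (internal nodes are lists)."""
    if len(leaves) == 1:
        return leaves[0]
    cut = rng.randint(1, len(leaves) - 1)
    return [random_binary_tree(leaves[:cut], rng),
            random_binary_tree(leaves[cut:], rng)]


def sample_forest(x1, n_roots, n_del, rng):
    """One tree per X0 element; together the trees cover every leaf exactly once."""
    leaves = add_to_be_deleted(x1, n_del, rng)
    cuts = sorted(rng.sample(range(1, len(leaves)), n_roots - 1))
    bounds = [0] + cuts + [len(leaves)]
    return [random_binary_tree(leaves[a:b], rng) for a, b in zip(bounds, bounds[1:])]


def leaves_of(tree):
    """Flatten a tree back into its ordered list of leaves."""
    if isinstance(tree, tuple):
        return [tree]
    return leaves_of(tree[0]) + leaves_of(tree[1])


forest = sample_forest(list("ACDEFG"), n_roots=2, n_del=2, rng=random.Random(0))
```

Changing `random_binary_tree` or the way leaves are split among roots changes the trajectories without touching anything else.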

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

Typically Z will include X0 from an easy-to-sample distribution and X1 from the data distribution, and if you're feeling fancy you might couple them ( @alextong.bsky.social ). But what we learned while working on Branching Flows is that Z is a *playground*.

November 10, 2025 at 9:10 AM

BenjMurrell

@benjmurrell.bsky.social

How does this work? First, you need to understand the amazing Generator Matching (arxiv.org/abs/2410.20587). In GM you first sample Z, and then construct a stochastic process Xt that, conditioned on Z, terminates at the data distribution.
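
For intuition, here is the simplest possible instance of that recipe: a one-dimensional toy where Z = (X0, X1) and the conditional process is a straight-line flow. The function names and the linear path are illustrative assumptions (this is the familiar flow-matching special case), not the construction used in Branching Flows.

```python
import random


def sample_z(data, rng):
    """Z = (x0, x1): x0 from an easy-to-sample prior, x1 from the data."""
    x0 = rng.gauss(0.0, 1.0)
    x1 = rng.choice(data)
    return x0, x1


def conditional_state(z, t):
    """X_t given Z: a straight line from x0 at t=0 to x1 at t=1."""
    x0, x1 = z
    return (1.0 - t) * x0 + t * x1


def conditional_velocity(z, t):
    """Generator of the conditional process; here a constant velocity.

    A model would be trained to regress this from (X_t, t) alone,
    without seeing Z.
    """
    x0, x1 = z
    return x1 - x0


rng = random.Random(0)
z = sample_z([2.0, -1.0, 0.5], rng)
x_half = conditional_state(z, 0.5)
```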