Jonas Spinner
@jonasspinner.bsky.social
PhD student working on machine learning for high energy physics. Interested in equivariant architectures and generative modelling.
Data augmentation (DA) emerges from LLoCa as the special case of random global frames, enabling a fair comparison between equivariance and augmentation. Equivariance excels in large-data regimes due to greater expressivity, while augmentation wins in the small-data regime.
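For intuition, this is what "a random global frame" amounts to in code (a minimal sketch, not the LLoCa implementation; the rotation construction and boost range are illustrative assumptions):

import torch

def random_lorentz_transform(max_beta=0.5):
    # Random 3D rotation (via QR of a Gaussian matrix) composed with a
    # boost along z of random velocity beta < max_beta (illustrative choice).
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]
    rot = torch.eye(4)
    rot[1:, 1:] = q
    beta = max_beta * torch.rand(())
    gamma = 1.0 / torch.sqrt(1.0 - beta**2)
    boost = torch.eye(4)
    boost[0, 0] = boost[3, 3] = gamma
    boost[0, 3] = boost[3, 0] = -gamma * beta
    return boost @ rot

def augment(event):
    # event: (n_particles, 4) four-momenta (E, px, py, pz);
    # data augmentation = one random *global* frame per event.
    return event @ random_lorentz_transform().T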
6/6
June 2, 2025 at 9:43 AM
We create LLoCa-ParticleNet and LLoCa-ParT, Lorentz-equivariant versions of the established non-equivariant ParticleNet and ParT. The LLoCa variants consistently improve performance but are 2× slower. Interestingly, we find that a simple LLoCa-Transformer matches the LLoCa-ParT performance.
5/6
June 2, 2025 at 9:43 AM
Existing Lorentz-equivariant architectures like LorentzNet, PELICAN, and L-GATr rely on specialized layers for internal representations, limiting architectural choice and often requiring significant extra compute. LLoCa achieves similar (SOTA) performance while being 4× faster and more flexible.
4/6
June 2, 2025 at 9:43 AM
All in all, it takes two steps to make your architecture Lorentz-equivariant:
(1) use a small network that equivariantly predicts local frames, and express inputs in these local frames.
(2) add frame-to-frame transformations in the message passing (or attention) of your backbone architecture (see the sketch below).
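A minimal sketch of both steps, assuming the per-particle frames have already been predicted by a small equivariant network (not shown) and that features are plain four-vectors; LLoCa itself supports general representations:

import torch

ETA = torch.diag(torch.tensor([1.0, -1.0, -1.0, -1.0]))  # Minkowski metric

def to_local_frames(p, frames):
    # p: (n, 4) four-momenta, frames: (n, 4, 4) per-particle Lorentz
    # matrices. Step (1): express each particle in its own frame,
    # x_i = L_i p_i, which makes the features Lorentz-invariant.
    return torch.einsum('nij,nj->ni', frames, p)

def frame_to_frame(frames):
    # Step (2): T_ij = L_i L_j^{-1} carries features from particle j's
    # frame to particle i's frame; for a Lorentz matrix, L^{-1} = eta L^T eta.
    inv = ETA @ frames.transpose(-1, -2) @ ETA
    return torch.einsum('aij,bjk->abik', frames, inv)    # (n, n, 4, 4)

def message_passing(x, frames):
    # Toy sum aggregation: transform neighbour features into the
    # receiver's frame before aggregating (not the LLoCa backbone).
    T = frame_to_frame(frames)
    msgs = torch.einsum('abij,bj->abi', T, x)            # messages j -> i
    return msgs.sum(dim=1)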
3/6
June 2, 2025 at 9:43 AM
LLoCa assigns equivariantly predicted local reference frames to each particle, making their features invariant so that we can process them with any backbone architecture. Because messages are transformed between local frames, the approach supports general internal representations.
2/6
June 2, 2025 at 9:43 AM
The DiscFormer training is similar to GANs, but requires neither joint training nor a back-and-forth between classifier and generator. Unfortunately, we did not get it to consistently improve upon standard likelihood training after working on it for over a year...
7/7
December 19, 2024 at 12:45 PM
Finally, an interesting but null result:
Appendix A is on a novel way to amplify likelihood training with classifier reweighting, aka DiscFormer. To avoid a classifier unweighting step after training, we reweight training data to increase the difference between model and data, aka DiscFormation.
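For background, the generic classifier-reweighting step reads roughly as follows (a sketch under standard assumptions, not the DiscFormer/DiscFormation recipe itself):

import torch

def likelihood_ratio_weights(classifier, x):
    # classifier: network returning logits, trained to separate
    # data (label 1) from model samples (label 0). For a sigmoid
    # classifier, D/(1-D) = exp(logit) estimates p_data(x)/p_model(x).
    with torch.no_grad():
        return torch.exp(classifier(x))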
6/7
December 19, 2024 at 12:45 PM
We try bootstrapping and two modified loss functions to tackle this task. We find that all three methods generate significantly more events with 8 jets. Plus, they get the kinematics correct at the level of the statistical uncertainty in the training data. Yay!
5/7
December 19, 2024 at 12:45 PM
However, we find that events with 8 jets are much less likely to be generated. Can we find a way to modify the training process to increase the fraction of events with many jets?
4/7
December 19, 2024 at 12:45 PM
We train an autoregressive transformer on events with up to 6 jets. The model does not learn the multiplicity distribution perfectly, so it also generates a few accidental 7-jet events. This happens rarely, but we find that these events roughly have the correct kinematic distributions.
3/7
December 19, 2024 at 12:45 PM
QCD jet radiation follows a universal scaling pattern, reflecting the collinear factorization of matrix element and phase space. Later parts of the simulation chain violate this universality, but it remains approximately valid, manifesting in the staircase scaling of jet multiplicities.
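Concretely, staircase scaling means the exclusive n-jet rates fall off geometrically, sigma_{n+1}/sigma_n ≈ const, so a quick check on generated events looks like this (sketch):

import numpy as np

def staircase_ratios(jet_counts):
    # jet_counts[n] = number of exclusive n-jet events; roughly
    # constant ratios indicate staircase scaling.
    counts = np.asarray(jet_counts, dtype=float)
    return counts[1:] / counts[:-1]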
2/7
December 19, 2024 at 12:45 PM
Reposted by Jonas Spinner
On Thursday from 11:00 to 14:00, I'll be cheering on @jonasspinner.bsky.social and Victor Bresó at poster 3911.

They built L-GATr 🐊: a transformer that's equivariant to the Lorentz symmetry of special relativity. It performs remarkably well across different tasks in high-energy physics.

2/6
December 11, 2024 at 5:15 AM
Thanks to the L-GATr team: Victor Breso, Pim de Haan, Tilman Plehn, Huilin Qu, Jesse Thaler and @johannbrehmer.bsky.social

Looking forward to exciting discussions at NeurIPS!
November 25, 2024 at 3:27 PM
We train continuous normalizing flows (CNFs) with Riemannian flow matching and several choices for the vector field architecture, and compare them with our autoregressive density estimator 'JetGPT'. CNFs turn out to be more data-efficient, and making them equivariant also helps.
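For reference, the flow-matching objective in its plain Euclidean form (a sketch; Riemannian flow matching replaces the straight-line path by geodesics, and the vector-field network v_net is left abstract here):

import torch

def flow_matching_loss(v_net, x1):
    # x1: batch of data points; x0: samples from the Gaussian base.
    x0 = torch.randn_like(x1)
    t = torch.rand(x1.shape[0], 1)          # times uniform in [0, 1]
    x_t = (1 - t) * x0 + t * x1             # linear interpolation path
    target = x1 - x0                        # time derivative of the path
    return ((v_net(x_t, t) - target) ** 2).mean()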

6/7
November 25, 2024 at 3:27 PM
For the first time, we have trained a Lorentz-equivariant architecture on a real-world tagging dataset (JetClass = 100M jets). We find the hierarchy GNN < transformer < Lorentz-equivariant transformer, indicating that equivariance also matters at scale.

5/7
November 25, 2024 at 3:27 PM
We implement the L-GATr attention as a multiplicative list of signs for the queries in the inner product, and then use off-the-shelf attention kernels. With this trick, L-GATr scales to many tokens like standard transformers.
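Roughly, the trick looks like this (a sketch; the exact sign pattern over the multivector components is an assumption here):

import torch.nn.functional as F

def lorentz_attention(q, k, v, signs):
    # signs: (+1/-1) tensor over the feature dimension, encoding the
    # metric signature of the invariant inner product. Folding it into
    # the queries turns that inner product into a plain dot product,
    # so any off-the-shelf attention kernel can be used.
    return F.scaled_dot_product_attention(q * signs, k, v)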

4/7
November 25, 2024 at 3:27 PM
To build L-GATr, we replace each transformer module with a version that processes geometric algebra objects in a Lorentz-equivariant way. Plus, geometric algebra comes with an additional operation, the geometric product, which lets us add an extra layer.
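The geometric product is a fixed bilinear map, so as a layer it reduces to a contraction with precomputed structure constants (a sketch; the actual (16, 16, 16) coefficients of the spacetime algebra are not spelled out here):

import torch

def geometric_product(x, y, structure_constants):
    # x, y: (..., 16) multivectors of the spacetime geometric algebra;
    # structure_constants: (16, 16, 16) tensor defining the product,
    # precomputed once from the algebra (assumed given).
    return torch.einsum('ijk,...i,...j->...k', structure_constants, x, y)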

3/7
November 25, 2024 at 3:27 PM
The Lorentz-Equivariant Geometric Algebra Transformer (L-GATr) uses spacetime geometric algebra to represent particles at the LHC in a Lorentz-equivariant way and processes them with a transformer architecture, combining the benefits of Lorentz and permutation equivariance.

2/7
November 25, 2024 at 3:27 PM