Jonas Spinner
@jonasspinner.bsky.social
PhD student working on machine learning for high energy physics. Interested in equivariant architectures and generative modelling.
Data augmentation (DA) emerges from LLoCa as the special case of random global frames, enabling a fair comparison between equivariance and augmentation. Equivariance excels in the large-data regime thanks to its greater expressivity, while augmentation wins when data is scarce.
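For illustration, a minimal NumPy sketch of what the random-global-frame special case boils down to (not the LLoCa code; function names are ours, and we only boost along one axis for brevity): sample one random rotation plus boost per event and apply it to every particle.

import numpy as np

def random_lorentz_transform(rng, max_rapidity=1.0):
    # Random proper 3D rotation via QR decomposition of a Gaussian matrix.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))        # fix the sign ambiguity of QR
    if np.linalg.det(q) < 0:           # enforce det = +1
        q[:, 0] *= -1
    rot = np.eye(4)
    rot[1:, 1:] = q

    # Random boost along the z-axis with rapidity eta.
    eta = rng.uniform(-max_rapidity, max_rapidity)
    boost = np.eye(4)
    boost[0, 0] = boost[3, 3] = np.cosh(eta)
    boost[0, 3] = boost[3, 0] = np.sinh(eta)
    return boost @ rot

def augment_event(four_momenta, rng):
    # four_momenta: (n_particles, 4) as (E, px, py, pz); same transform for all particles.
    return four_momenta @ random_lorentz_transform(rng).T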
6/6
June 2, 2025 at 9:43 AM
LLoCa assigns an equivariantly predicted local reference frame to each particle, making its features invariant so that they can be processed with any backbone architecture. The approach supports general internal representations through the way messages are transformed between local frames.
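Roughly, in code (an illustrative NumPy sketch for four-vector features only; real LLoCa handles general tensor representations, and the names here are ours):

import numpy as np

def to_local_frame(frames, vectors):
    # frames: (n, 4, 4) predicted Lorentz matrices, one per particle;
    # vectors: (n, 4) four-vector features in the global frame.
    # Expressing each particle's features in its own frame makes them invariant.
    return np.einsum('nij,nj->ni', frames, vectors)

def transform_message(frame_sender, frame_receiver, message):
    # Re-express a four-vector message from the sender's local frame in the
    # receiver's local frame before aggregation: Lambda_rec @ Lambda_send^{-1}.
    return frame_receiver @ np.linalg.inv(frame_sender) @ message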
2/6
June 2, 2025 at 9:43 AM
DiscFormer training is similar to GANs, but requires neither joint training nor a back-and-forth between classifier and generator. Unfortunately, we did not get it to consistently improve upon standard likelihood training after working on it for over a year...
7/7
December 19, 2024 at 12:45 PM
Finally, an interesting but null result:
Appendix A presents a novel way to amplify likelihood training with classifier reweighting, aka DiscFormer. To avoid a classifier unweighting step after training, we instead reweight the training data to increase the difference between model and data, aka DiscFormation.
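For context, a generic classifier-reweighting sketch (PyTorch, illustrative; not the exact DiscFormer/DiscFormation training loop): a classifier separating data from generated events yields per-event weights approximating the likelihood ratio.

import torch

@torch.no_grad()
def likelihood_ratio_weights(classifier, events):
    # classifier(events) returns logits for "is data"; then
    # w = D / (1 - D) approximates p_data / p_model per event.
    d = torch.sigmoid(classifier(events))
    return d / (1.0 - d)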
6/7
December 19, 2024 at 12:45 PM
We try bootstrapping and two modified loss functions to tackle this task. We find that all three methods generate significantly more events with 8 jets. Plus, they get the kinematics correct at the level of the statistical uncertainty of the training data. Yay!
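One plausible reading of the bootstrapping idea, as a Python sketch (hypothetical interface, not necessarily the procedure in the paper; model.sample() and event.n_jets are assumptions): feed the model's own rare high-multiplicity samples back into the training set.

def bootstrap_training_set(real_events, model, n_samples, min_jets=7):
    # Sample from the current model, keep the rare high-multiplicity events,
    # and mix them into the training data for the next round of training.
    generated = [model.sample() for _ in range(n_samples)]
    extra = [event for event in generated if event.n_jets >= min_jets]
    return real_events + extra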
5/7
December 19, 2024 at 12:45 PM
However, we find that events with 8 jets are much less likely to be generated. Can we find a way to modify the training process to increase the fraction of events with many jets?
4/7
December 19, 2024 at 12:45 PM
We train an autoregressive transformer on events with up to 6 jets. The model does not learn the multiplicity distribution perfectly, so it also generates a few accidental 7-jet events. This happens rarely, but we find that these events have roughly the correct kinematic distributions.
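Schematically, sampling works like in any autoregressive language model (PyTorch sketch; the model interface, tokenization and stop token are assumptions, not the JetGPT API): the event ends when a stop token is emitted, so nothing hard-codes the maximum multiplicity seen in training.

import torch

@torch.no_grad()
def sample_event(model, start_token, stop_token, max_particles=64):
    tokens = [start_token]
    for _ in range(max_particles):
        logits = model(torch.tensor(tokens).unsqueeze(0))[0, -1]   # next-token logits
        nxt = torch.multinomial(torch.softmax(logits, dim=-1), 1).item()
        if nxt == stop_token:
            break
        tokens.append(nxt)
    return tokens[1:]   # generated particle tokens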
3/7
December 19, 2024 at 12:45 PM
QCD jet radiation follows a universal scaling pattern, reflecting the collinear factorization of matrix element and phase space. Later parts of the simulation chain violate this universality, but it remains approximately valid, manifesting in the staircase scaling of jet multiplicities.
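Concretely, staircase scaling says the ratios of exclusive n-jet rates are roughly constant, sigma_{n+1}/sigma_n ≈ R, so the multiplicity distribution falls off geometrically. A quick check on generated events could look like this (illustrative NumPy):

import numpy as np

def staircase_ratios(counts_per_multiplicity):
    # counts_per_multiplicity[n]: number of exclusive n-jet events.
    # Staircase scaling predicts roughly constant ratios c[n+1] / c[n].
    c = np.asarray(counts_per_multiplicity, dtype=float)
    return c[1:] / c[:-1]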
2/7
December 19, 2024 at 12:45 PM
We train continuous normalizing flows (CNFs) with Riemannian flow matching and several choices for the vector field architecture, and compare them with our autoregressive density estimator 'JetGPT'. CNFs turn out to be more data-efficient, and making them equivariant also helps.
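For orientation, the flat-space conditional flow matching objective looks like this (PyTorch sketch; the Riemannian version used in the paper generalizes the straight-line path to curved phase-space geometries, and the vector_field signature is an assumption):

import torch

def flow_matching_loss(vector_field, x1):
    x0 = torch.randn_like(x1)          # base samples
    t = torch.rand(x1.shape[0], 1)     # one time per event
    xt = (1 - t) * x0 + t * x1         # point on the straight interpolation path
    target = x1 - x0                   # time derivative of that path
    pred = vector_field(xt, t)
    return ((pred - target) ** 2).mean()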

6/7
November 25, 2024 at 3:27 PM
For the first time, we have trained a Lorentz-equivariant architecture on a real-world tagging dataset (JetClass = 100M jets). We find the hierarchy GNN < transformer < Lorentz-equivariant transformer, indicating that equivariance also matters at scale.

5/7
November 25, 2024 at 3:27 PM
We implement the L-GATr attention as a multiplicative list of signs for the queries in the inner product, and then use off-the-shelf attention kernels. With this trick, L-GATr scales to many tokens just like standard transformers.
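In pseudocode the trick is tiny (PyTorch sketch; the actual sign pattern depends on the multivector basis and is an assumption here): absorb the ±1 metric signs of the invariant inner product into the queries, then call a standard fused attention kernel.

import torch.nn.functional as F

def lgatr_style_attention(q, k, v, metric_signs):
    # q, k, v: (batch, heads, tokens, channels); metric_signs: (channels,) of +-1.
    # <q, k> = sum_i s_i q_i k_i  ==  (q * s) . k, so flipping query signs lets us
    # reuse off-the-shelf scaled-dot-product attention.
    return F.scaled_dot_product_attention(q * metric_signs, k, v)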

4/7
November 25, 2024 at 3:27 PM
To build L-GATr, we replace each transformer module with a version that processes geometric algebra objects in a Lorentz-equivariant way. Plus, geometric algebra comes with a new operation, the geometric product, which lets us add an extra layer.
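For two four-vectors the geometric product is easy to write down (NumPy sketch with a (+,-,-,-) metric; sign and normalization conventions vary, and general multivectors need the full 16-dimensional product, which we skip): it splits into a Lorentz-invariant scalar part and a bivector part.

import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric

def geometric_product_vectors(u, v):
    # u v = u . v  (scalar: the Minkowski inner product)
    #     + u ^ v  (bivector: represented here as an antisymmetric 4x4 array)
    scalar = u @ ETA @ v
    bivector = np.outer(u, v) - np.outer(v, u)
    return scalar, bivector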

3/7
November 25, 2024 at 3:27 PM
The Lorentz-Equivariant Geometric Algebra Transformer (L-GATr) uses spacetime geometric algebra to process particles at the LHC in a Lorentz-equivariant way. The particles are fed through a transformer architecture, combining the benefits of Lorentz and permutation equivariance.

2/7
November 25, 2024 at 3:27 PM