lorenzvogel.bsky.social
@lorenzvogel.bsky.social
Spending the holidays teaching my little nephews some machine-learning basics — it's not that difficult; the Bayesian neural network (BNN) loss function follows a clear statistical logic
December 28, 2024 at 12:04 PM
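A minimal sketch of that statistical logic, assuming a heteroscedastic Gaussian likelihood (the concrete loss in the paper may differ; all numbers and names here are illustrative):

```python
import numpy as np

def heteroscedastic_nll(y, mu, sigma):
    """Negative log-likelihood of a Gaussian whose mean and width are both
    predicted by the network: minimizing it fits mu to the data and sigma
    to the actual spread of the residuals."""
    return 0.5 * np.log(2.0 * np.pi * sigma**2) + (y - mu)**2 / (2.0 * sigma**2)

y = np.array([1.0, 2.0, 3.0])
mu = np.array([1.1, 1.9, 3.2])   # residuals with RMS of about 0.14

# An overconfident sigma is penalized by the quadratic term, an inflated
# sigma by the log term, so the loss prefers a calibrated uncertainty.
loss_tight = heteroscedastic_nll(y, mu, 0.05).mean()
loss_calibrated = heteroscedastic_nll(y, mu, 0.15).mean()
loss_loose = heteroscedastic_nll(y, mu, 5.0).mean()
```

The calibrated width gives the lowest loss of the three, which is exactly the mechanism that lets the network learn its own uncertainty.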
This has become one of my most favorite (pre-)Christmas traditions: the annual #Glühwein workshop — this year at the Karlsruher Institut für Technologie (KIT), organized by Markus Klute (thank you!)
December 16, 2024 at 10:30 PM
Both pulls follow an approximate Gaussian shape in the center (as expected for stochastic or noisy data) — and both networks slightly overestimate the uncertainty, meaning that the per-cluster error is conservative
December 9, 2024 at 10:55 AM
We can also directly compare the individual uncertainty predictions from the Bayesian neural network (BNN) and the repulsive ensemble (RE) cluster-by-cluster — the two uncertainty predictions track each other well
December 9, 2024 at 10:54 AM
The #systematic uncertainty (part of the likelihood) approaches the same plateau as for the BNN when increasing the training-dataset size (green and brown curves) — and the #statistical uncertainty (induced by the repulsive force) again vanishes (red and blue curves)
December 9, 2024 at 10:54 AM
The idea is to determine uncertainties with an ensemble of simultaneously trained networks that are kept from collapsing onto the same best-fit parameters: a repulsive force makes the ensemble spread out and explore the loss landscape around the actual minimum
December 9, 2024 at 10:53 AM
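A toy sketch of the repulsive-ensemble idea on a one-parameter quadratic loss, with an RBF kernel supplying the repulsive force (bandwidth, learning rate, and ensemble size are illustrative choices, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_loss(theta):
    """Gradient of a toy quadratic loss landscape with its minimum at 2."""
    return 2.0 * (theta - 2.0)

def repulsive_force(thetas, bandwidth=0.5):
    """RBF-kernel gradient that pushes ensemble members away from each other."""
    force = np.zeros_like(thetas)
    for i, ti in enumerate(thetas):
        for tj in thetas:
            diff = ti - tj
            kernel = np.exp(-diff**2 / (2.0 * bandwidth**2))
            force[i] += kernel * diff / bandwidth**2
    return force / len(thetas)

# Five simultaneously trained "networks", each a single parameter.
thetas = rng.normal(0.0, 0.1, size=5)
for _ in range(300):
    thetas -= 0.05 * (grad_loss(thetas) - repulsive_force(thetas))

# The ensemble settles around the minimum but keeps a finite spread,
# exploring the loss landscape instead of collapsing onto one best fit.
```

Without the repulsion term, all five members would converge to the identical minimum and the ensemble spread would carry no uncertainty information.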
We see that these topo-clusters are all located in the tile-gap scintillator region: the tile-gap scintillator is not a regular calorimeter — the feature quality in this region is insufficient, so it is expected that the calibration there yields a large uncertainty
December 9, 2024 at 10:53 AM
An interesting question is what role the learned uncertainties can play in understanding the data... When looking at the uncertainty spectrum, we see a distinctive secondary maximum — what feature leads the BNN uncertainties to flag these topo-clusters?
December 9, 2024 at 10:53 AM
The total BNN uncertainty actually consists of two terms: (i) the "statistical uncertainty" (red curve) accounts for a lack of knowledge due to a limited amount of training data and vanishes in the limit of infinite training data, and...
December 9, 2024 at 10:51 AM
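Assuming the usual decomposition — the spread between weight samples as the statistical term, and the learned likelihood noise as the systematic term — the combination can be sketched like this (all numbers are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend we sampled the trained weight distributions 100 times; every
# sample predicts a central value mu and a learned noise sigma for each
# of three toy clusters.
mus = rng.normal(10.0, 0.2, size=(100, 3))   # samples x clusters
sigmas = np.full((100, 3), 0.5)              # learned per-cluster noise

sigma_stat = mus.std(axis=0)                     # spread between samples
sigma_syst = np.sqrt((sigmas**2).mean(axis=0))   # mean learned noise
sigma_tot = np.sqrt(sigma_stat**2 + sigma_syst**2)
```

With more training data the weight distributions narrow, so the samples agree and sigma_stat shrinks toward zero, while sigma_syst stays at its plateau.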
The improvements of the relative local energy resolution, evaluated as a function of the in-time (left) and out-of-time (right) pile-up activity, show a significant level of cluster-by-cluster pile-up mitigation when applying the ML-derived calibration
December 9, 2024 at 10:50 AM
Another performance measure is the relative energy resolution — again, the BNN is better over the whole energy range, and especially spectacular at low energies (it best learns the signal-source transition from inelastic hadronic interactions to ionisation-dominated signals)
December 9, 2024 at 10:49 AM
To evaluate the performance, we compare the BNN predictions to the target values in terms of the signal linearity (should peak at zero) — the BNN calibration performs significantly better than any of the other considered scales (with a significant precision gain at low energies)
December 9, 2024 at 10:49 AM
The network training can be described as constructing a #variational approximation, where we approximate the intractable posterior with a simplified and tractable distribution — to learn the variational posterior we minimize the Kullback-Leibler (KL) divergence
December 9, 2024 at 10:48 AM
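For a Gaussian variational posterior and a Gaussian prior, the per-weight KL divergence has a closed form; a small sketch (purely illustrative, not the paper's exact setup):

```python
import numpy as np

def kl_gaussians(mu_q, sig_q, mu_p, sig_p):
    """Closed-form KL(q || p) between one-dimensional Gaussians: the term
    minimized per weight when the intractable posterior is approximated
    by a tractable Gaussian variational posterior q against a prior p."""
    return (np.log(sig_p / sig_q)
            + (sig_q**2 + (mu_q - mu_p)**2) / (2.0 * sig_p**2)
            - 0.5)

# The divergence vanishes when q matches p and grows as q drifts away.
kl_match = kl_gaussians(0.0, 1.0, 0.0, 1.0)
kl_shifted = kl_gaussians(1.0, 1.0, 0.0, 1.0)
```

In the full loss this term acts as a regularizer on every weight distribution, alongside the data likelihood.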
BNN weights are not trained as fixed values; instead, the parameters are described by weight distributions — during inference, the learned weight distributions are sampled multiple times to generate an ensemble of networks, from which we construct the central value and the uncertainty
December 9, 2024 at 10:48 AM
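The sampling step can be sketched with a toy one-weight "network" (the real model is of course a full regression network; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def predict(x, w):
    """Toy one-weight network: the prediction is just w * x."""
    return w * x

# A learned weight distribution (mean and width) instead of a fixed value.
w_mu, w_sig = 1.5, 0.1
x = 4.0

# Sample the weight distribution to generate an ensemble of networks...
ensemble = [predict(x, rng.normal(w_mu, w_sig)) for _ in range(1000)]

# ...and build the central value and the uncertainty from the ensemble.
central = np.mean(ensemble)
uncertainty = np.std(ensemble)
```

Each draw of the weights is one member network; the ensemble mean is the calibrated prediction and its spread is the per-cluster uncertainty.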
#ML can be used to learn multi-dimensional continuous #calibration functions — we do this by training a #regression network (an uncertainty-aware BNN) to learn the "response" of single topo-clusters as a function over feature space using a properly defined minimization task
December 9, 2024 at 10:47 AM
The principal signals of the #ATLAS calorimeters are so-called "topo-clusters". The signals are calibrated to correctly measure the energy deposited by EM showers — but this means they provide no compensation for energy losses in the complex development of hadronic showers...
December 9, 2024 at 10:47 AM
Welcome to the machine!

After more than one year, the time has finally come: the @uniheidelberg.bsky.social non-ATLAS HEP-ML group has published its first (preprint) paper in collaboration with the @atlasexperiment.bsky.social

arxiv.org/abs/2412.04370
inspirehep.net/literature/2...
December 9, 2024 at 10:44 AM