@lorenzvogel.bsky.social
If you want to read more about uncertainty-aware neural networks, I highly recommend our #paper on machine-learned #uncertainties for the calibration of calorimeter signals in the #ATLAS experiment:

inspirehep.net/literature/2...
December 28, 2024 at 12:05 PM
Summary: the #BNN not only yields a continuous and smooth topo-cluster #calibration function that improves the performance relative to the standard LCW calibration — but also provides meaningful single-cluster #uncertainties on the predicted responses and the calibrated energies
December 9, 2024 at 10:55 AM
Both pulls follow an approximate Gaussian shape in the center (as expected for stochastic or noisy data) — and both networks slightly overestimate the uncertainty, meaning that the per-cluster error is conservative
December 9, 2024 at 10:55 AM
After checking that the BNN and RE uncertainties are highly comparable, we can further evaluate them with respect to the spread of the predicted response around the target: the "pull" allows us to test whether the learned single-cluster uncertainty covers the experimental spread
December 9, 2024 at 10:54 AM
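Schematically, the pull check might look like this (toy numbers and illustrative names, not the paper's code):

```python
import numpy as np

# Toy stand-in for per-cluster predictions, targets, and learned uncertainties
rng = np.random.default_rng(0)
target = rng.normal(1.0, 0.10, 10_000)         # true single-cluster responses
pred = target + rng.normal(0.0, 0.08, 10_000)  # network predictions
sigma = np.full(10_000, 0.09)                  # learned per-cluster uncertainties

# The pull compares the actual spread of the prediction around the target
# with the uncertainty the network claims for each cluster.
pull = (pred - target) / sigma
print(f"pull width = {pull.std():.2f}")  # ~1 is perfect coverage; <1 is conservative
```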
We can also directly compare the individual uncertainty predictions from the Bayesian neural network (BNN) and the repulsive ensemble (RE) cluster-by-cluster — the two uncertainty predictions track each other well
December 9, 2024 at 10:54 AM
The #systematic uncertainty (part of the likelihood) approaches the same plateau as for the BNN when increasing the training-dataset size (green and brown curves) — and the #statistical uncertainty (induced by the repulsive force) again vanishes (red and blue curves)
December 9, 2024 at 10:54 AM
The idea is to determine uncertainties by simultaneously training an ensemble of networks that are not allowed to all predict the same best-fit parameters: a repulsive force drives the ensemble members apart, so that they spread out and explore the loss landscape around the actual minimum
December 9, 2024 at 10:53 AM
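A minimal toy sketch of such a repulsive update, with each "network" reduced to a two-dimensional parameter vector (the kernel choice, bandwidth, and step size are assumptions for illustration):

```python
import torch

def kernel_sum(w_i, others, h=1.0):
    # Sum of Gaussian (RBF) kernels between member w_i and the other members
    return sum(torch.exp(-((w_i - w_j) ** 2).sum() / (2 * h**2)) for w_j in others)

# Toy setting: five "networks", all descending the same quadratic loss
minimum = torch.tensor([1.0, -2.0])
members = [torch.randn(2, requires_grad=True) for _ in range(5)]

for step in range(500):
    for i, w_i in enumerate(members):
        others = [w.detach() for j, w in enumerate(members) if j != i]
        loss = ((w_i - minimum) ** 2).sum()
        # Descending log(sum of kernels) adds a normalized repulsive force:
        # members are pushed apart instead of collapsing onto one best fit.
        grad = torch.autograd.grad(loss + torch.log(kernel_sum(w_i, others)), w_i)[0]
        with torch.no_grad():
            w_i -= 1e-2 * grad

# The ensemble now scatters around the minimum; its spread gives the uncertainty
spread = torch.stack([w.detach() for w in members]).std(dim=0)
```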
Learned #uncertainties on neural-network outputs are not yet standard practice in HEP... To increase confidence in the uncertainty predictions from our BNN setup, we compare our BNN results with an alternative way of learning uncertainties: so-called #repulsive #ensembles (REs)
December 9, 2024 at 10:53 AM
We see that these topo-clusters are all located in the tile-gap scintillator region: the tile-gap scintillator is not a regular calorimeter, and the feature quality in this region is insufficient, so it is expected that the calibration in this region yields a large uncertainty
December 9, 2024 at 10:53 AM
An interesting question is what role the learned uncertainties can play in understanding the data... When looking at the uncertainty spectrum, we see a distinctive secondary maximum — what feature leads the BNN uncertainties to flag these topo-clusters?
December 9, 2024 at 10:53 AM
...(ii) the "systematic uncertainty" (green curve) captures the intrinsic data stochasticity (pile-up) and accounts for limited network expressivity and bad hyper-parameters. For learning the stochastic nature, more data helps, but this uncertainty does not go to zero; it approaches a finite plateau
December 9, 2024 at 10:52 AM
The total BNN uncertainty actually consists of two terms: (i) the "statistical uncertainty" (red curve) accounts for a lack of knowledge due to a limited amount of training data and vanishes in the limit of infinite training data, and...
December 9, 2024 at 10:51 AM
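One way the two terms can be extracted from sampled BNN outputs, assuming each weight sample yields a central value and a per-cluster width (the paper's exact decomposition may differ in detail):

```python
import torch

# Toy sampled outputs: 50 weight samples x 1000 clusters
preds_mu = 1.0 + 0.03 * torch.randn(50, 1000)     # sampled central values
preds_sigma = 0.05 + 0.02 * torch.rand(50, 1000)  # sampled per-cluster widths

sigma_stat = preds_mu.std(dim=0)                  # spread of the sampled means:
                                                  # shrinks with more training data
sigma_syst = (preds_sigma**2).mean(dim=0).sqrt()  # average predicted width:
                                                  # plateaus at the data stochasticity
sigma_total = (sigma_stat**2 + sigma_syst**2).sqrt()
```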
The improvement of the relative local energy resolution, evaluated as a function of the in-time (left) and out-of-time (right) pile-up activity, shows a significant level of cluster-by-cluster pile-up mitigation when applying the ML-derived calibration
December 9, 2024 at 10:50 AM
Another performance measure is the relative energy resolution: again, the BNN is better over the whole energy range, with spectacular gains at low energies, where it best learns the transition of the signal source from inelastic hadronic interactions to ionisation-dominated signals
December 9, 2024 at 10:49 AM
To evaluate the performance, we compare the BNN predictions to the target values in terms of the signal linearity (should peak at zero): the BNN calibration performs significantly better than any of the other considered scales, with a significant precision gain at low energies
December 9, 2024 at 10:49 AM
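Schematically, both figures of merit from the last two posts could be computed like this (a sketch; the analysis may use different estimators):

```python
import numpy as np

def linearity_and_resolution(e_pred, e_true):
    # Relative energy difference per cluster
    rel = (e_pred - e_true) / e_true
    linearity = np.median(rel)              # signal linearity: should sit at zero
    q16, q84 = np.percentile(rel, [16, 84])
    resolution = 0.5 * (q84 - q16)          # robust width as relative resolution
    return linearity, resolution
```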
The corresponding loss function then follows a clear statistical logic: the first term can be seen as a weight regularization that avoids over-training, and the second term maximizes the likelihood of the data
December 9, 2024 at 10:49 AM
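A sketch of such a two-term loss for a Gaussian variational posterior with a unit-Gaussian prior and a Gaussian likelihood (names and conventions here are illustrative, not the paper's implementation):

```python
import torch

def neg_elbo(pred_mu, pred_sigma, target, w_mu, w_sigma, n_batches):
    # Term 1: closed-form KL between a Gaussian variational posterior and a
    # unit-Gaussian prior; acts as weight regularization against over-training.
    # It is shared across minibatches, hence the 1/n_batches scaling.
    kl = (-torch.log(w_sigma) + 0.5 * (w_sigma**2 + w_mu**2) - 0.5).sum()
    # Term 2: negative Gaussian log-likelihood of the targets, i.e. maximizing
    # the likelihood under the predicted response and its width.
    nll = (0.5 * ((target - pred_mu) / pred_sigma) ** 2 + torch.log(pred_sigma)).sum()
    return kl / n_batches + nll
```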
The network training can be described as constructing a #variational approximation, where we approximate the intractable posterior with a simplified and tractable distribution; to learn the variational posterior, we minimize the Kullback-Leibler (KL) divergence between the two
December 9, 2024 at 10:48 AM
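In standard notation, with q_θ(ω) the variational posterior over the network weights ω and D the training data, the derivation behind the two-term loss above reads:

```latex
% Minimizing the KL divergence between the variational posterior q_\theta(\omega)
% and the intractable true posterior p(\omega \mid D):
\mathrm{KL}\left[ q_\theta(\omega) \,\middle\|\, p(\omega \mid D) \right]
  = \mathrm{KL}\left[ q_\theta(\omega) \,\middle\|\, p(\omega) \right]
  - \mathbb{E}_{q_\theta}\!\left[ \log p(D \mid \omega) \right]
  + \log p(D)
% The evidence \log p(D) does not depend on \theta, so minimizing the first two
% terms (the negative ELBO, i.e. the two-term loss above) is equivalent.
```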
BNN weights are not trained as fixed values; instead, the network parameters are described by weight distributions. During inference, the learned weight distributions are sampled multiple times to generate an ensemble of networks, from which we construct the central value and the uncertainty
December 9, 2024 at 10:48 AM
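A minimal sketch of this sampling step, assuming mean-field Gaussian weight distributions (layer sizes and names are made up for illustration):

```python
import torch
import torch.nn as nn

class BayesLinear(nn.Module):
    """Mean-field Gaussian layer: every weight has a learned mean and width."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.w_mu = nn.Parameter(0.1 * torch.randn(n_out, n_in))
        self.w_rho = nn.Parameter(torch.full((n_out, n_in), -3.0))
        self.bias = nn.Parameter(torch.zeros(n_out))

    def forward(self, x):
        sigma = nn.functional.softplus(self.w_rho)       # weight widths (> 0)
        w = self.w_mu + sigma * torch.randn_like(sigma)  # sample one network
        return x @ w.t() + self.bias

model = nn.Sequential(BayesLinear(6, 64), nn.ReLU(), BayesLinear(64, 1))

x = torch.randn(128, 6)                             # toy cluster features
preds = torch.stack([model(x) for _ in range(50)])  # ensemble of 50 sampled networks
central, uncertainty = preds.mean(0), preds.std(0)  # per-cluster value and error
```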
Since control and #uncertainties are key in #HEP, our BNN is trained to also learn an uncertainty associated with the predicted #calibration function — this uncertainty allows for a better understanding of possible signal-quality issues in the data or training-related limitations
December 9, 2024 at 10:48 AM
#ML can be used to learn multi-dimensional continuous #calibration functions — we do this by training a #regression network (an uncertainty-aware BNN) to learn the "response" of single topo-clusters as a function over feature space using a properly defined minimization task
December 9, 2024 at 10:47 AM
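Schematically, with the feature set, network size, and response convention assumed purely for illustration (here: dividing the EM-scale energy by the predicted response calibrates the cluster):

```python
import torch
import torch.nn as nn

# Hypothetical regression network mapping cluster features to a response
net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 1))

e_em = 100.0 * torch.rand(1000)             # toy EM-scale cluster energies
features = torch.randn(1000, 6)             # toy cluster features
response = net(features).squeeze(-1).exp()  # predicted response, kept positive
e_calibrated = e_em / response              # calibrated single-cluster energy
```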
The principal signals of the #ATLAS calorimeters are so-called "topo-clusters". These signals are calibrated to correctly measure the energy deposited by electromagnetic (EM) showers, but at this scale they provide no compensation for energy losses in the complex development of hadronic showers...
December 9, 2024 at 10:47 AM
So when used correctly, #ML is a perfect tool to quantify and control different kinds of #uncertainties in #LHC physics — and the future of the LHC really is triggered, inspired and shaped by data science as a new common language of particle experiment and theory
December 9, 2024 at 10:46 AM