André Martins (@andre-t-martins.bsky.social) · NLP/ML researcher in Lisbon
Our experiments show competitive or superior results in coverage, efficiency, and adaptiveness compared to standard softmax-based non-conformity scores such as InvProb and RAPS. 7/N
March 9, 2025 at 9:31 PM
As a bonus, for softmax (which is not sparse) we also obtain a new “log-margin” non-conformity score which is the log-odds ratio between the most probable class and the true one. 6/N
March 9, 2025 at 9:31 PM
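A minimal sketch of this score, assuming it is computed under the softmax distribution, where log-probability differences reduce to logit differences (the function name is mine):

```python
import numpy as np

def log_margin_score(logits, y):
    # Log-odds ratio between the most probable class and the true class y:
    # under softmax, log p(y_hat|x) - log p(y|x) = z_max - z_y.
    return float(np.max(logits) - logits[y])
```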
For γ-entmax (which recovers sparsemax with γ=2 and softmax with γ=1), the non-conformity scores use the L_δ-norm (with δ = 1 / (γ - 1)) instead of the L_1 norm. 5/N
March 9, 2025 at 9:31 PM
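Reading this together with posts 6/N and 4/N suggests one consistent shape for the score (my paraphrase, not the paper's exact formula): aggregate the logit gaps above the target label with an L_δ norm, so γ = 2 (δ = 1) gives the sparsemax sum of gaps and γ → 1 (δ → ∞) gives the log-margin max gap:

```python
import numpy as np

def entmax_score(logits, y, gamma):
    # Hedged sketch, paraphrasing the thread: aggregate the logit gaps above
    # the target label with an L_delta norm, delta = 1 / (gamma - 1).
    gaps = logits[logits > logits[y]] - logits[y]
    if gamma == 1.0:
        # delta -> infinity: L_inf norm is the largest gap (the log-margin score, 6/N)
        return float(gaps.max()) if gaps.size else 0.0
    delta = 1.0 / (gamma - 1.0)
    # gamma = 2 gives delta = 1: a plain sum of gaps (the sparsemax score, 4/N)
    return float(np.sum(gaps ** delta) ** (1.0 / delta))
```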
The answer is yes! For sparsemax, this corresponds to a new non-conformity score which accumulates the absolute differences of logits up to the target label. 4/N
March 9, 2025 at 9:31 PM
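A hedged sketch of one natural reading of "accumulates the absolute differences of logits up to the target label": sum the gaps between the target's logit and every logit ranked above it (my reading, not the paper's verbatim definition):

```python
import numpy as np

def sparsemax_score(logits, y):
    # Sum of |z_i - z_y| over labels scoring above the target label y;
    # a low score means y sits near the top of the ranking.
    z_y = logits[y]
    return float(np.sum(logits[logits > z_y] - z_y))
```

For example, with logits [2.0, 1.0, 0.5] and target label 2, the score is (2.0 - 0.5) + (1.0 - 0.5) = 2.0.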
These sparse transformations have a temperature parameter that controls the amount of sparsity. Can we use split conformal prediction to calibrate this temperature and return sparse sets with coverage guarantees? 3/N
March 9, 2025 at 9:31 PM
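For context, here is the standard split conformal recipe that any such score plugs into (a generic sketch, not code from the paper; score_fn is any of the non-conformity scores above):

```python
import numpy as np

def calibrate_threshold(cal_logits, cal_labels, score_fn, alpha=0.1):
    # Split conformal calibration: the ceil((n+1)(1-alpha))/n empirical
    # quantile of calibration scores yields >= 1 - alpha marginal coverage.
    scores = np.array([score_fn(z, y) for z, y in zip(cal_logits, cal_labels)])
    n = len(scores)
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, q, method="higher")

def prediction_set(logits, score_fn, threshold):
    # The conformal set: every label whose score is within the threshold.
    return [y for y in range(len(logits)) if score_fn(logits, y) <= threshold]
```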
Conformal prediction quantifies uncertainty by predicting *sets* instead of points, offering coverage guarantees. Sparse transformations (sparsemax, entmax, etc.) are softmax alternatives that return sparse probability vectors, useful for selecting a subset of relevant labels. 2/N
March 9, 2025 at 9:31 PM
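As background, sparsemax (Martins & Astudillo, 2016) is the Euclidean projection of the logits onto the probability simplex; a standard NumPy implementation:

```python
import numpy as np

def sparsemax(z):
    # Euclidean projection onto the probability simplex; unlike softmax,
    # it assigns exactly zero probability to low-scoring labels.
    z_sorted = np.sort(z)[::-1]            # sort logits in descending order
    cssv = np.cumsum(z_sorted) - 1.0
    k = np.arange(1, len(z) + 1)
    support = z_sorted - cssv / k > 0      # labels kept in the support
    k_max = k[support][-1]
    tau = cssv[support][-1] / k_max        # simplex projection threshold
    return np.maximum(z - tau, 0.0)
```

For example, sparsemax(np.array([1.2, 0.8, -1.0])) returns [0.7, 0.3, 0.0], assigning exactly zero to the third label where softmax would keep it positive.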
Our upcoming #AISTATS2025 paper is out: “Sparse Activations as Conformal Predictors”, with Margarida Campos, João Calém, Sophia Sklaviadis, and @marfig.bsky.social: arxiv.org/abs/2502.14773.

This paper brings together two lines of research: dynamic sparsity and conformal prediction. 🧵
March 9, 2025 at 9:31 PM