Matan Abudy
@matanabudy.bsky.social
Computational Linguistics @ TAU
What we show:
1️⃣ L1, L2, and no regularization all destroy perfectly generalizing solutions across 6 formal-language tasks
2️⃣ MDL keeps or improves on these same perfect solutions

(6/7)
May 24, 2025 at 4:07 PM
We propose Minimum Description Length (MDL) as a principled alternative. It balances accuracy with the simplicity of the network - its information content rather than just its weight magnitudes.
This encourages true generalization and can also yield more interpretable models (a rough sketch of the objective follows below).

(5/7)
May 24, 2025 at 4:07 PM
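To make the objective concrete, here is a minimal sketch of a two-part MDL loss in PyTorch: the bits needed to encode the network itself plus the bits needed to encode the data with the network's help. The fixed per-weight cost in model_code_length_bits is a placeholder assumption for illustration only; the paper's actual encoding of the network is different and more careful.

```python
import math

import torch
import torch.nn.functional as F


def data_code_length_bits(logits, targets):
    # |D:H|: bits needed to encode the targets given the model's predictions,
    # i.e. -log2 p(y | x) -- cross-entropy measured in bits rather than nats.
    nll_nats = F.cross_entropy(logits, targets, reduction="sum")
    return nll_nats / math.log(2)


def model_code_length_bits(model, bits_per_weight=16):
    # |H|: a crude stand-in that charges a fixed number of bits for every
    # weight the network keeps (non-zero). Not differentiable and not the
    # paper's encoding -- only meant to show the shape of the objective.
    total = torch.tensor(0.0)
    for p in model.parameters():
        total = total + (p.abs() > 1e-8).float().sum() * bits_per_weight
    return total


def mdl_objective(model, logits, targets):
    # Two-part code: |H| + |D:H| -- the hypothesis first, then the data
    # encoded with the hypothesis's help.
    return model_code_length_bits(model) + data_code_length_bits(logits, targets)
```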
Traditional regularizers penalize large weights to make models “simpler.”
But smaller ≠ simpler.
Networks can still “smuggle” complexity - even in tiny numbers.

(4/7)
May 24, 2025 at 4:07 PM
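For contrast, this is the familiar shape of the magnitude-based regularization the post above refers to (a generic sketch, not code from the paper): the penalty only looks at how large the weights are, not at how much information they encode.

```python
import torch
import torch.nn as nn


def l1_penalty(model: nn.Module) -> torch.Tensor:
    # Sparsity-inducing penalty: sum of absolute weight values.
    return sum(p.abs().sum() for p in model.parameters())


def l2_penalty(model: nn.Module) -> torch.Tensor:
    # Weight decay: sum of squared weight values.
    return sum((p ** 2).sum() for p in model.parameters())


# Typical training objective with a magnitude-based regularizer:
#   loss = task_loss + lam * l2_penalty(model)
# Both penalties push toward small weights, which is not the same as
# pushing toward a short description of the network.
```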
Imagine memorizing an entire book using just one weight: encode the book as a binary string and place it after the decimal point (assuming very high, or even infinite, precision).

Tiny weight, full memorization. (A toy sketch of the trick follows below.)

(3/7)
May 24, 2025 at 4:07 PM
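Here is a toy illustration of that trick, assuming exact arbitrary-precision arithmetic via Python's Fraction: a single number smaller than 1 can carry an arbitrarily long bit string, so a small magnitude says nothing about information content.

```python
from fractions import Fraction


def encode_bits(bits: str) -> Fraction:
    # Pack a bit string into one "weight" in [0, 1): 0.b1b2b3... in binary.
    w = Fraction(0)
    for i, b in enumerate(bits, start=1):
        if b == "1":
            w += Fraction(1, 2 ** i)
    return w


def decode_bits(w: Fraction, n: int) -> str:
    # Read the first n binary digits back out of the "weight".
    out = []
    for _ in range(n):
        w *= 2
        bit = int(w >= 1)
        out.append(str(bit))
        w -= bit
    return "".join(out)


book = "0110100001101001"   # "hi" in ASCII; imagine a whole book here instead
w = encode_bits(book)       # a single tiny-looking number, 0 <= w < 1
assert decode_bits(w, len(book)) == book
```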
Neural networks can, in theory, solve structured symbolic tasks perfectly.
Yet, standard regularizers (like L1 and L2) actively push models away from these perfect solutions.

Full paper 👉 arxiv.org/abs/2505.13398

(2/7)
A Minimum Description Length Approach to Regularization in Neural Networks
State-of-the-art neural networks can be trained to become remarkable solutions to many problems. But while these architectures can express symbolic, perfect solutions, trained models often arrive at a...
arxiv.org
May 24, 2025 at 4:01 PM