Martin Genzel
@martingenzel.bsky.social
Staff Machine Learning Researcher @MerantixMomentum | Applied Mathematician | Interested in Deep Learning, LLMs, Tabular & Time-Series Data | 🌐 martingenzel.com | GH martin-genzel | Berlin-based
The pruning order allows us to estimate the global importance of all target singular values. This gives rise to a score map that is used to implement an independent compression stage, where a user can flexibly create a model of any size without re-computation or re-calibration.
June 26, 2025 at 3:24 PM
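A minimal sketch of how such a score map can drive the independent compression stage. The function name, signature, and the use of plain SVD factors are illustrative assumptions, not the paper's API; the `scores` array is assumed to come from a prior calibration run:

```python
import numpy as np

def compress_layer(U, s, Vt, scores, keep_ratio):
    """Build a low-rank replacement for a dense layer W = U @ diag(s) @ Vt,
    keeping only the top-scoring singular values. `keep_ratio` acts as the
    compression slider: any value works instantly, with no re-calibration.
    """
    k = max(1, int(round(keep_ratio * len(s))))
    keep = np.argsort(-scores)[:k]       # highest global importance first
    A = U[:, keep] * s[keep]             # (m, k) factor, columns scaled by s
    B = Vt[keep, :]                      # (k, n) factor: W ≈ A @ B
    return A, B
```

Because the score map is fixed after calibration, sweeping `keep_ratio` just re-slices the same factors: no gradient step is ever repeated.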
The key idea of ACIP is to decouple an optimization-based pruning stage (calibration) from the actual compression stage. To ensure parameter-efficient pruning, we use low-rank factorizations and L1-regularization to iteratively eliminate singular values of large linear layers.
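On a single weight matrix, the pruning mechanism can be sketched with a soft-thresholding loop (the proximal step of L1-regularization). Real ACIP calibration runs gradient descent on data across a whole model, so everything below, including the hyperparameters, is an illustrative toy rather than the paper's implementation:

```python
import numpy as np

def calibration_scores(W, num_steps=100, shrink=0.02):
    """Toy ACIP-style calibration on one matrix W = U @ diag(s) @ Vt.
    Repeated soft-thresholding drives singular values to zero one after
    another; the step at which each value hits zero is its importance
    score (pruned later = more important, survivors score highest).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    scores = np.full(len(s), num_steps, dtype=float)  # survivors keep max score
    for step in range(num_steps):
        s = np.maximum(s - shrink, 0.0)               # L1 proximal update
        scores[(s == 0.0) & (scores == num_steps)] = step
    return scores
```

The resulting pruning order is exactly the score map used later: a model of any size is obtained by keeping the top-scoring components.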
To achieve this, we introduce Any Compression via Iterative Pruning (ACIP). This novel algorithm allows you to determine the entire compression-performance trade-off from a single gradient-descent run, enabling any target size for the model without re-computation.
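The "single run, any target size" idea can be illustrated in a self-contained toy, with a plain SVD standing in for the calibrated factorization: one upfront decomposition, then the whole compression-performance curve is read off without recomputation. Matrix sizes and the error metric are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))                 # stand-in for a linear layer
U, s, Vt = np.linalg.svd(W, full_matrices=False)  # single upfront "calibration"

errors = []
for keep_ratio in (0.25, 0.5, 0.75, 1.0):         # any target size, no re-run
    k = int(keep_ratio * len(s))
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]          # rank-k approximation
    rel_err = np.linalg.norm(W - W_k) / np.linalg.norm(W)
    errors.append(rel_err)
    print(f"keep {keep_ratio:.0%} of components -> relative error {rel_err:.3f}")
```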
Turning the workflow around, we advocate for Any Compression: Perform a single, upfront computational step that then empowers users to generate a model at any desired size in real-time, without extra cost. In other words, you get a slider like in image compression 🎚️
The conventional workflow with existing methods is inefficient: you typically pick one of a few preset target sizes, run a costly computation (calibration), and then have to repeat the entire process for every new compression rate you want to test.
Post-training compression is an effective way to make LLMs more accessible, but it creates a fundamental trade-off between size and performance. Unfortunately, the process can feel like a black box for users, requiring expertise and trial and error to find an acceptable setup.
📢 Excited to share our latest research at Merantix Momentum on Any Compression of Foundation Models.

We all know how intuitive and seamless image compression is: use a slider to specify your target size and get an instant preview.
Our quest: Can compressing an LLM be just as easy?
🧵👇