Martin Genzel
@martingenzel.bsky.social
Staff Machine Learning Researcher @MerantixMomentum | Applied Mathematician | Interested in Deep Learning, LLMs, Tabular & Time-Series Data | 🌐 martingenzel.com | GH martin-genzel | Berlin-based
The pruning order allows us to estimate the global importance of all target singular values. This gives rise to a score map that is used to implement an independent compression stage, where a user can flexibly create a model of any size without re-computation or re-calibration.
June 26, 2025 at 3:24 PM
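A minimal sketch of how such a score map can drive the independent compression stage. The function name, signature, and the use of plain SVD factors are illustrative assumptions, not the paper's API; the `scores` array is assumed to come from a prior calibration run:

```python
import numpy as np

def compress_layer(U, s, Vt, scores, keep_ratio):
    """Build a low-rank replacement for a dense layer W = U @ diag(s) @ Vt,
    keeping only the top-scoring singular values. `keep_ratio` acts as the
    compression slider: any value works instantly, with no re-calibration.
    """
    k = max(1, int(round(keep_ratio * len(s))))
    keep = np.argsort(-scores)[:k]       # highest global importance first
    A = U[:, keep] * s[keep]             # (m, k) factor, columns scaled by s
    B = Vt[keep, :]                      # (k, n) factor: W ≈ A @ B
    return A, B
```

Because the score map is fixed after calibration, sweeping `keep_ratio` just re-slices the same factors: no gradient step is ever repeated.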
The key idea of ACIP is to decouple an optimization-based pruning stage (calibration) from the actual compression stage. To ensure parameter-efficient pruning, we use low-rank factorizations and L1-regularization to iteratively eliminate singular values of large linear layers.
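On a single weight matrix, the pruning mechanism can be sketched with a soft-thresholding loop (the proximal step of L1-regularization). Real ACIP calibration runs gradient descent on data across a whole model, so everything below, including the hyperparameters, is an illustrative toy rather than the paper's implementation:

```python
import numpy as np

def calibration_scores(W, num_steps=100, shrink=0.02):
    """Toy ACIP-style calibration on one matrix W = U @ diag(s) @ Vt.
    Repeated soft-thresholding drives singular values to zero one after
    another; the step at which each value hits zero is its importance
    score (pruned later = more important, survivors score highest).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    scores = np.full(len(s), num_steps, dtype=float)  # survivors keep max score
    for step in range(num_steps):
        s = np.maximum(s - shrink, 0.0)               # L1 proximal update
        scores[(s == 0.0) & (scores == num_steps)] = step
    return scores
```

The resulting pruning order is exactly the score map used later: a model of any size is obtained by keeping the top-scoring components.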
To achieve this, we introduce Any Compression via Iterative Pruning (ACIP). This novel algorithm allows you to determine the entire compression-performance trade-off from a single gradient-descent run, enabling any target size for the model without re-computation.
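The "single run, any target size" idea can be illustrated in a self-contained toy, with a plain SVD standing in for the calibrated factorization: one upfront decomposition, then the whole compression-performance curve is read off without recomputation. Matrix sizes and the error metric are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))                 # stand-in for a linear layer
U, s, Vt = np.linalg.svd(W, full_matrices=False)  # single upfront "calibration"

errors = []
for keep_ratio in (0.25, 0.5, 0.75, 1.0):         # any target size, no re-run
    k = int(keep_ratio * len(s))
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]          # rank-k approximation
    rel_err = np.linalg.norm(W - W_k) / np.linalg.norm(W)
    errors.append(rel_err)
    print(f"keep {keep_ratio:.0%} of components -> relative error {rel_err:.3f}")
```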
Turning the workflow around, we advocate for Any Compression: Perform a single, upfront computational step that then empowers users to generate a model at any desired size in real-time, without extra cost. In other words, you get a slider like in image compression 🎚️
The conventional workflow with existing methods is inefficient: you typically pick one of a few preset target sizes, run a costly computation (calibration), and then have to repeat the entire process for every new compression rate you want to test.
Post-training compression is an effective way to make LLMs more accessible, but it creates a fundamental trade-off between size and performance. Unfortunately, the process can feel like a black box for users, requiring expertise and trial and error to find an acceptable setup.
📢 Excited to share our latest research at Merantix Momentum on Any Compression of Foundation Models.

We all know how intuitive and seamless image compression is: use a slider to specify your target size and get an instant preview.
Our quest: Can compressing an LLM be just as easy?
🧵👇