arXiv: arxiv.org/pdf/2509.01624
GitHub: github.com/enyac-group/...
🔧 Supports W4A8 / W4A16 / W4AX / W8A8 for Mamba1 and Mamba2
🚀 Achieves 4× memory reduction and 3× generation speedup
⚡️ Enables 8B model inference on Orin Nano 8G at 13 tokens/sec
🔥 Outperforms W4A8KV4 Llama3-8B in both speed and quality
It takes about 3 minutes.
Participate here: t.co/WXuPdg9HKv