Mobius Labs
@mobius-labs.bsky.social
We make models fast and small, and tame inference. We love multimodality.

Proponents of Open Source and Open Intelligence.

https://blog.mobiuslabs.com/ for some of our recent work.

X: https://x.com/Mobius_Labs
Our re-distilled DeepSeek R1 (1.5B) outperforms the original distilled model! Get it at huggingface.co/mobiuslabsgm... We're distilling more models and look forward to releasing them soon!
January 24, 2025 at 5:32 PM
Releasing a new version of Gemlite github.com/mobiusml/gem... with significantly improved performance on datacenter GPUs (A100/H100), delivering up to 7–8x faster prefill and 3–6x faster batched decoding compared to PyTorch's tinygemm.
GitHub - mobiusml/gemlite: Fast low-bit matmul kernels in Triton
Fast low-bit matmul kernels in Triton. Contribute to mobiusml/gemlite development by creating an account on GitHub.
github.com
December 5, 2024 at 2:42 PM
Really happy to have contributed to the batched version of faster-whisper, which is 4x faster and more accurate 🚀🚀🚀

github.com/SYSTRAN/fast...
Release faster-whisper 1.1.0 · SYSTRAN/faster-whisper
New Features New batched inference that is 4x faster and accurate, Refer to README on usage instructions. Support for the new large-v3-turbo model. VAD filter is now 3x faster on CPU. Feature Extr...
github.com
November 25, 2024 at 11:32 AM