TLDR
🏆SoTA open-source multimodal
🧠 Capable of step-by-step reasoning
🔥 Beats GPT-4o and Sonnet 3.5 on MathVista and MathVision
You can now run inference and fine-tune (QLoRA) locally on your Mac.
> pip install mlx-vlm
Model cards 👇🏽
huggingface.co/collections/...
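For anyone who wants to try inference right after the install, here is a minimal sketch of how the command-line call could be assembled. The module path `mlx_vlm.generate` and the flags `--model`, `--image`, `--prompt` and `--max-tokens` are assumptions based on the project's README and may differ between versions. `MODEL_ID` is a hypothetical placeholder, since the post does not name a model repo. The script only prints the command and does not run it.

```python
import importlib.util
import shlex
import sys

# Hypothetical placeholder: the post does not name a model repo,
# so substitute the repo id from the model cards you want to use.
MODEL_ID = "<hf-repo-id-of-an-mlx-vlm-model>"


def build_generate_command(model, image, prompt, max_tokens=100):
    """Build the argv list for mlx-vlm's command-line generator.

    The module path and flag names are assumptions; confirm them with
    `python -m mlx_vlm.generate --help` on your installed version.
    """
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", model,
        "--image", image,
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]


def mlx_vlm_installed():
    """Return True if the mlx_vlm package can be found in this environment."""
    return importlib.util.find_spec("mlx_vlm") is not None


if __name__ == "__main__":
    cmd = build_generate_command(MODEL_ID, "photo.jpg", "Describe this image.")
    status = "installed" if mlx_vlm_installed() else "not installed"
    # Print only; paste the command into a terminal once MODEL_ID is real.
    print(f"mlx-vlm is {status}. Command to run:")
    print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string means it can be passed to `subprocess.run` later without worrying about quoting the prompt.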
New models:
- @GoogleDeepMind PaliGemma 2
Up next 🚧:
- Refactoring
Get started:
> pip install -U mlx-vlm
Please leave us a star and send a PR :)
github.com/Blaizzy/mlx-...
You can now run inference and fine-tune locally on your Mac.
pip install -U mlx-vlm
I’m getting ~140 tok/s on M3 Max 96GB 🔥
Thanks to @pcuenq.hf.co for the PR!
Model Cards 👇🏽
New models🤖:
- Allen AI Molmo
- Microsoft Florence 2
Changes 🚀:
- Fixed Pixtral image prompt h/t Nils
- 30-60% faster Qwen2-VL inference h/t Awni
- Fixed Qwen2-VL OCR
- Skip quantization for the vision encoder or for specific layers
- New notebooks
Please leave us a star and send a PR❤️