Anthony Lewis
@anthonyllewis.bsky.social
Staff Software Engineer at Braintree
I honestly haven't tried again lately. When I first tried llm-mlx on Python 3.13, it failed. I found the note in the README and switched to Python 3.12. I'll give it a shot and update the post once it's working.
June 3, 2025 at 3:02 PM
I haven't looked into using MLX with Ollama. For a simple codegen prompt on my M3 Max, performance without MLX is similar (around 104 tps); with llm-mlx I get around 149 tps. I'm sure Ollama with MLX would be similar.
June 3, 2025 at 2:36 PM
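
For anyone wondering how numbers like those can be measured, here is a minimal sketch using mlx-lm, the library that llm-mlx builds on. This is just an illustration, not how the figures above were produced; the model name is an example from the mlx-community collection, and the token count is approximated by re-tokenizing the output.

```python
# Rough tokens-per-second measurement with mlx-lm (assumption: any
# mlx-community model works here; swap in the one you want to benchmark).
import time

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Llama-3.2-3B-Instruct-4bit")

prompt = "Write a Python function that reverses a linked list."

start = time.perf_counter()
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
elapsed = time.perf_counter() - start

# Approximate generated-token count by encoding the output text again.
n_tokens = len(tokenizer.encode(text))
print(f"{n_tokens} tokens in {elapsed:.1f}s ≈ {n_tokens / elapsed:.0f} tps")
```

Passing `verbose=True` to `generate` also prints mlx-lm's own prompt and generation speed stats, which is usually the quicker way to get a ballpark figure.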