1. Which quantization of the model is it in either case?
2. When you load the model in LM Studio, what GPU offload % do you set?
It might be easier to go back and forth in a GitHub issue: github.com/lmstudio-ai/...
Thanks!