- The large model runs on a single H100 GPU
- The small model fits within just 16GB of memory
You can run these models with open-source frameworks such as Hugging Face Transformers and Ollama.
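As an illustration, here is a minimal sketch of loading one of the models with Transformers. The model ID below is a placeholder, not the actual checkpoint name; substitute the released model's ID from its model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model ID -- replace with the released checkpoint's Hub name.
model_id = "org/model-name"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # place weights on the available GPU(s)
)

# Generate a short completion as a smoke test.
inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With Ollama, the equivalent workflow is typically a single `ollama run <model-name>` command once the model is available in the Ollama library.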