Taneem
@taneem-ibrahim.bsky.social
Tinkering with vLLM @RedHat
I had an amazing experience attending @fastcompany.com's Most Innovative Companies Summit. Proud to represent Red Hat as one of the most innovative companies alongside my colleague @terrytangyuan.xyz
June 6, 2025 at 5:17 AM
Reposted by Taneem
Check out the new episode of Technically Speaking w/ Chris Wright - Scaling AI inference with open source ft. Brian Stevens red.ht/4dJiBLc
Technically Speaking | Scaling AI inference with open source
Explore the critical role of production-quality AI inference, the power of open source projects like vLLM, and the future of the enterprise AI stack.
red.ht
June 6, 2025 at 1:10 AM
The FP8-quantized version of Llama 4 Maverick can be downloaded from Hugging Face: huggingface.co/collections/...
Llama 4 - a meta-llama Collection
Llama 4 release
huggingface.co
April 5, 2025 at 8:22 PM
The official release by Meta includes an FP8-quantized version of Llama 4 Maverick 128E, supported by Red Hat's LLM Compressor library. Quantization lets the 128-expert model fit on a single NVIDIA 8xH100 node, delivering higher performance at lower cost.
April 5, 2025 at 8:20 PM
Thanks to the Meta AI team for close collaboration with the vLLM community, enabling developers to experiment with Llama 4 immediately. Our blog shares more details on the Llama 4 release and how to get started with inference in vLLM today: developers.redhat.com/articles/202...
Llama 4 herd is here with Day 0 inference support in vLLM | Red Hat Developer
Discover the new Llama 4 Scout and Llama 4 Maverick models from Meta, with mixture of experts architecture, early fusion multimodality, and Day 0 model support.
developers.redhat.com
April 5, 2025 at 8:19 PM
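As a minimal sketch of the kind of setup the posts above describe — the exact model ID and flags are assumptions based on the Hugging Face collection naming, not confirmed by these posts — serving the FP8 Maverick checkpoint on an 8xH100 node with vLLM's OpenAI-compatible server might look like:

```shell
# Hypothetical deployment sketch: serve the FP8-quantized Llama 4 Maverick
# checkpoint with vLLM, sharding across the 8 GPUs of a single H100 node.
# Model ID is assumed from the Hugging Face collection; verify before use.
vllm serve meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 \
  --tensor-parallel-size 8
```

Once the server is up, any OpenAI-compatible client can send chat completion requests to it on the default port 8000.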
This is really nice! Thank you @stu.bsky.social
Creating a Starter Pack of Red Hat employees on BlueSky - please let me know who I've missed and share go.bsky.app/Du6L1Ec
November 22, 2024 at 5:47 AM