Xuan Son Nguyen
@ngxson.hf.co
Software Engineer @ Hugging Face 🤗
Very nice touch, Gmail 😅
October 5, 2025 at 9:11 PM
Part 2 of my journey building a smart home! 🚀

In this part:
> ESPHome & custom component
> RF433 receiver & transmitter
> Hassio custom addon
August 29, 2025 at 1:33 PM
Just published a new article on my blog 🏃‍♂️

Building My Smart Home - Part 1: Plan, Idea & Home Assistant

Check it out!
August 27, 2025 at 7:09 PM
Kudos to Google and the llama.cpp team! 🤝

GGUF support for Gemma 270M right from day-0
August 14, 2025 at 4:49 PM
Reachy Mini and SmolLM3 are featured in GitHub's weekly news! 🚀 🚀
July 21, 2025 at 3:53 PM
Gemma 3n has arrived in llama.cpp 👨‍🍳 🍰

Comes in 2 flavors: E2B and E4B (E means "effective/active parameters")
June 26, 2025 at 6:46 PM
See you this Sunday at AI Plumbers conference: 2nd edition!

📍 Where: GLS Event Campus Berlin, Kastanienallee 82 | 10435 Berlin
👉 Register here: lu.ma/vqx423ct
June 11, 2025 at 9:09 AM
✨✨ AIFoundry is bringing you the AI Plumbers Conference: 2nd edition — an open source meetup for low-level AI builders to dive deep into "the plumbing" of modern AI

📍 Where: GLS Event Campus Berlin, Kastanienallee 82 | 10435 Berlin
📅 When: June 15, 2025
👉 Register now: lu.ma/vqx423ct
June 3, 2025 at 12:19 PM
Hugging Face Inference Endpoints now officially support deploying **vision** models via llama.cpp 👀 👀

Try it now: endpoints.huggingface.co/catalog
May 15, 2025 at 2:43 PM
Real-time webcam demo with @huggingface.bsky.social SmolVLM and llama.cpp server.

All running locally on a Macbook M3
May 12, 2025 at 5:27 PM
Even with A100, H200, M3 Ultra, etc.

We still can't match the power of that Casio FX 😆
April 25, 2025 at 1:01 PM
llama.cpp vision support just got much better! 🚀

Until now, models with complicated chat templates, like MiniCPM-V or Gemma 3, required a dedicated binary to run.

Now, you can use all supported models via a single "llama-mtmd-cli" binary 🔥

(Only Qwen2VL is not yet supported)
April 21, 2025 at 1:46 PM
Finally have time to write a blog post about ggml-easy! 😂

ggml-easy is a header-only wrapper for GGML that simplifies development with a cleaner API, easy debugging utilities, and native safetensors loading ✨ Great for rapid prototyping!
April 20, 2025 at 11:27 PM
Someone at Google definitely had a lot of fun making this 😆

And in case you didn't know, it's available in the "Starter apps" section on AI Studio. The app is called "Gemini 95"
April 20, 2025 at 10:40 PM
Estimating an LLM's memory requirement WITHOUT a calculator?

Just use your good old human brain 🧠 😎

Check out my 3‑step estimation 🚀
April 20, 2025 at 11:00 AM
Google has quite a good sense of humor 😂

Jokes aside, a 1B model quantized to Q4 without performance degradation is sweet 🤏
April 19, 2025 at 5:00 PM
Cooking up something fun today: I can now load a safetensors file directly into GGML without having to convert it to GGUF!

Why? Because this allows me to run experiments faster, especially with models outside of llama.cpp 😆
March 31, 2025 at 3:25 PM
No vibe coding. Just code it ✅

Visit my website --> ngxson.com
March 30, 2025 at 8:01 PM
On Monday, the 24th, I'm proud to give a talk at sota's webinar.

My main talk will last an hour and take a deep dive into the current state of on-device LLMs, exploring their advantages, trade-offs, and limitations.

The session will end with a Q&A, where you can ask me anything about this subject.
March 20, 2025 at 1:36 PM
Had a fantastic chat today with Georgi Gerganov, the brilliant mind behind ggml, llama.cpp, and whisper.cpp! We discussed:

🚀 The integration of vision models into llama.cpp
🚀 The challenges of maintaining a smooth UX/DX
🚀 The exciting future of llama.cpp

Big things ahead - stay tuned!
March 19, 2025 at 2:53 PM
OK now you are the best, Gememe 2.0
March 13, 2025 at 11:23 AM
Wanna try Gemma 3 vision with llama.cpp?

There is a playground for that! More in 🧵
March 12, 2025 at 10:05 AM
Day-zero Gemma 3 support in llama.cpp 🤯

👉 4 model sizes: 1B, 4B, 12B, 27B
👉 Vision capability (except for 1B) with bidirectional attention
👉 Context size: 32k (1B) and 128k (4B, 12B, 27B)
👉 Support for 140+ languages (except for 1B)
👉 Day-zero support on many frameworks 🚀
March 12, 2025 at 8:31 AM
Aya Vision is now the number one trending OCR model on Hugging Face 🚀

👉 Comes in 2 sizes, 8B and 32B
👉 Supports 32 languages
👉 Day-zero support with HF Transformers
March 10, 2025 at 11:05 AM
Did you know? A number of 🤗 Hugging Face's blog posts now feature AI-created podcasts 🎙️

This offers an alternative way to absorb extensive and intricate articles 🔍
March 8, 2025 at 6:00 PM