Ajeet Singh Raina
@ajeetraina.bsky.social
👣 Follow me for Docker 🐳, Kubernetes, Cloud-Native, LLM and GenAI stuff | Developer Advocate at Docker | @Collabnix | Distinguished Arm Ambassador
During this active 5-minute window, llama.cpp's KV cache is fully operational, automatically reusing cached prompt tokens across requests that share a common prefix.
October 23, 2025 at 4:42 PM
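A minimal sketch of how prefix reuse plays out in practice, assuming Docker Model Runner's OpenAI-compatible endpoint is reachable at `localhost:12434` (the host-side TCP address; adjust for your setup) and using a hypothetical model name. Because both requests begin with the same system message, llama.cpp can reuse the KV cache entries for those prompt tokens on the second call.

```python
import json
import urllib.request

# Assumed host-side endpoint for Docker Model Runner's OpenAI-compatible API.
BASE_URL = "http://localhost:12434/engines/llama.cpp/v1"

# Shared prefix: identical across requests, so its KV cache entries
# can be reused while the model stays loaded (5-minute idle window).
SYSTEM_PROMPT = "You are a concise assistant for Docker questions."

def build_payload(user_msg: str) -> dict:
    # Every payload starts with the same system message, giving
    # consecutive requests a common prompt prefix.
    return {
        "model": "ai/llama3.2",  # hypothetical model name
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_msg},
        ],
    }

def chat(user_msg: str) -> str:
    # Standard OpenAI-style chat completions call; no cache flags
    # needed, since prefix reuse happens inside llama.cpp.
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(user_msg)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The first call pays the full prompt-processing cost; a second call such as `chat("How do I unload a model?")` within the idle window only needs to process the tokens after the shared prefix.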
Docker Model Runner uses llama.cpp as its inference engine, running as a native host process that loads the requested model on demand and performs inference on incoming requests. Models are loaded into memory on demand and unloaded after 5 minutes of inactivity.
October 23, 2025 at 4:42 PM
Yes, token caching through llama.cpp's KV cache works automatically in Docker Model Runner - no configuration needed!
October 23, 2025 at 4:41 PM