Andrew Yourtchenko
@ayourtch.bsky.social
LLM/agents experimenting, Rust, 3D-printing and active mobility. Release manager for VPP. Bits of code: GitHub.com/ayourtch ; all posts are entirely my own. 🇪🇺
Pinned
It’s almost Xmas, the time when wishes may come true, they say.

So: 🚀 Introducing: the Kubernetes Wish System: github.com/ayourtch/k8s...
GitHub - ayourtch/k8s-wish-system: The last k8s add-on that you will ever need. (WARNING: use at your own risk! PoC/experiment repo)
github.com
Reposted by Andrew Yourtchenko
I will never get tired of my favorite traffic-calming device.
September 13, 2024 at 9:29 PM
I had Claude Opus 4.6 write a short story. It was kinda fun to read, so thought to drop it here and have the internets rip it apart. :) stdio.be/blog/2026-02...
Index of /blog/2026-02-10-the-punchline/
stdio.be
February 10, 2026 at 10:47 AM
Reposted by Andrew Yourtchenko
T-shirt for cyclists, with this written on the back:

I wouldn’t be in your way
If I had an unobstructed bike lane
To ride in
February 9, 2026 at 1:53 PM
Reposted by Andrew Yourtchenko
it is already possible, right now, to get 100tok/sec out of 100 watts of air-cooled local inference, and the output is surprisingly good

LLM architecture gains will compound against improved silicon over the next five years to produce serious leverage, WHICH WORKERS CAN OWN OUTRIGHT
If you're a person who thinks "billionaires are evil -> billionaires own big tech -> frontier labs dominate the LLM space and are owned by billionaires -> LLMs are evil" then I am sympathetic to that. However the correct prescription is not to destroy LLM technology but to expropriate it.
Marx isn't just mildly disagreeing with the Luddites, he's saying their machine-breaking actively gave reactionary governments a pretext for repression. It was strategically counterproductive on top of being analytically wrong.
February 8, 2026 at 5:03 PM
Reposted by Andrew Yourtchenko
We should think outside of the chat box when designing AI-assisted software development workflows:

haskellforall.com/2026/02/beyo...
Beyond agentic coding
AI dev tooling can do better than chat interfaces
haskellforall.com
February 7, 2026 at 5:26 PM
"Are we there yet?"
I’m starting to think the people who are excited about “AI agents” have literally never used a computer in their lives
February 8, 2026 at 1:13 PM
So, I did a first run - everything seems to work. And the results are... entertaining.
A graph of LLM task pass rate by difficulty and prompting style. On simple tasks the bare directive wins; on more complex ones the polite directive wins; and on the most complex...
February 8, 2026 at 9:09 AM
Reposted by Andrew Yourtchenko
The cost of turning written business logic into code has dropped to zero. At worst, near-zero.

The cost of integrating services and libraries, the plumbing of the code world, has dropped to zero. At worst, near-zero.

What does that mean for the future?

New blog post: brooker.co.za/blog/2026/02...
You Are Here - Marc's Blog
brooker.co.za
February 7, 2026 at 6:36 PM
I entirely agree with this.
I got a lot of shit for this last time around, but I stand by it: making oneself into the kind of person who kicks roombas as an anger displacement technique is not good for the soul
felt the need. i feel vastly underqualified to write something like this, but i also feel it's especially important that we think about the way we use language
February 7, 2026 at 10:46 AM
Sigh. And so roughly 2 hours later I have 20KLOC of code to deal with.
Alright, let’s get Claude on it then! :) github.com/ayourtch-llm...
github.com
February 7, 2026 at 10:04 AM
huggingface.co/nvidia/Nemot... - gonna get the ape to take it for a spin. If it is as efficient at tool calling in my use case, then already having access to an “llm_oneshot” tool that can escalate to a heavier model can actually work.
nvidia/Nemotron-Orchestrator-8B · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
February 7, 2026 at 7:32 AM
By the way: $ podman run --network=host --env-file .env.apchat -it ghcr.io/ayourtch-llm/apchat:latest --interactive --stream --model 'glm-47-flash@openai(https-url-here)' with “OPENAI_API_KEY” defined in .env.apchat - and you can have your own ape too!
February 7, 2026 at 12:24 AM
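For anyone trying the command above: the post only mentions one variable being defined in the env file, so a minimal .env.apchat might look like this (the key value is a placeholder, not a real credential; any further settings would be assumptions):

```shell
# Minimal .env.apchat, per the post: only OPENAI_API_KEY is required.
# Replace the placeholder with your real key before running podman.
OPENAI_API_KEY=<your-api-key-here>
```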
My little ape (powered by GLM-4.6) today wrote a terminal-based rig to chat to Open WebUI via a convoluted playwright+chrome contraption, for the purpose of red-teaming a thingy behind this Open WebUI. The bit it got stuck on? Python packaging in the install script.
February 7, 2026 at 12:17 AM
Finally got to package my little ape as a docker container today. And with the semblance of a persistent input area with status… things are slowly taking shape.
February 6, 2026 at 12:13 AM
After running Qwen-Coder-Next overnight and letting the ape autonomously make a few self-improvement / clean-up commits - I think I am a fan. It certainly gives better throughput than the GLM-4.7 quant, and, by virtue of being the full model, does not exhibit the “quant stupidity”.
February 4, 2026 at 8:05 AM
Ok, so I would say this is officially not bad at all.
February 3, 2026 at 9:04 PM
Watching the ape navigate its own codebase using Coder-Next is mesmerizing, after I asked it to fix a relatively tricky problem… it is still working on it but seems to be making progress.
February 3, 2026 at 8:56 PM
Wired the Ape to Qwen3-Coder-Next to continue multi-ticket feature work instead of the quantized GLM-4.7, let’s see… first impression is that it is a bit faster. Now, as for the quality - we will see later what it makes :-)
February 3, 2026 at 7:32 PM
Reposted by Andrew Yourtchenko
ByteDance Seed's ConceptMoE: moving beyond uniform token-level processing to adaptive concept-level computation in LLMs!

Why waste equal compute on trivially predictable tokens, when you can merge similar tokens into concepts while preserving fine-grained processing for complex content?
February 2, 2026 at 2:09 PM
Tried github.com/eugr/spark-v... - very, very nice, finally was able to run GLM-4.7 - the only catch was having to tweak the host auto-detection algorithm, because I use link-local addresses, and scanning a /16 space in Python is very slow…
February 2, 2026 at 7:45 AM
FOSDEM done, next up: CfgMgmtCamp in Ghent, this time not hiding in a cave :)
February 1, 2026 at 10:32 PM
Reposted by Andrew Yourtchenko
As always, objection to UBI is entirely due to the emotional needs of a very small number of people.
"Not only don't people work less when they are guaranteed an income, they might actually put in more effort at work. And the fact that they have more money to spend leads to the creation of more jobs."

Nobel Prize–winning economists Abhijit Banerjee and Esther Duflo
January 30, 2026 at 9:47 AM
Pretty much accidentally made a little screencast app - stdio.be/cast/
Simple Screencast Recorder
stdio.be
January 29, 2026 at 4:22 PM
Reposted by Andrew Yourtchenko
so mysterious and strange how fewer cars means happier (and alive) people
“They cut speed limits, changed street design, removed space for cars… Now it appears that work is paying off. #Oslo & #Helsinki are reaping the rewards of committed action on making their roads safer, reducing pedestrian fatalities to zero last year.” #VisionZero #SpeedKills
How Helsinki and Oslo cut pedestrian deaths to zero
After years of committed action, neither city recorded a single pedestrian fatality in 2019
www.theguardian.com
January 28, 2026 at 10:58 AM