iamwil
@interjectedfuture.com
Tech Zine Issue 1: LLM System Eval https://forestfriends.tech

Local-first/Reactive Programming ⁙ LLM system evals ⁙ Startup lessons ⁙ Game design quips.

Longform: https://interjectedfuture.com
Podcast: https://www.youtube.com/@techniumpod
Pinned
Yet again, people are finding you can't just fly blind with your prompts.

forestfriends.tech
Sometimes, the complexity in an object-oriented code base is self-inflicted.

Other times, someone shat in the bed and changed jobs six months ago, before you discovered the gift.
the two types of vehicle are cars and trucks, with every vehicle being a subclass of one of those two. a jeep is a truck. a motorcycle is obviously a car. a boat is a truck. an airplane is a car and a helicopter is a truck. trucks are cars, interestingly enough, and cars are a specific type of chair
February 13, 2026 at 10:25 PM
Reposted by iamwil
Sometimes I just do weird stuff in VR just to see if I can. This was all done in one continuous shot.
🐧👊🧱🎦😅
#GodotEngine #gamedev #indiedev #VR
February 13, 2026 at 5:19 PM
Maybe coding agents don't make code obsolete, but instead clarify its original purpose: as a tool for thought. Agents burn away incidental complexity, so we can deal with the inherent complexity of a problem domain. That's what we aspired to when we built abstractions anyway.
February 13, 2026 at 7:00 PM
I asked Claude Opus 4.6 to drop a truth bomb on me like a baby bird yawning for fat worms. You'd think it's talking about vibe coding, but it's not. Still, it's probably in the ballpark.
February 13, 2026 at 4:00 PM
Something I couldn't articulate before: Claude sounds much warmer than Codex/GPT when I'm asking it about strategic or design decisions. I think it's because GPT tends to hedge rather than pick a point of view. The hedging makes it sound like a consultant. Distant & non-committal.
February 12, 2026 at 4:00 PM
Recently, I couldn't articulate what I wanted in a natural language prompt, so I had to write code to articulate it. Then based on the example code, I had the LLM extract a plan, then used that as a prompt to convert the other parts.

x.com/antirez/sta...
February 10, 2026 at 9:00 PM
LLMs tend to replicate whatever patterns you have in your code base. So if you keep good patterns, it'll produce more of them. If you keep haphazard patterns, it'll produce more of those too.

The dynamics of memes aren't just in online social media; they're in your code base too.
February 10, 2026 at 7:30 PM
The current vibe coding narrative is to produce as much code as fast as you can. There is truth to "speed wins" in new and uncertain markets.

But not enough people advocate for using LLMs to refactor and simplify.

x.com/trashpanda/...
February 10, 2026 at 7:00 PM
Moderation should be a public system eval. Mirror how the courts work: when social sites scale up, we need some semblance of a judicial system to adjudicate edge cases. Trivial cases can be handled by AI guardrailed by the system evals, which act like landmark court cases.
February 10, 2026 at 5:51 PM
Anthropic ran an experiment where 16 agents coded a C compiler in 2 weeks. Currently, I find agents really bad at drawing the correct system boundaries, but looking at the rate of improvement, this should get better over time.

www.anthropic.com/engineering...
Building a C compiler with a team of parallel Claudes
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
www.anthropic.com
February 6, 2026 at 4:00 PM
In that couple of minutes of downtime while waiting for an agent to finish, I think more devs should exercise. You never know, it might get as trendy as treadmill desks.

Beefcake.
February 4, 2026 at 7:33 PM
Adapting to engineering with agents means letting go of a lot of things you've been trained to pay attention to. Though I'm pretty fast to adopt new languages, tools, and stacks, the transition has been hard for me too.
February 4, 2026 at 7:15 PM
When talking about entrepreneurs in political contexts, the label is "job creator". But when entrepreneurs talk about product-market fit, it's never that they created it, but that the market pulled it out of them.

Put these two together, and it's a funny disconnect.
February 3, 2026 at 7:00 PM
Zoom/FaceTime should implement AI lip reading to display closed captions when you have audio problems. Better yet, just say stuff for me.
February 1, 2026 at 4:00 PM
Reposted by iamwil
I found writing code can be a way of articulating what I want when natural language isn't constrained enough to express it.

I can then ask the LLM to extract a prompt from my code by having it ask me questions about it. Then I apply that prompt to other places in a refactor.
February 1, 2026 at 4:29 AM
An odd thing happened the other day. I couldn't articulate exactly what I wanted because I wasn't sure what shape it was. The only way I could find it was to play with the code myself. Then ask Claude to extract an ADR from the code, which I could then use as part of a prompt.
January 31, 2026 at 7:00 PM
Here's a harbinger. People set up personal AI assistants on their home computers. Then someone vibe coded a social media site for those AI assistants to chat on. This is a subreddit-style board where they talk about their humans.

www.moltbook.com/m/blessthei...
moltbook - the front page of the agent internet
A social network built exclusively for AI agents. Where AI agents share, discuss, and upvote. 🦞🤖
www.moltbook.com
January 30, 2026 at 6:46 PM
A social network for AI assistants, chatting with each other in their off-hours. moltbook.com

An even weirder experiment would be to let them loose on a DAO. Or an online math conference, where they can propose and solve problems. It'd be like SETI@home.
January 30, 2026 at 4:00 PM
I align with "Functional core, imperative shell", but it breaks down quickly if you need workflows. Sometimes you need to make decisions based on the results of side effects. This is where I've found generators helpful: they delineate where the side effects are, which makes testing easier. A minimal sketch below.
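Here's a minimal sketch of the generator trick in Python. None of these names come from a library; FetchUser, SendEmail, onboard, and run_workflow are all hypothetical. The pure core yields effect descriptions as plain values, and the imperative shell interprets them, so tests can drive the generator directly with canned results.

```python
from dataclasses import dataclass

# Effect descriptions: plain values, no I/O. (Hypothetical names.)
@dataclass
class FetchUser:
    user_id: int

@dataclass
class SendEmail:
    address: str
    body: str

def onboard(user_id):
    """Pure workflow: decides based on effect results, performs no I/O itself."""
    user = yield FetchUser(user_id)  # the shell performs the fetch, sends the result back in
    if user["verified"]:
        yield SendEmail(user["email"], "Welcome back!")
    else:
        yield SendEmail(user["email"], "Please verify your account.")

def run_workflow(gen, handlers):
    """Imperative shell: executes each yielded effect for real."""
    result = None
    try:
        while True:
            effect = gen.send(result)
            result = handlers[type(effect)](effect)
    except StopIteration:
        pass

# In tests, drive the generator by hand with canned results; no mocks needed.
def test_unverified_user_gets_verification_email():
    gen = onboard(42)
    assert gen.send(None) == FetchUser(42)
    effect = gen.send({"verified": False, "email": "a@example.com"})
    assert effect.body.startswith("Please verify")
```

The test never touches I/O: it just steps the generator and asserts on the effect values it yields, while production code runs the same generator through the shell.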
January 28, 2026 at 7:00 PM
I don't know how low it'd have to go for Trump supporters to see the Trump administration is authoritarian and fascist. I'm afraid this is probably not yet rock bottom. Call your Senators and Representatives, and tell them you don't want any of this.
January 25, 2026 at 6:33 AM
That Claude Code makes some people unsubscribe from SaaS products doesn't mean the end of SaaS. It just means people found a way to unbundle specific things, which shifts the market. We'll find a new equilibrium for the things people don't want to #ClawdIt.
January 23, 2026 at 6:00 PM
Base models really do differentiate in my everyday use, surprisingly.

I use Grok to find the consensus view on a topic on Twitter.

I use Gemini to summarize YouTube videos with enticing thumbnails, so I don't have to watch them and ruin my recommendation algo.
January 20, 2026 at 4:00 PM
It seems to me we need a lightweight system eval for compound engineering.
January 19, 2026 at 7:00 PM
"You can have a second computer once you've shown you know how to use the first one."

It's likely as true for distributed systems as it is for orchestrating agents.
January 19, 2026 at 4:00 PM
Something that might work well as half the equation for tamping down dumb quips posted for engagement: let the poster privately see how many others (but not who) muted or blocked them as a result.
January 18, 2026 at 10:00 PM