Forrester Terry
@sweetpapa.bsky.social
Software Engineering Manager • I build AI apps at Stanford, make lo-fi music, and occasionally sleep. Engineering manager who learned to code the hard way. Thoughts on AI agents, dev tools, and vibes.
What was learned: "Test with actual humans, not just yourself." Now every feature goes through structured UAT before production and a phased rollout where possible, and we explicitly document the edge cases that broke during testing.
January 29, 2026 at 2:49 PM
Failure Lesson: The app had high pass rates in automated tests. But when humans tested it, they found edge cases that broke the app - ambiguous references, unusual phrasing, scenarios we never considered.

Worst case avoided: Deploying to 17,000 users with bugs that would've been caught in proper human testing.
January 29, 2026 at 2:49 PM
When the issue actually IS acceptable, I usually mention it as an FYI or known limitation so users understand. Or, if reasonable, make it a true "feature" rather than a bug and lean into it.
January 29, 2026 at 2:49 PM
Another notable one - command injection risk: the AI getting tricked into running "rm -rf /"?

Really unacceptable.

Solution: Tools only run whitelisted, pre-written, safe commands. The AI can't pass arbitrary strings to the command line. LLM data gets sanitized/validated. No exceptions.
January 29, 2026 at 2:49 PM
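
Here's a minimal TypeScript sketch of that whitelist pattern - the command names and paths are invented for illustration, not from the actual app:

```typescript
import { execFile } from "node:child_process";

// Whitelist of pre-written, safe commands. The model only ever picks a key;
// it never supplies the command string or the arguments.
// (Command names and paths here are invented for illustration.)
const SAFE_COMMANDS: Record<string, { cmd: string; args: string[] }> = {
  list_reports: { cmd: "ls", args: ["/var/app/reports"] },
  disk_usage: { cmd: "df", args: ["-h"] },
};

function runSafeCommand(key: string): Promise<string> {
  const entry = SAFE_COMMANDS[key];
  if (!entry) {
    // Anything outside the whitelist is rejected outright - no exceptions.
    return Promise.reject(new Error(`"${key}" is not a whitelisted command`));
  }
  return new Promise((resolve, reject) => {
    // execFile (unlike exec) does not invoke a shell, so nothing in the
    // arguments can be interpreted as shell syntax.
    execFile(entry.cmd, entry.args, (err, stdout) => {
      err ? reject(err) : resolve(stdout);
    });
  });
}
```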
The guardrail became the feature - instead of preventing hallucinations with prompts (doesn't always work well), I made it impossible for hallucinated content to reach players.
January 29, 2026 at 2:49 PM
In my mystery game, worst case was LLM NPCs hallucinating characters that didn't exist. Unacceptable - very game-breaking.

Solution: Entity extraction → validation pipeline.

Before any NPC mentions a person/place/event, system checks "does this exist in canon?" If not, rejected & regenerated.
January 29, 2026 at 2:49 PM
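
A rough TypeScript sketch of that pipeline - the canon entries are invented, and the generator/extractor are stand-ins you'd wire to your own LLM call and entity extractor:

```typescript
// Canon: the entities that actually exist in the game world (illustrative names).
const CANON = new Set(["Lady Ashworth", "the Gilded Fox Inn", "Inspector Hale"]);

type Generate = (prompt: string) => Promise<string>;
type Extract = (text: string) => string[];

// Before any NPC line reaches a player, every entity it mentions is checked
// against canon; an unknown entity means reject and regenerate.
async function validatedNpcLine(
  prompt: string,
  generate: Generate, // your LLM call
  extract: Extract,   // your entity extractor (NER pass or structured LLM call)
  maxRetries = 3,
): Promise<string> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const line = await generate(prompt);
    const unknown = extract(line).filter((e) => !CANON.has(e));
    if (unknown.length === 0) return line; // everything mentioned exists in canon
    // Hallucinated entity detected: discard the line and try again, rather
    // than trusting the prompt alone to prevent it.
  }
  throw new Error("Could not produce a canon-consistent NPC line");
}
```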
Haha good point, the hype around AI earns the skepticism. My experience in software has been more "double-edged power tool that took time to learn" than "novelty pen I abandoned" - it really depends on the task, the approach, and the person. Mileage definitely varies. It's come a long way though.
January 22, 2026 at 7:49 PM
The pattern that works:
✅ You decide the approach
✅ You validate the output
✅ You catch context-specific stuff

For now, we are still the architects 🏗️

#AIcoding #DevTips #AIagents
January 22, 2026 at 5:51 AM
The real danger isn't multi-agent systems - it's scaling weak implementations or letting AI generate massive outputs without checkpoints.

Supervisor agents help, but they're not a substitute for proper engineering and QA. ✅
January 17, 2026 at 8:35 PM
Know your error tolerance for each component. Mission-critical paths need flawless execution. Nice-to-haves can fail gracefully. Design accordingly - backup flows, fallbacks, clear failure modes. ⚖️
January 17, 2026 at 8:35 PM
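
One way to encode that tolerance explicitly - a minimal sketch, with the usage names invented for illustration:

```typescript
// Wrap a step in an explicit tolerance policy: mission-critical paths surface
// failure loudly; nice-to-haves degrade to a fallback value.
async function withTolerance<T>(
  run: () => Promise<T>,
  opts: { critical: true } | { critical: false; fallback: T },
): Promise<T> {
  try {
    return await run();
  } catch (err) {
    if (opts.critical) throw err; // no silent failure on a critical path
    console.warn("Non-critical step failed, using fallback:", err);
    return opts.fallback; // graceful degradation for nice-to-haves
  }
}

// Usage (function names invented): the charge must succeed; recommendations may not.
// await withTolerance(() => chargeCard(order), { critical: true });
// const recs = await withTolerance(() => fetchRecommendations(user),
//   { critical: false, fallback: [] });
```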
Test with actual humans, not just yourself. You'll be shocked what edge cases others find that break your system. That "obviously it works" confidence is how bugs reach production. 🧪
January 17, 2026 at 8:35 PM
Now extend that to multi-agent AI systems: talented interns managing other talented interns, all occasionally hallucinating with confidence.

Supervisor agents help, but they're not always the safety net we may think they are.
January 17, 2026 at 8:32 PM
I'm dropping a more in-depth post with charts/data next week on Dev.to. Stay tuned or follow me if you want the link when it drops. 🚀
January 17, 2026 at 6:42 AM
Don't just be a prompt engineer. That's a dead end. Become a Force Multiplier. Learn: Zod/Pydantic (Structure), Architecture (The Big Picture), and Verification (The BS Detector). The future isn't 'AI replaces you.' It's 'You + AI Agents' replacing 'You alone.'
January 17, 2026 at 6:42 AM
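
Concretely, the Zod part looks something like this - the schema fields are invented for illustration; the point is that model output gets validated before anything downstream trusts it:

```typescript
import { z } from "zod";

// Schema the model's JSON output must satisfy before we act on it.
// (Field names are invented for illustration.)
const TicketSchema = z.object({
  title: z.string().min(1),
  priority: z.enum(["low", "medium", "high"]),
  labels: z.array(z.string()).default([]),
});

function parseModelOutput(raw: string) {
  // JSON.parse throws on non-JSON; safeParse rejects malformed or
  // hallucinated structure explicitly instead of letting it flow downstream.
  const result = TicketSchema.safeParse(JSON.parse(raw));
  if (!result.success) {
    throw new Error(`Model output failed validation: ${result.error.message}`);
  }
  return result.data; // fully typed, structurally guaranteed
}
```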
METR found devs took 19% LONGER using AI for complex tasks. Why? Because fixing bad AI code takes more skill than writing it from scratch. If you don't understand the foundations, you can't be the orchestrator. You're just a passenger.
January 17, 2026 at 6:42 AM