Andrew Stevens
banner
andrewjstevens.com
Andrew Stevens
@andrewjstevens.com
Co-host of the Hot or Hype podcast, Gaming CTO, and Architect of many things. I like anything cloud, data, and security.
A new post in my Trustworthy AI series, Part 14: Secure Memory Governance.

Agents are storing more state than ever, it’s time to secure the memory layer.

www.sakurasky.com/blog/missing...

#AIEngineering #AgentOps
Trustworthy AI Agents: Secure Memory Governance
Agents increasingly rely on long-term memory, embeddings, caches, and shared state. We need strong security and governance primitives around memory access, retention, isolation, schemas, and poisoning...
www.sakurasky.com
November 27, 2025 at 7:06 AM
I just published another part in my Trustworthy AI Agents series: Distributed Agent Orchestration.

Agents need a real control plane - routing, scheduling, failover, backpressure.

www.sakurasky.com/blog/missing...
Trustworthy AI Agents: Distributed Agent Orchestration
Agents need a control plane. Routing, scheduling, failover, cost-aware prioritization, and cross-agent coordination must be first-class primitives.
www.sakurasky.com
November 26, 2025 at 1:27 PM
Another post in my series on Trustworth AI - Part 12: Resource Governance.

Deep dive into quotas, throttling, priority scheduling, loop detection, and backpressure for multi-agent systems.

www.sakurasky.com/blog/missing...

#AIEngineering #AgentOps #AIGovernance
Trustworthy AI Agents: Resource Governance
Infinite task loops and runaway agents are already common failure modes. We need quota systems, throttling, and prioritization baked in.
www.sakurasky.com
November 25, 2025 at 9:38 AM
Published Part 11 of my Trustworthy AI series.

Deep dive into agent lifecycle management: semantic versioning, immutable builds, CI/CD, safe deprecation, and registry-based governance.

www.sakurasky.com/blog/missing...

#AIEngineering #AgentOps #DevOps #AIGovernance
Trustworthy AI Agents: Agent Lifecycle Management
Like microservices, agents need versioning, deployment pipelines, and safe deprecation paths.
www.sakurasky.com
November 24, 2025 at 12:38 PM
I just published Part 10 in my Trustworthy AI series.

Deep dive into secure multi-agent protocols: identity, signatures, encryption, nonces, schemas, versioning, and formal verification.

www.sakurasky.com/blog/missing...

#AIEngineering #Security #AgentOps
Trustworthy AI Agents: Secure Multi-Agent Protocols
Agents need a standardized, authenticated, encrypted, and versioned protocol for inter-agent communication. Right now it is wild-west JSON over HTTP, which is unsafe for autonomous systems.
www.sakurasky.com
November 23, 2025 at 8:53 PM
Published Part 9 of the Trustworthy AI series.

Deep dive into formal verification for agents: invariants, state models, SMT solvers, and counterexample-driven replay.

Python examples included.

www.sakurasky.com/blog/missing...

#AIEngineering #AIDebugging #AIGovernance
Formal Verification of Constraints
Agents that act autonomously must obey provable invariants. Formal verification provides the missing guardrails for constraints like 'never transmit unencrypted PII' or 'never exceed credit exposure t...
www.sakurasky.com
November 21, 2025 at 2:23 PM
I just published Part 8 in my Trustworthy AI series.

Deterministic replay for agent systems: trace capture, replay stubs, clock virtualization, and reproducible debugging.

www.sakurasky.com/blog/missing...

#AIEngineering #AIDebugging #LLMSystems #AgentOps #Observability
Trustworthy AI Agents: Deterministic Replay
Debugging agents is nearly impossible today. We need the ability to record and replay runs deterministically to diagnose errors and failures.
www.sakurasky.com
November 20, 2025 at 11:09 AM
Part 7 of my Trustworthy AI series is out.

I take a look at adversarial robustness for agent systems: sanitization, anomaly detection, context stripping, probe detection, and adversarial testing. Python examples included.

www.sakurasky.com/blog/missing...

#AIGovernance #AIEngineering #AgentOps
Trustworthy AI Agents: Adversarial Robustness
Models need to withstand data poisoning, prompt injection, and inversion attacks. A cleverly crafted input can collapse your system. This section covers the missing primitives that defend against adve...
www.sakurasky.com
November 19, 2025 at 1:34 PM
Published Part 6 of the Trustworthy AI series today.

Deep dive into kill switches, circuit breakers, and runtime safety for autonomous agents, with example Python walk throughs.

Read: www.sakurasky.com/blog/missing...

#AIGovernance #AIEngineering #CloudSecurity #AgentOps #DevSecOps
Trustworthy AI Agents: Kill Switches and Circuit Breakers
Why autonomous agents need hard limits, circuit breakers, and emergency stop mechanisms to prevent runaway execution and cascading failures.
www.sakurasky.com
November 18, 2025 at 8:17 AM
Dropped a new post in the Trustworthy AI series today.

Deep dive on verifiable audit logs for agent systems: hash chains, Merkle trees, SPIFFE-backed signatures, and AWS anchoring. Practical and code heavy.

www.sakurasky.com/blog/missing...
Verifiable Audit Logs
How to make every agent action tamper proof and cryptographically verifiable for compliance and forensic analysis.
www.sakurasky.com
November 17, 2025 at 11:40 AM
New post in my "Missing Primitives for Trustworthy AI Agents" series: Policy-as-Code for AI agents.

If agents are making decisions at runtime, the guardrails have to live there too.

OPA, Rego, SPIFFE, and a Python example.

www.sakurasky.com/blog/missing...
Policy-as-Code Enforcement
Guardrails must be enforced at runtime, not left as developer best practices. Just like infrastructure-as-code, compliance must be baked into execution.
www.sakurasky.com
November 16, 2025 at 11:21 AM
Google’s new whitepaper “Introduction to Agents and Agent Architectures” (Nov 2025) - from LLMs that generate outputs to agents that achieve outcomes.

Agents = model + tools + orchestration.

www.kaggle.com/whitepaper-i...

#AI #Agents #LLM #MLOps #AIEngineering
Introduction to Agents
www.kaggle.com
November 12, 2025 at 9:50 PM
Context drift: how models break when a problem looks the same but isn’t.

New research shows LLMs often “remember” logic puzzles instead of re-reasoning them.

Change a few names or numbers, and performance collapses but confidence stays high.

🔗 arxiv.org/abs/2510.11812
PHANTOM RECALL: When Familiar Puzzles Fool Smart Models
Large language models (LLMs) such as GPT, Gemini, and Claude often appear adept at solving classic logic puzzles--but how much genuine reasoning underlies their answers? Recent evidence suggests that ...
arxiv.org
October 16, 2025 at 9:22 AM
A shift in AI: from systems that generate outputs to systems that model reality.

World models learn from video, sensors & robot data to understand space, time, & cause. The “physics” of the real world.

Robotics that predict reactions, games with real physics, and digital twins that reason.
October 13, 2025 at 12:05 PM
Can WebAssembly replace containers at the edge?

A new paper benchmarks Wasm vs containers across the Edge–Cloud Continuum. Gains in cold starts & image size, but major I/O & latency trade-offs.

Read here arxiv.org/abs/2510.05118

#WebAssembly #EdgeComputing #Serverless #CloudNative
Lumos: Performance Characterization of WebAssembly as a Serverless Runtime in the Edge-Cloud Continuum
WebAssembly has emerged as a lightweight and portable runtime to execute serverless functions, particularly in heterogeneous and resource-constrained environments such as the Edge Cloud Continuum. How...
arxiv.org
October 8, 2025 at 9:18 AM
Reposted by Andrew Stevens
How do you trust an autonomous AI agent?

In our latest post, we look at workload identity as another missing primitive for trustworthy AI.

Read more on our blog: www.sakurasky.com/blog/missing...

#AI #AISecurity #SPIFFE #WorkloadIdentity #DevSecOps
Agent Identity & Attestation
Go beyond API keys. Learn to engineer trustworthy AI agents with verifiable identity and attestation using the SPIFFE framework and a Python example.
www.sakurasky.com
October 7, 2025 at 8:01 AM
"Grit" doesn't build a lasting tech services company. Deliberate structure does.

The choices matter:

Reusable IP > Individual heroes
Deep specialization > Chasing low rates
A balanced client portfolio > Relying on one huge account

These are what separate a true partner from a temporary vendor.
October 1, 2025 at 9:33 AM
Your AI moat isn't the model. It's the data.

But a data moat requires serious engineering:
* Reliable Pipelines
* Clear Lineage
* Automated Quality Gates
* Strong Security

Without these, your proprietary data is a liability, not a defensible asset. Moats are built, not found.

#AI #DataEngineering
September 30, 2025 at 12:48 PM
Breakthroughs excite investors. Smart innovation sustains organisations.

The hardest call in tech leadership? Knowing when to push a bold idea vs. double down on iteration.

Big wins need both.

#TechLeadership #Innovation #Cloud #Data #Security
September 23, 2025 at 8:53 AM
Technical debt always gets paid. The only question is when, and who pays it.

Shortcuts show up as:
* Slower velocity
* Security risk
* Talent drain

Treat debt pay-down like security: non-negotiable, budgeted, and strategic.

The speed of next year depends on the cleanup you invest in today.
September 22, 2025 at 3:09 PM
Reposted by Andrew Stevens
Are your AI agents actually secure?

In this instalment of our blog series on Trustworthy AI, we explain why true End-to-End Encryption (E2EE) is non-negotiable and provide a hands-on Python example to fix it.

www.sakurasky.com/blog/missing...
End-to-End Encryption (Part 1)
Part 0 of a 13-part series on trustworthy AI agents—an overview of 12 missing engineering primitives (encryption, identity, guardrails, audit, governance) required for production at scale.
www.sakurasky.com
September 19, 2025 at 9:52 AM
Your ping-pong table isn't culture.

For tech teams real culture is a system built on psychological safety, a clear mission, and accountability.

It’s not a soft skill - it’s a core requirement for building reliable and secure systems.

#TechCulture #Leadership
September 18, 2025 at 9:31 AM
A new paper on hallucination detection has a clever idea: probe all LLM layers at once, not just one (Cross-Layer Attention Probing).

Absolutely worth reading: arxiv.org/pdf/2509.09700

#AI #AIGovernance #LLM
arxiv.org
September 17, 2025 at 9:47 AM
This paper has a pattern for making LLMs reliable for structured data extraction: wrap the model with a domain ontology to define the rules and an automated correction loop to enforce them.
The study is tiny (only 50 test logs) but the architectural pattern is the takeaway

arxiv.org/pdf/2509.00081
arxiv.org
September 10, 2025 at 8:28 AM
Shadow AI is the new shadow IT.

Teams are spinning up LLMs + pipelines outside governance.
The risks? Data leakage, privacy violations, compliance failures.
The challenge? People can build AI faster than you can regulate it.

#AI #Privacy #Compliance
September 9, 2025 at 5:11 AM