Lightnews — Scholar-powered news

Andrew Stevens

@andrewjstevens.com

A new post in my Trustworthy AI series, Part 14: Secure Memory Governance.

Agents are storing more state than ever, it’s time to secure the memory layer.

www.sakurasky.com/blog/missing...

#AIEngineering #AgentOps

Trustworthy AI Agents: Secure Memory Governance

Agents increasingly rely on long-term memory, embeddings, caches, and shared state. We need strong security and governance primitives around memory access, retention, isolation, schemas, and poisoning...

www.sakurasky.com

November 27, 2025 at 7:06 AM

Andrew Stevens

@andrewjstevens.com

I just published another part in my Trustworthy AI Agents series: Distributed Agent Orchestration.

Agents need a real control plane - routing, scheduling, failover, backpressure.

www.sakurasky.com/blog/missing...

Trustworthy AI Agents: Distributed Agent Orchestration

Agents need a control plane. Routing, scheduling, failover, cost-aware prioritization, and cross-agent coordination must be first-class primitives.

www.sakurasky.com

November 26, 2025 at 1:27 PM

Andrew Stevens

@andrewjstevens.com

Another post in my series on Trustworth AI - Part 12: Resource Governance.

Deep dive into quotas, throttling, priority scheduling, loop detection, and backpressure for multi-agent systems.

www.sakurasky.com/blog/missing...

#AIEngineering #AgentOps #AIGovernance

Trustworthy AI Agents: Resource Governance

Infinite task loops and runaway agents are already common failure modes. We need quota systems, throttling, and prioritization baked in.

www.sakurasky.com

November 25, 2025 at 9:38 AM

Andrew Stevens

@andrewjstevens.com

Published Part 11 of my Trustworthy AI series.

Deep dive into agent lifecycle management: semantic versioning, immutable builds, CI/CD, safe deprecation, and registry-based governance.

www.sakurasky.com/blog/missing...

#AIEngineering #AgentOps #DevOps #AIGovernance

Trustworthy AI Agents: Agent Lifecycle Management

Like microservices, agents need versioning, deployment pipelines, and safe deprecation paths.

www.sakurasky.com

November 24, 2025 at 12:38 PM

Andrew Stevens

@andrewjstevens.com

I just published Part 10 in my Trustworthy AI series.

Deep dive into secure multi-agent protocols: identity, signatures, encryption, nonces, schemas, versioning, and formal verification.

www.sakurasky.com/blog/missing...

#AIEngineering #Security #AgentOps

Trustworthy AI Agents: Secure Multi-Agent Protocols

Agents need a standardized, authenticated, encrypted, and versioned protocol for inter-agent communication. Right now it is wild-west JSON over HTTP, which is unsafe for autonomous systems.

www.sakurasky.com

November 23, 2025 at 8:53 PM

Andrew Stevens

@andrewjstevens.com

Published Part 9 of the Trustworthy AI series.

Deep dive into formal verification for agents: invariants, state models, SMT solvers, and counterexample-driven replay.

Python examples included.

www.sakurasky.com/blog/missing...

#AIEngineering #AIDebugging #AIGovernance

Formal Verification of Constraints

Agents that act autonomously must obey provable invariants. Formal verification provides the missing guardrails for constraints like 'never transmit unencrypted PII' or 'never exceed credit exposure t...

www.sakurasky.com

November 21, 2025 at 2:23 PM

Andrew Stevens

@andrewjstevens.com

I just published Part 8 in my Trustworthy AI series.

Deterministic replay for agent systems: trace capture, replay stubs, clock virtualization, and reproducible debugging.

www.sakurasky.com/blog/missing...

#AIEngineering #AIDebugging #LLMSystems #AgentOps #Observability

Trustworthy AI Agents: Deterministic Replay

Debugging agents is nearly impossible today. We need the ability to record and replay runs deterministically to diagnose errors and failures.

www.sakurasky.com

November 20, 2025 at 11:09 AM

Andrew Stevens

@andrewjstevens.com

Part 7 of my Trustworthy AI series is out.

I take a look at adversarial robustness for agent systems: sanitization, anomaly detection, context stripping, probe detection, and adversarial testing. Python examples included.

www.sakurasky.com/blog/missing...

#AIGovernance #AIEngineering #AgentOps

Trustworthy AI Agents: Adversarial Robustness

Models need to withstand data poisoning, prompt injection, and inversion attacks. A cleverly crafted input can collapse your system. This section covers the missing primitives that defend against adve...

www.sakurasky.com

November 19, 2025 at 1:34 PM

Andrew Stevens

@andrewjstevens.com

Published Part 6 of the Trustworthy AI series today.

Deep dive into kill switches, circuit breakers, and runtime safety for autonomous agents, with example Python walk throughs.

Read: www.sakurasky.com/blog/missing...

#AIGovernance #AIEngineering #CloudSecurity #AgentOps #DevSecOps

Trustworthy AI Agents: Kill Switches and Circuit Breakers

Why autonomous agents need hard limits, circuit breakers, and emergency stop mechanisms to prevent runaway execution and cascading failures.

www.sakurasky.com

November 18, 2025 at 8:17 AM

Andrew Stevens

@andrewjstevens.com

Dropped a new post in the Trustworthy AI series today.

Deep dive on verifiable audit logs for agent systems: hash chains, Merkle trees, SPIFFE-backed signatures, and AWS anchoring. Practical and code heavy.

www.sakurasky.com/blog/missing...

Verifiable Audit Logs

How to make every agent action tamper proof and cryptographically verifiable for compliance and forensic analysis.

www.sakurasky.com

November 17, 2025 at 11:40 AM

Andrew Stevens

@andrewjstevens.com

New post in my "Missing Primitives for Trustworthy AI Agents" series: Policy-as-Code for AI agents.

If agents are making decisions at runtime, the guardrails have to live there too.

OPA, Rego, SPIFFE, and a Python example.

www.sakurasky.com/blog/missing...

Policy-as-Code Enforcement

Guardrails must be enforced at runtime, not left as developer best practices. Just like infrastructure-as-code, compliance must be baked into execution.

www.sakurasky.com

November 16, 2025 at 11:21 AM

Andrew Stevens

@andrewjstevens.com

Google’s new whitepaper “Introduction to Agents and Agent Architectures” (Nov 2025) - from LLMs that generate outputs to agents that achieve outcomes.

Agents = model + tools + orchestration.

www.kaggle.com/whitepaper-i...

#AI #Agents #LLM #MLOps #AIEngineering

Introduction to Agents

www.kaggle.com

November 12, 2025 at 9:50 PM

Andrew Stevens

@andrewjstevens.com

Context drift: how models break when a problem looks the same but isn’t.

New research shows LLMs often “remember” logic puzzles instead of re-reasoning them.

Change a few names or numbers, and performance collapses but confidence stays high.

🔗 arxiv.org/abs/2510.11812

PHANTOM RECALL: When Familiar Puzzles Fool Smart Models

Large language models (LLMs) such as GPT, Gemini, and Claude often appear adept at solving classic logic puzzles--but how much genuine reasoning underlies their answers? Recent evidence suggests that ...

arxiv.org

October 16, 2025 at 9:22 AM

Andrew Stevens

@andrewjstevens.com

A shift in AI: from systems that generate outputs to systems that model reality.

World models learn from video, sensors & robot data to understand space, time, & cause. The “physics” of the real world.

Robotics that predict reactions, games with real physics, and digital twins that reason.

October 13, 2025 at 12:05 PM

Andrew Stevens

@andrewjstevens.com

Can WebAssembly replace containers at the edge?

A new paper benchmarks Wasm vs containers across the Edge–Cloud Continuum. Gains in cold starts & image size, but major I/O & latency trade-offs.

Read here arxiv.org/abs/2510.05118

#WebAssembly #EdgeComputing #Serverless #CloudNative

Lumos: Performance Characterization of WebAssembly as a Serverless Runtime in the Edge-Cloud Continuum

WebAssembly has emerged as a lightweight and portable runtime to execute serverless functions, particularly in heterogeneous and resource-constrained environments such as the Edge Cloud Continuum. How...

arxiv.org

October 8, 2025 at 9:18 AM

Reposted by Andrew Stevens

Sakura Sky

@sakurasky.com

How do you trust an autonomous AI agent?

In our latest post, we look at workload identity as another missing primitive for trustworthy AI.

Read more on our blog: www.sakurasky.com/blog/missing...

#AI #AISecurity #SPIFFE #WorkloadIdentity #DevSecOps

Agent Identity & Attestation

Go beyond API keys. Learn to engineer trustworthy AI agents with verifiable identity and attestation using the SPIFFE framework and a Python example.

www.sakurasky.com

October 7, 2025 at 8:01 AM

Andrew Stevens

@andrewjstevens.com

"Grit" doesn't build a lasting tech services company. Deliberate structure does.

The choices matter:

Reusable IP > Individual heroes
Deep specialization > Chasing low rates
A balanced client portfolio > Relying on one huge account

These are what separate a true partner from a temporary vendor.

October 1, 2025 at 9:33 AM

Andrew Stevens

@andrewjstevens.com

Your AI moat isn't the model. It's the data.

But a data moat requires serious engineering:
* Reliable Pipelines
* Clear Lineage
* Automated Quality Gates
* Strong Security

Without these, your proprietary data is a liability, not a defensible asset. Moats are built, not found.

#AI #DataEngineering

September 30, 2025 at 12:48 PM

Andrew Stevens

@andrewjstevens.com

Breakthroughs excite investors. Smart innovation sustains organisations.

The hardest call in tech leadership? Knowing when to push a bold idea vs. double down on iteration.

Big wins need both.

#TechLeadership #Innovation #Cloud #Data #Security

September 23, 2025 at 8:53 AM

Andrew Stevens

@andrewjstevens.com

Technical debt always gets paid. The only question is when, and who pays it.

Shortcuts show up as:
* Slower velocity
* Security risk
* Talent drain

Treat debt pay-down like security: non-negotiable, budgeted, and strategic.

The speed of next year depends on the cleanup you invest in today.

September 22, 2025 at 3:09 PM

Reposted by Andrew Stevens

Sakura Sky

@sakurasky.com

Are your AI agents actually secure?

In this instalment of our blog series on Trustworthy AI, we explain why true End-to-End Encryption (E2EE) is non-negotiable and provide a hands-on Python example to fix it.

www.sakurasky.com/blog/missing...

End-to-End Encryption (Part 1)

Part 0 of a 13-part series on trustworthy AI agents—an overview of 12 missing engineering primitives (encryption, identity, guardrails, audit, governance) required for production at scale.

www.sakurasky.com

September 19, 2025 at 9:52 AM

Andrew Stevens

@andrewjstevens.com

Your ping-pong table isn't culture.

For tech teams real culture is a system built on psychological safety, a clear mission, and accountability.

It’s not a soft skill - it’s a core requirement for building reliable and secure systems.

#TechCulture #Leadership

September 18, 2025 at 9:31 AM

Andrew Stevens

@andrewjstevens.com

A new paper on hallucination detection has a clever idea: probe all LLM layers at once, not just one (Cross-Layer Attention Probing).

Absolutely worth reading: arxiv.org/pdf/2509.09700

#AI #AIGovernance #LLM

arxiv.org

September 17, 2025 at 9:47 AM

Andrew Stevens

@andrewjstevens.com

This paper has a pattern for making LLMs reliable for structured data extraction: wrap the model with a domain ontology to define the rules and an automated correction loop to enforce them.
The study is tiny (only 50 test logs) but the architectural pattern is the takeaway

arxiv.org/pdf/2509.00081

arxiv.org

September 10, 2025 at 8:28 AM

Andrew Stevens

@andrewjstevens.com

Shadow AI is the new shadow IT.

Teams are spinning up LLMs + pipelines outside governance.
The risks? Data leakage, privacy violations, compliance failures.
The challenge? People can build AI faster than you can regulate it.

#AI #Privacy #Compliance

September 9, 2025 at 5:11 AM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news