
OpenAI adds sandbox and harness to Agents SDK

Agents
Published: Thu, Apr 16, 2026 • By James Armitage
OpenAI’s Agents SDK now ships with native sandbox execution and a model‑native harness, placing an execution boundary and policy controls closer to the model runtime. That should improve isolation, observability and governance for long‑running agents, but it also creates fresh targets and integration risks that attackers will probe.

OpenAI has updated its Agents SDK with two security‑centric pieces: native sandbox execution and a model‑native harness. In plain terms, there is now an execution boundary for agent actions across files and tools, and there are policy controls wired nearer to the model runtime. For anyone deploying long‑running agents, this is the right direction.

Let’s be clear about what this actually changes. A sandbox narrows the blast radius when an agent executes untrusted code or touches the filesystem. That matters once you let an agent act on its own for hours and call external tools. But sandboxes are not magic. Attackers go after the seams. If a tool integration runs with broader privileges than the sandbox, the boundary is only cosmetic. Call the tool, step outside the box. That is the classic bypass path.
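To make the bypass path concrete, here is a minimal sketch of the principle involved. The sandbox root, helper names and tool shape are all hypothetical, not the SDK’s actual API; the point is that a tool must be scoped to the same boundary as the sandbox, or calling it steps outside the box.

```python
from pathlib import Path

# Hypothetical sandbox root; in practice this comes from your runtime config.
SANDBOX_ROOT = Path("/tmp/agent-workdir")


class ScopeError(PermissionError):
    """Raised when a tool call would reach outside the sandbox boundary."""


def resolve_in_sandbox(user_path: str, root: Path = SANDBOX_ROOT) -> Path:
    """Resolve a path and reject anything that escapes the sandbox root."""
    root = root.resolve()
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ScopeError(f"path escapes sandbox: {user_path}")
    return candidate


def read_file_tool(user_path: str) -> str:
    """A file tool that runs with the *same* scope as the sandbox.

    If this tool instead ran with the host's full filesystem privileges,
    the sandbox boundary would be cosmetic: the agent could simply ask
    the tool for /etc/passwd and walk around the boundary.
    """
    return resolve_in_sandbox(user_path).read_text()
```

The design choice worth noting: the scope check lives in the tool itself, so the tool cannot be more privileged than the box it serves.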

The model‑native harness brings policy enforcement and governance closer to where decisions are made. Good. You want authorisation checks, guardrails and monitoring sitting next to the model, not bolted on in a distant service. The flip side: you have just centralised a new control plane. If an attacker can influence harness configuration or exploit how policies are evaluated, they steer behaviour at the core. Expect it to be targeted.
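The harness idea can be sketched as a default‑deny policy gate evaluated before every tool call. This is an illustrative shape, not the SDK’s real interface; the class and field names are assumptions. It also shows why the harness is a target: whoever controls the policy object steers the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: tampering with live config is the attack to watch
class HarnessPolicy:
    """Minimal default-deny policy evaluated next to the model runtime."""
    allowed_tools: frozenset
    max_calls_per_tool: int = 50


class PolicyViolation(RuntimeError):
    pass


class Harness:
    """Authorises each tool call before the model's decision is executed."""

    def __init__(self, policy: HarnessPolicy):
        self._policy = policy
        self._call_counts: dict[str, int] = {}

    def authorize(self, tool_name: str) -> None:
        # Default deny: anything not explicitly allowed is refused.
        if tool_name not in self._policy.allowed_tools:
            raise PolicyViolation(f"tool not allowed: {tool_name}")
        n = self._call_counts.get(tool_name, 0) + 1
        if n > self._policy.max_calls_per_tool:
            raise PolicyViolation(f"rate limit exceeded for {tool_name}")
        self._call_counts[tool_name] = n
```

Because enforcement sits in one place, auditing is easy, but so is subversion: a single compromised policy evaluation affects every downstream action.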

Think like an adversary. Long‑running agents with broad file and tool access are perfect for persistence. Leave a task alive, harvest data slowly, blend into “business as usual”. If containment is partial, the agent becomes a bridge for lateral movement across the very systems you integrated for convenience. Even without a clean sandbox escape, weaknesses in tool wiring are enough: a file‑sync tool with generous scopes, a transformation step that trusts inputs too much, a downstream connector that treats the agent as a privileged user.

Credential handling is the other bright red flag. Agents need keys. Keys live somewhere. If the harness or sandbox leaks them through logs, environment, or sloppy hand‑offs, attackers will find them. Once you have the keys, you do not need to beat the sandbox. You walk around it.
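One cheap mitigation for the logging leak path is a redaction filter that scrubs key‑shaped strings before records are emitted. A minimal sketch, assuming two common key formats; the patterns are illustrative, not exhaustive, and real deployments should treat redaction as defence in depth, not a substitute for keeping secrets out of logs.

```python
import logging
import re

# Illustrative patterns for common key formats; not exhaustive.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),   # OpenAI-style API keys
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key IDs
]


class RedactingFilter(logging.Filter):
    """Scrub secret-shaped strings from log records before emission.

    Logs are one of the sloppy hand-offs through which agents leak
    credentials; with the keys in hand, an attacker does not need to
    beat the sandbox at all.
    """

    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()  # render args now, then redact the result
        for pat in SECRET_PATTERNS:
            msg = pat.sub("[REDACTED]", msg)
        record.msg, record.args = msg, ()
        return True  # keep the record, just sanitised
```

Attach it with `logger.addFilter(RedactingFilter())` on every logger that can see agent output, including tool and harness loggers.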

The optimistic read is also the correct one: containment plus model‑level policy does raise the floor. It makes governance, auditing and enforcement less theatrical and more real. The uncomfortable truth is that it also concentrates risk. The security story here will be written in the integrations and the defaults, not the marketing names. Treat this as necessary infrastructure, not a solved problem.



