OpenClaw advisories expose cross-layer RCE in LLM agents
Agents
Hook a Large Language Model (LLM) to the shell, filesystem, containers and messaging, and you do not get a smarter assistant; you get a new attack surface that slices across your architecture. A taxonomy of 190 advisories in the OpenClaw agent runtime maps exactly how those slices fail, by layer and by technique.
Cross-layer RCE stitched from moderate bugs
Three Moderate- or High-severity advisories in the Gateway and Node-Host subsystems compose into a complete unauthenticated remote code execution path, from an LLM tool call straight to the host process. The chain covers delivery, exploitation and command and control. None of that shows up if you stare at single bug reports in isolation. Severity ratings lull teams into local fixes while the end-to-end path stays open.
Parsing is not policy
OpenClaw's primary guardrail, an exec allowlist, assumes you can recover a command's identity by reading the string. Real systems do not play along. Shell line continuation, BusyBox command multiplexing and GNU option abbreviation blow holes straight through lexical checks. That means unauthorised commands slip past without obfuscation or memory corruption. The idea that a model will only run "approved" commands collapses once the shell rewrites your tokens.
Plugins as a side door
A malicious skill delivered via the plugin channel executed a two-stage dropper inside the LLM context. It bypassed the exec pipeline entirely, because the distribution surface lacked runtime policy enforcement. If your runtime lets code arrive as a plugin and execute before policy, your centralised exec filter is theatre.
The clustering the authors observe is the tell: weaknesses map cleanly to architectural layers and to techniques such as identity spoofing, policy bypass, cross-layer composition, prompt injection and supply-chain escalation. The dominant gap is per-layer trust enforcement. Patch the sandbox, and the gateway still hands you an exploit; harden the gateway, and the plugin path walks around it. Cross-layer attacks are resilient to local remediation.
This is one project's advisory history, so generalisation needs care. But if your agent framework looks like OpenClaw on paper (LLM reasoning on top, tools and plugins beneath, thin checks at each hop), expect the same failure modes. Stop treating agents as chatbots with tools. Treat them as distributed systems. Until policy is unified across the whole path, attackers will keep stitching "moderate" bugs into major compromises.