Stop Indirect Prompt Injection with Tool Graphs
Defenses
A practical new pattern, IPIGuard, gives ops teams a path out of a gnarly problem: indirect prompt injection (IPI). IPI is when a tool your agent calls (a scraper, a knowledge store, a connector) returns data that sneaks in instructions and makes the agent call the wrong tools or leak secrets. That matters because model endpoints, GPU pipelines, vector DBs, and secret stores are all in the blast radius.
Core idea in plain ops terms: plan first, fetch later. Build a Tool Dependency Graph (TDG) that maps which tools a task legitimately needs, then force actual data access to follow that planned graph. Concept diagram-in-words: Agent Planner -> TDG (approved nodes) -> Controlled Fetch Nodes -> Execution Sandbox -> Result.
Quick risk checklist for SREs and security teams:
- Model endpoints: Are endpoints allowed to trigger arbitrary tools? Check invocation policies.
- GPUs: Is inference isolated from untrusted tool code or drivers?
- Vectors: Can external text overwrite or alter indexed vectors without review?
- Secrets: Are credentials or tokens exposed to tool outputs?
- Data paths: Do fetches go straight into the agent context or pass a gate?
Stepwise mitigations (run-book friendly):
- Implement a TDG policy layer: require explicit tool dependencies before any fetch.
- Decouple planning from execution: planner produces the graph; an executor follows it without re-planning from fetched text.
- Whitelist and sandbox tools; enforce RBAC and ephemeral credentials for each tool call.
- Isolate vector DB writes; require signed updates and validation hooks.
- Lock down GPU tenancy and audit driver calls during tool-driven jobs.
- Enable verbose audit logging, synthetic canaries, and quick kill-switches for agent runs.
If you have 30 minutes: add a policy gate that blocks new tool invocations unless the TDG approves them. If you have a week: build the planner/executor split and add canary inputs to detect IPI attempts. Not a silver bullet, but IPIGuard gives you an architectural lever—fewer surprise tool calls, fewer secrets spilled, and much less late-night debugging.
Additional analysis of the original ArXiv paper
📋 Original Paper Title and Abstract
IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
🔍 ShortSpan Analysis of the Paper
Problem
This paper studies Indirect Prompt Injection (IPI), a threat where tool outputs fetched from untrusted sources covertly inject instructions that alter large language model (LLM) agent behaviour and produce malicious or unintended outcomes. The problem matters because modern agents routinely call external tools and lack structural constraints on when and how tools are invoked, so prompt-only or detector-based defences can be bypassed and agents retain unrestricted access to tool invocations.
Approach
The authors propose IPIGuard, a defensive task-execution paradigm that models an agent's workflow as a traversal over a planned Tool Dependency Graph (TDG). IPIGuard explicitly decouples action planning from interactions that fetch external data, constraining tool invocation to the planned graph and preventing malicious tool calls at the source. Experiments are reported on the AgentDojo benchmark. Specifics such as the LLM families, tool implementations, training or evaluation metrics, and runtime overhead are not reported.
Key Findings
- IPIGuard substantially reduces unintended tool invocations that arise from injected instructions.
- By separating planning from data access, IPIGuard improves robustness against IPI attacks compared with prompt-based or auxiliary-detection defences.
- On the AgentDojo benchmark, IPIGuard achieves a superior balance between task effectiveness and security robustness.
Limitations
Many evaluation details are not reported, including quantitative attack success rates, performance overhead, generalisability across agent architectures, and implementation complexity. Threats to validity and deployment trade-offs are not reported.
Why It Matters
Modelling tool usage as a Tool Dependency Graph provides a concrete architectural defence that lowers the attack surface for IPI, making injected instructions less effective. This approach offers a practical route to harden agent pipelines used in critical or data-sensitive environments and helps reduce risks such as data leakage, manipulation of tool outputs, or unintended autonomous actions.