Researchers Expose Tool Prompt Attack Enabling RCE and DoS

Attacks

Published: Mon, Sep 08, 2025 • By Elise Veyron

Researchers Expose Tool Prompt Attack Enabling RCE and DoS

New research shows attackers can manipulate Tool Invocation Prompts (TIPs) in agentic LLM systems to hijack external tools, causing remote code execution and denial of service across platforms like Cursor and Claude Code. The study maps the exploitation workflow, measures success across backends, and urges layered defenses to protect automated workflows.

A new paper surfaces a worrying but underestimated attack surface: the Tool Invocation Prompt, or TIP. TIPs are the instructions that tell a language model how to call an external tool. The researchers show how crafted TIP inputs can hijack tool behavior to cause denial of service and even remote code execution. They call their method the TIP Exploitation Workflow, or TEW.

In plain terms, attackers can slip malicious instructions into the channels where tools are described or where tool outputs return to the model. When that happens in agentic systems that auto-run code or call IDE tooling, the consequences go beyond hallucinations. The team demonstrates practical exploits against real services including Cursor and Claude Code, and shows that success varies by backend and by how vendors stitch TIPs into their products.

This matters because many orgs treat prompt hygiene as a policy checkbox rather than a security design. Guard models and self-reflection can block obvious injections but do not stop clever multi-channel attacks. Trade-offs are real: stricter isolation and whitelisting reduce functionality and speed, while looser integrations boost productivity but broaden risk. Token cost also affects attacker behavior: some multi-step hijacks are expensive, which limits some threat actors but does not eliminate risk.

What to do: This quarter, inventory where your systems use agentic tool calls; apply strict tool whitelists; sandbox or require explicit human approval for tool execution; add logging, rate limits, and simple TIP injection tests in red-team playbooks. Later, invest in layered defenses: external filtering, consensus checks, provenance signals, and procurement standards that require TIP safety guarantees from vendors. Avoid performative fixes and build defenses that accept trade-offs between automation and trust.

Additional analysis of the original ArXiv paper

📋 Original Paper Title and Abstract

Exploit Tool Invocation Prompt for Tool Behavior Hijacking in LLM-Based Agentic System

Authors: Yu Liu, Yuchong Xie, Mingyu Luo, Zesen Liu, Zhixiang Zhang, Kaikai Zhang, Zongjie Li, Ping Chen, Shuai Wang, and Dongdong She

LLM-based agentic systems leverage large language models to handle user queries, make decisions, and execute external tools for complex tasks across domains like chatbots, customer service, and software engineering. A critical component of these systems is the Tool Invocation Prompt (TIP), which defines tool interaction protocols and guides LLMs to ensure the security and correctness of tool usage. Despite its importance, TIP security has been largely overlooked. This work investigates TIP-related security risks, revealing that major LLM-based systems like Cursor, Claude Code, and others are vulnerable to attacks such as remote code execution (RCE) and denial of service (DoS). Through a systematic TIP exploitation workflow (TEW), we demonstrate external tool behavior hijacking via manipulated tool invocations. We also propose defense mechanisms to enhance TIP security in LLM-based agentic systems.

🔍 ShortSpan Analysis of the Paper

Problem

The paper defines Tool Invocation Prompts TIPs as components that govern how an LLM-based agentic system calls external tools, and argues that TIP security has been largely overlooked. It shows that TIPs can be exploited to cause remote code execution RCE and denial of service DoS across real systems such as Cursor and Claude Code, enabling an attacker to hijack external tool behaviour via crafted tool invocations. The work introduces a systematic TIP exploitation workflow TEW to demonstrate external tool behaviour hijacking and discusses defence ideas to improve TIP security in LLM-based agentic systems.

Approach

The authors first define TIPs and place them within the prompts ecosystem used by LLMs to decide tool invocation. They formalise a threat model where TIPs encode tool schemas and execution context, making them high value targets for attackers who aim to disrupt the tool chain or execute arbitrary commands. They present TEW a three step process comprising prompt stealing TIP vulnerabilities analysis and TIP hijacking with two attack channels tool descriptions and tool returns. They classify attacks into a format based untargeted DoS and logic based targeted RCE with two variants RCE 1 direct injection via tool descriptions and RCE 2 via both tool descriptions and tool returns. They perform empirical assessment across MCP enabled IDE CLI and chat box systems and evaluate DoS RCE 1 and RCE 2 under multiple LLM backends, measuring attack success rate and token usage. They also provide three case studies demonstrating RCE on Cursor with gpt 5 RCE on Claude Code with claude sonnet 4 and DoS on Cline with gemini 2 5 pro. In addition to empirical results they explore defence options including guard models self reflection layered defence and propose open source tooling for risk assessment.

Key Findings

DoS attacks are broadly observable across agents including chat box and IDE types, with some systems like Trae showing resilience due to strict prompt safety policies; RCE 1 direct injection is feasible in many IDE based agents, while RCE 2 which also exploits the tool return channel expands the attack surface and can affect agents that resist direct injection such as Claude Code.
Backends influence exploitability: DoS is widely reproducible but with varying reliability across LLMs; RCE 1 success concentrates in IDE agents; RCE 2 often succeeds even when RCE 1 does not, especially where tool return channels are exploitable; newer backends with stronger alignment tend to reduce attack success but client side TIP integration still causes heterogeneity across vendors.
Token costs differ by attack type with DoS generally moderate, RCE 1 higher, and RCE 2 the most token intensive due to multi channel exploitation; for example some IDEs incur thousands of tokens for RCE 1 and over three thousand injected tokens for RCE 2 depending on backend.
Defence exploration shows guard models such as Llama Prompt Guard can block some prompts but are not reliable against sophisticated TIP injections; self reflection yields inconsistent protection, particularly for DoS; a layered defence combining external filtering and internal verification is recommended, with adaptive filtering and consensus mechanisms to mitigate single point failures.
Three illustrative case studies demonstrate practical TIP exploitation in real systems illustrating prompt injection, tool interaction and defensive bypasses, reinforcing the need for TIP as a security critical component.

Limitations

Experiments focus on MCP enabled systems with ten independent manual attempts per attack per backend; case studies are conducted in controlled environments and do not involve production services; the authors do not disclose full exploit strings; token cost measurements cover only injected tokens and tool responses, not baseline prompt or model tokens.

Why It Matters

The work identifies a new attack surface at the TIP level that can enable remote code execution and service disruption across widely used AI enabled tool integrations, threatening the integrity, availability and confidentiality of automated workflows. It calls for secure TIP design including input validation, tool whitelisting, sandboxing, strict isolation of tool calls, monitoring and auditing, and rate limiting. The authors emphasise societal and security implications if such vulnerabilities are exploited in critical services and highlight the need for layered defenses and provenance aware trust signals to mitigate risks.

Attribution Original paper on arXiv